🧠 AI Lab – Transformers CLI Playground

A pedagogical and technical project designed for AI practitioners and students to experiment with Hugging Face Transformers through an interactive Command‑Line Interface (CLI).
This playground provides ready‑to‑use NLP pipelines (Sentiment Analysis, Named Entity Recognition, Text Generation, Fill‑Mask, Moderation, etc.) in a modular, extensible, and educational codebase.

📚 Overview

The AI Lab – Transformers CLI Playground allows you to explore multiple natural language processing tasks directly from the terminal.
Each task (e.g., sentiment, NER, text generation) is implemented as a Command Module, which interacts with a Pipeline Module built on top of the transformers library.

The lab is intentionally structured to demonstrate clean software design for ML codebases — with strict separation between configuration, pipelines, CLI logic, and display formatting.
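
As a rough illustration of what "a Pipeline Module built on top of the transformers library" can look like, here is a minimal sketch. `LazyPipeline`, `load_hf_pipeline`, and the injectable `loader` parameter are hypothetical names, not taken from this repository. Only `transformers.pipeline(task, model=...)` is the real library call. The demo at the bottom uses a fake loader so it runs without downloading a model.

```python
from typing import Any, Callable, Optional


def load_hf_pipeline(task: str, model: str) -> Callable[..., Any]:
    # Imported lazily so the CLI can start before transformers/torch are loaded.
    from transformers import pipeline

    return pipeline(task, model=model)


class LazyPipeline:
    """Hypothetical wrapper that builds the underlying pipeline on first use."""

    def __init__(
        self,
        task: str,
        model: str,
        loader: Callable[[str, str], Callable[..., Any]] = load_hf_pipeline,
    ) -> None:
        self.task = task
        self.model = model
        self._loader = loader
        self._pipe: Optional[Callable[..., Any]] = None

    def __call__(self, text: str, **kwargs: Any) -> Any:
        if self._pipe is None:
            # The first call pays the model-loading cost; later calls reuse it.
            self._pipe = self._loader(self.task, self.model)
        return self._pipe(text, **kwargs)


# Demo with a fake loader (assumption: stands in for the real model).
load_calls = []


def fake_loader(task: str, model: str) -> Callable[..., Any]:
    load_calls.append((task, model))
    return lambda text, **kwargs: [{"label": "POSITIVE", "score": 1.0}]


demo = LazyPipeline("sentiment-analysis", "demo-model", loader=fake_loader)
demo("first call loads the model")
demo("second call reuses it")
print(len(load_calls))
```

Injecting the loader keeps the wrapper testable: unit tests can pass a fake, while the CLI uses the default that calls into `transformers`.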

🗂️ Project Structure

src/
├── __init__.py
├── main.py                 # CLI entry point
│
├── cli/
│   ├── __init__.py
│   ├── base.py             # CLICommand base class & interactive shell handler
│   └── display.py          # Console formatting utilities (tables, colors, results)
│
├── commands/               # User-facing commands wrapping pipeline logic
│   ├── __init__.py
│   ├── sentiment.py        # Sentiment analysis command
│   ├── fillmask.py         # Masked token prediction command
│   ├── textgen.py          # Text generation command
│   ├── ner.py              # Named Entity Recognition command
│   └── moderation.py       # Toxicity / content moderation command
│
├── pipelines/              # Machine learning logic (Hugging Face Transformers)
│   ├── __init__.py
│   ├── template.py         # Blueprint for creating new pipelines
│   ├── sentiment.py
│   ├── fillmask.py
│   ├── textgen.py
│   ├── ner.py
│   └── moderation.py
│
└── config/
    ├── __init__.py
    └── settings.py         # Global configuration (default models, parameters)

⚙️ Installation

🧾 Option 1 – Using Poetry (Recommended)

Poetry is used as the main dependency manager.

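# 1. Spawn a shell inside the project's virtual environment (created on first use)
#    Note: on Poetry 2.x this command requires the poetry-plugin-shell plugin
poetry shell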

# 2. Install dependencies
poetry install

This will automatically install all dependencies declared in pyproject.toml, including transformers and torch.

To run the CLI inside the Poetry environment:

poetry run python -m src.main

📦 Option 2 – Using pip and requirements.txt

If you prefer using requirements.txt manually:

# 1. Create a virtual environment
python -m venv .venv

# 2. Activate it
# Linux/macOS
source .venv/bin/activate
# Windows PowerShell
.venv\Scripts\Activate.ps1

# 3. Install dependencies
pip install -r requirements.txt

▶️ Usage

Once installed, launch the CLI with:

python -m src.main
# or, if using Poetry
poetry run python -m src.main

You’ll see an interactive menu listing the available commands:

Welcome to AI Lab - Transformers CLI Playground
Available commands:
  • sentiment     – Analyze the sentiment of a text
  • fillmask      – Predict masked words in a sentence
  • textgen       – Generate text from a prompt
  • ner           – Extract named entities from text
  • moderation    – Detect toxic or unsafe content

Example Sessions

🔹 Sentiment Analysis

💬 Enter text: I absolutely love this project!
→ Sentiment: POSITIVE (score: 0.998)

🔹 Fill‑Mask

💬 Enter text: The capital of France is [MASK].
→ Predictions:
  1) Paris      score: 0.87
  2) Lyon       score: 0.04
  3) London     score: 0.02

🔹 Text Generation

💬 Prompt: Once upon a time
→ Output: Once upon a time there was a young AI learning to code...

🔹 NER (Named Entity Recognition)

💬 Enter text: Elon Musk founded SpaceX in California.
→ Entities:
  - Elon Musk  (PER)
  - SpaceX     (ORG)
  - California (LOC)

🔹 Moderation

💬 Enter text: I hate everything!
→ Result: FLAGGED (toxic content detected)

🧠 Architecture Overview

The internal structure follows a clean Command ↔ Pipeline ↔ Display pattern:

           ┌──────────────────────┐
           │     InteractiveCLI   │
           │ (src/cli/base.py)    │
           └──────────┬───────────┘
                      │
                      ▼
             ┌─────────────────┐
             │   Command Layer │  ← e.g. sentiment.py
             │ (user commands) │
             └───────┬─────────┘
                     │
                     ▼
             ┌─────────────────┐
             │  Pipeline Layer │  ← e.g. pipelines/sentiment.py
             │ (ML logic)      │
             └───────┬─────────┘
                     │
                     ▼
             ┌─────────────────┐
             │ Display Layer   │  ← cli/display.py
             │ (format output) │
             └─────────────────┘

Key Concepts

| Layer | Description |
| --- | --- |
| CLI | Manages user input/output, help menus, and navigation between commands. |
| Command | Encapsulates a single user-facing operation (e.g., run sentiment). |
| Pipeline | Wraps Hugging Face’s `transformers.pipeline()` to perform inference. |
| Display | Handles clean console rendering (colored output, tables, JSON formatting). |
| Config | Centralizes model names, limits, and global constants. |
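
The layering can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: `SentimentPipeline`, `SentimentCommand`, `format_sentiment`, and `stub_model` are hypothetical names, and the stub stands in for a real `transformers` pipeline so the example runs without downloading a model.

```python
from typing import Callable, Dict, List

# Shape of a Hugging Face text-classification pipeline call:
# it takes a string and returns a list of {"label": ..., "score": ...} dicts.
ModelFn = Callable[[str], List[Dict[str, object]]]


class SentimentPipeline:
    """Pipeline layer: owns the ML call and nothing else (hypothetical class)."""

    def __init__(self, model_fn: ModelFn) -> None:
        self._model_fn = model_fn

    def analyze(self, text: str) -> Dict[str, object]:
        # Keep only the top prediction returned by the model.
        return self._model_fn(text)[0]


def format_sentiment(result: Dict[str, object]) -> str:
    """Display layer: turns a raw result dict into one console line."""
    return f"→ Sentiment: {result['label']} (score: {result['score']:.3f})"


class SentimentCommand:
    """Command layer: checks input, delegates to the pipeline, formats output."""

    name = "sentiment"

    def __init__(self, pipeline: SentimentPipeline) -> None:
        self._pipeline = pipeline

    def execute(self, text: str) -> str:
        if not text.strip():
            return "→ Error: empty input"
        return format_sentiment(self._pipeline.analyze(text))


def stub_model(text: str) -> List[Dict[str, object]]:
    # Stand-in for a real model (assumption): always answers POSITIVE.
    return [{"label": "POSITIVE", "score": 0.998}]


command = SentimentCommand(SentimentPipeline(stub_model))
print(command.execute("I absolutely love this project!"))
```

In a real run, `stub_model` would be replaced by the object returned from `transformers.pipeline("sentiment-analysis", model=...)`, which is callable in the same way.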

⚙️ Configuration

All configuration is centralized in src/config/settings.py.

Example:

class Config:
    DEFAULT_MODELS = {
        "sentiment": "distilbert-base-uncased-finetuned-sst-2-english",
        "fillmask":  "bert-base-uncased",
        "textgen":   "gpt2",
        "ner":       "dslim/bert-base-NER",
        "moderation":"unitary/toxic-bert"
    }
    MAX_LENGTH = 512
    BATCH_SIZE = 8

You can easily modify model names to experiment with different checkpoints.
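
One way a pipeline might read these defaults while still allowing a per-run override is sketched below. `resolve_model` is a hypothetical helper that does not appear in the project, and `Config` is abridged from the example above.

```python
from typing import Optional


class Config:
    # Abridged copy of the example above.
    DEFAULT_MODELS = {
        "sentiment": "distilbert-base-uncased-finetuned-sst-2-english",
        "textgen": "gpt2",
    }


def resolve_model(task: str, override: Optional[str] = None) -> str:
    """Return the checkpoint for a task, preferring an explicit override."""
    if override:
        return override
    try:
        return Config.DEFAULT_MODELS[task]
    except KeyError:
        known = ", ".join(sorted(Config.DEFAULT_MODELS))
        raise ValueError(f"Unknown task {task!r}; known tasks: {known}") from None


print(resolve_model("textgen"))
print(resolve_model("textgen", override="distilgpt2"))
```

Failing loudly on an unknown task name surfaces typos in `settings.py` before any model download starts.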

🧩 Extending the Playground

To create a new experiment (e.g., keyword extraction):

1. Duplicate src/pipelines/template.py → src/pipelines/keywords.py.
2. Implement the run() or analyze() logic using a new Hugging Face pipeline.
3. Create a Command in src/commands/keywords.py to interact with users.
4. Register the command inside src/main.py:

from src.commands.keywords import KeywordsCommand
cli.register_command(KeywordsCommand())

5. Optionally, add a model name in Config.DEFAULT_MODELS.
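
The steps above can be sketched end to end as follows. Everything here is illustrative: the real base classes live in src/cli/base.py and src/pipelines/template.py and may differ. `KeywordsPipeline` uses a naive word-frequency ranking as a placeholder instead of a Hugging Face model, and `InteractiveCLI` is a minimal stand-in exposing only `register_command`.

```python
import re
from collections import Counter
from typing import Dict, List

# Small illustrative stopword list (assumption, not from the project).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "with"}


class KeywordsPipeline:
    """Placeholder pipeline: ranks words by frequency instead of calling a model."""

    def analyze(self, text: str, top_k: int = 3) -> List[str]:
        words = re.findall(r"[a-z]+", text.lower())
        counts = Counter(word for word in words if word not in STOPWORDS)
        return [word for word, _ in counts.most_common(top_k)]


class KeywordsCommand:
    """Command layer: no ML logic, only delegation and formatting."""

    name = "keywords"
    description = "Extract keywords from a text"

    def __init__(self) -> None:
        self._pipeline = KeywordsPipeline()

    def execute(self, text: str) -> str:
        return "→ Keywords: " + ", ".join(self._pipeline.analyze(text))


class InteractiveCLI:
    """Minimal stand-in for the interactive shell in src/cli/base.py."""

    def __init__(self) -> None:
        self.commands: Dict[str, KeywordsCommand] = {}

    def register_command(self, command: KeywordsCommand) -> None:
        self.commands[command.name] = command


cli = InteractiveCLI()
cli.register_command(KeywordsCommand())
print(cli.commands["keywords"].execute("Transformers make transformers pipelines easy"))
```

Swapping the frequency heuristic for a model later only touches `KeywordsPipeline.analyze`. The command and its registration stay unchanged.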

🧪 Testing

You can use pytest for lightweight validation:

pip install pytest
pytest -q

Recommended structure:

tests/
├── test_sentiment.py
├── test_textgen.py
└── ...
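
A test file in this layout might look like the sketch below. It avoids loading a model by substituting a stub. `FakePipeline` and `run_sentiment` are hypothetical names: a real test would import the command from src.commands instead of defining a stand-in. pytest collects functions whose names start with `test_` and relies on plain `assert` statements.

```python
# tests/test_sentiment.py (illustrative)


class FakePipeline:
    """Stub that mimics a pipeline's analyze() without loading a model."""

    def __init__(self, label, score):
        self.result = {"label": label, "score": score}
        self.calls = []

    def analyze(self, text):
        self.calls.append(text)
        return self.result


def run_sentiment(pipeline, text):
    """Stand-in for the command logic under test."""
    result = pipeline.analyze(text)
    return f"{result['label']} ({result['score']:.2f})"


def test_sentiment_formats_label_and_score():
    fake = FakePipeline("NEGATIVE", 0.91)
    assert run_sentiment(fake, "meh") == "NEGATIVE (0.91)"


def test_sentiment_calls_pipeline_once():
    fake = FakePipeline("POSITIVE", 0.5)
    run_sentiment(fake, "hello")
    assert fake.calls == ["hello"]
```

Because the stub records its calls, the tests check both the formatting and the fact that the pipeline is invoked exactly once per input.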

🧰 Troubleshooting

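| Issue | Cause / Solution |
| --- | --- |
| `transformers` not found | The virtual environment is probably not active. Activate it (or use `poetry run`) and reinstall the dependencies. |
| Torch fails to install | Install the CPU-only build from the PyTorch package index, e.g. `pip install torch --index-url https://download.pytorch.org/whl/cpu`. |
| Models download slowly | Expected on the first run. Hugging Face caches downloaded models locally, so later runs start faster. |
| Unicode / accents broken | Ensure the terminal encoding is UTF‑8. |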
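
A quick way to narrow down the first and last rows of the table is to ask the interpreter itself. The script below is a small diagnostic sketch using only the standard library. `missing_packages` is a hypothetical helper name.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # If this path is not inside .venv (or Poetry's environment),
    # the virtual environment is not the one running the script.
    print("Interpreter:", sys.executable)
    print("stdout encoding:", sys.stdout.encoding)
    missing = missing_packages(["transformers", "torch"])
    print("Missing packages:", ", ".join(missing) or "none")
```

Run it with the same command used to start the CLI (for example through `poetry run python`) so it reports on the same interpreter.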

🧭 Development Guidelines

- Keep Command classes lightweight: no ML logic inside them.
- Reuse the Pipeline Template for new experiments.
- Format outputs consistently via the DisplayFormatter.
- Document all new models or commands in README.md and settings.py.
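
To illustrate the third guideline, here is a sketch of a formatter that renders fill-mask results in the style shown under Example Sessions. The method name `format_predictions` is an assumption, since the actual interface in src/cli/display.py is not documented here. The input dicts use the `token_str` and `score` keys that the `transformers` fill-mask pipeline returns.

```python
from typing import Dict, List


class DisplayFormatter:
    """Hypothetical formatter; the real one lives in src/cli/display.py."""

    @staticmethod
    def format_predictions(predictions: List[Dict[str, object]]) -> str:
        lines = ["→ Predictions:"]
        for rank, pred in enumerate(predictions, start=1):
            token = str(pred["token_str"]).strip()
            # Left-align the token in a 10-character column so scores line up.
            lines.append(f"  {rank}) {token:<10} score: {pred['score']:.2f}")
        return "\n".join(lines)


predictions = [
    {"token_str": "paris", "score": 0.87},
    {"token_str": "lyon", "score": 0.04},
]
print(DisplayFormatter.format_predictions(predictions))
```

Keeping all string formatting in one class means commands return raw results and never build console output themselves.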

🧱 Roadmap

- Add non-interactive CLI flags (--text, --task)
- Add multilingual model options
- Add automatic test coverage
- Add logging and profiling utilities
- Add export to JSON/CSV results
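
The first roadmap item could be approached with the standard library's `argparse`. The sketch below is only one possible design. The program name `ai-lab`, the helper names, and the fallback rule are assumptions. Only the `--task` and `--text` flag names come from the roadmap.

```python
import argparse

# Task names taken from the command list shown under Usage.
TASKS = ["sentiment", "fillmask", "textgen", "ner", "moderation"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="ai-lab", description="Run one task and exit."
    )
    parser.add_argument("--task", choices=TASKS, help="Task to run non-interactively")
    parser.add_argument("--text", help="Input text for the task")
    return parser


def wants_interactive(args: argparse.Namespace) -> bool:
    # Fall back to the interactive shell unless both flags are given.
    return args.task is None or args.text is None


args = build_parser().parse_args(["--task", "sentiment", "--text", "Great!"])
print(args.task, args.text, wants_interactive(args))
```

With this rule, running the program without flags keeps today's interactive behaviour, so the new flags stay backward compatible.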

🪪 License

You can include a standard open-source license such as MIT or Apache 2.0 depending on your use case.

🤝 Contributing

This repository is meant as an educational sandbox for experimenting with Transformers.
Pull requests are welcome for new models, better CLI UX, or educational improvements.

✨ Key Takeaways

- Modular and pedagogical design for training environments
- Clean separation between I/O, ML logic, and UX
- Easily extensible architecture for adding custom pipelines
- Perfect sandbox for students, researchers, and developers to learn modern NLP tools

🧩 Built for experimentation. Learn, break, and rebuild.
