# 🧠 AI Lab – Transformers CLI Playground

> A **pedagogical and technical project** designed for AI practitioners and students to experiment with Hugging Face Transformers through an **interactive Command‑Line Interface (CLI)**.
> This playground provides ready‑to‑use NLP pipelines (Sentiment Analysis, Named Entity Recognition, Text Generation, Fill‑Mask, Moderation, etc.) in a modular, extensible, and educational codebase.

---

## πŸ“š Overview

The **AI Lab – Transformers CLI Playground** allows you to explore multiple natural language processing tasks directly from the terminal.
Each task (e.g., sentiment, NER, text generation) is implemented as a **Command Module**, which interacts with a **Pipeline Module** built on top of the `transformers` library.

The lab is intentionally structured to demonstrate **clean software design for ML codebases**, with strict separation between configuration, pipelines, CLI logic, and display formatting.

---

## πŸ—‚οΈ Project Structure

```text
src/
β”œβ”€β”€ __init__.py
β”œβ”€β”€ main.py              # CLI entry point
β”‚
β”œβ”€β”€ cli/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ base.py          # CLICommand base class & interactive shell handler
β”‚   └── display.py       # Console formatting utilities (tables, colors, results)
β”‚
β”œβ”€β”€ commands/            # User-facing commands wrapping pipeline logic
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ sentiment.py     # Sentiment analysis command
β”‚   β”œβ”€β”€ fillmask.py      # Masked token prediction command
β”‚   β”œβ”€β”€ textgen.py       # Text generation command
β”‚   β”œβ”€β”€ ner.py           # Named Entity Recognition command
β”‚   └── moderation.py    # Toxicity / content moderation command
β”‚
β”œβ”€β”€ pipelines/           # Machine learning logic (Hugging Face Transformers)
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ template.py      # Blueprint for creating new pipelines
β”‚   β”œβ”€β”€ sentiment.py
β”‚   β”œβ”€β”€ fillmask.py
β”‚   β”œβ”€β”€ textgen.py
β”‚   β”œβ”€β”€ ner.py
β”‚   └── moderation.py
β”‚
β”œβ”€β”€ api/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ app.py           # FastAPI application with all endpoints
β”‚   β”œβ”€β”€ models.py        # Pydantic request/response models
β”‚   └── config.py        # API-specific configuration
β”‚
└── config/
    β”œβ”€β”€ __init__.py
    └── settings.py      # Global configuration (default models, parameters)
```

---

## βš™οΈ Installation

### 🧾 Option 1 – Using Poetry (Recommended)

> Poetry is used as the main dependency manager.

```bash
# 1. Create and activate a virtual environment
poetry shell

# 2. Install dependencies
poetry install
```

This automatically installs all dependencies declared in `pyproject.toml`, including **transformers**, **torch**, and **FastAPI** for the API mode.

To run the application inside the Poetry environment:

```bash
# CLI mode
poetry run python src/main.py --mode cli

# API mode
poetry run python src/main.py --mode api
```

---

### πŸ“¦ Option 2 – Using pip and requirements.txt

If you prefer managing dependencies manually with `requirements.txt`:

```bash
# 1. Create a virtual environment
python -m venv .venv

# 2. Activate it
# Linux/macOS
source .venv/bin/activate
# Windows PowerShell
.venv\Scripts\Activate.ps1

# 3. Install dependencies
pip install -r requirements.txt
```
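Before launching either mode, you can sanity-check the environment with a short smoke test. This is a minimal sketch (not part of the repository): it downloads the library's default sentiment checkpoint on the first run, so an internet connection is required.

```python
# smoke_test.py - quick check that transformers and torch are installed correctly
from transformers import pipeline

# No model specified: the library falls back to its default sentiment checkpoint
classifier = pipeline("sentiment-analysis")

# Expected output shape: [{'label': 'POSITIVE', 'score': ...}]
print(classifier("The environment is set up correctly!"))
```

---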
## ▢️ Usage

The application supports two modes: **CLI** (interactive) and **API** (REST server).

### πŸ–₯️ CLI Mode

Launch the interactive CLI with:

```bash
python -m src.main --mode cli

# or, if using Poetry
poetry run python src/main.py --mode cli
```

You'll see an interactive menu listing the available commands:

```text
Welcome to AI Lab - Transformers CLI Playground

Available commands:
  β€’ sentiment   – Analyze the sentiment of a text
  β€’ fillmask    – Predict masked words in a sentence
  β€’ textgen     – Generate text from a prompt
  β€’ ner         – Extract named entities from text
  β€’ moderation  – Detect toxic or unsafe content
```

### 🌐 API Mode

Launch the FastAPI server with:

```bash
python -m src.main --mode api

# or with custom settings
python -m src.main --mode api --host 0.0.0.0 --port 8000 --reload
```

The API will be available at:

- **Swagger Documentation**: http://localhost:8000/docs
- **ReDoc Documentation**: http://localhost:8000/redoc
- **OpenAPI Schema**: http://localhost:8000/openapi.json

---

## πŸ“‘ API Endpoints

The REST API provides all CLI functionality through HTTP endpoints.

### Core Endpoints

| Method | Endpoint  | Description                      |
| ------ | --------- | -------------------------------- |
| `GET`  | `/`       | Health check and API information |
| `GET`  | `/health` | Detailed health status           |

### Individual Processing

| Method | Endpoint      | Description              | Input                                                               |
| ------ | ------------- | ------------------------ | ------------------------------------------------------------------- |
| `POST` | `/sentiment`  | Analyze text sentiment   | `{"text": "string", "model": "optional"}`                           |
| `POST` | `/fillmask`   | Fill masked words        | `{"text": "Hello [MASK]", "model": "optional"}`                     |
| `POST` | `/textgen`    | Generate text            | `{"text": "prompt", "model": "optional"}`                           |
| `POST` | `/ner`        | Named entity recognition | `{"text": "string", "model": "optional"}`                           |
| `POST` | `/qa`         | Question answering       | `{"question": "string", "context": "string", "model": "optional"}`  |
| `POST` | `/moderation` | Content moderation       | `{"text": "string", "model": "optional"}`                           |

### Batch Processing

| Method | Endpoint            | Description                          | Input                                                 |
| ------ | ------------------- | ------------------------------------ | ----------------------------------------------------- |
| `POST` | `/sentiment/batch`  | Process multiple texts               | `{"texts": ["text1", "text2"], "model": "optional"}`  |
| `POST` | `/fillmask/batch`   | Fill multiple masked texts           | `{"texts": ["text1 [MASK]"], "model": "optional"}`    |
| `POST` | `/textgen/batch`    | Generate from multiple prompts       | `{"texts": ["prompt1"], "model": "optional"}`         |
| `POST` | `/ner/batch`        | Extract entities from multiple texts | `{"texts": ["text1"], "model": "optional"}`           |
| `POST` | `/moderation/batch` | Moderate multiple texts              | `{"texts": ["text1"], "model": "optional"}`           |

### Example API Usage

#### πŸ”Ή Sentiment Analysis

```bash
curl -X POST "http://localhost:8000/sentiment" \
  -H "Content-Type: application/json" \
  -d '{"text": "I absolutely love this project!"}'
```

Response:

```json
{
  "success": true,
  "label": "POSITIVE",
  "score": 0.998,
  "model_used": "distilbert-base-uncased-finetuned-sst-2-english"
}
```

#### πŸ”Ή Named Entity Recognition

```bash
curl -X POST "http://localhost:8000/ner" \
  -H "Content-Type: application/json" \
  -d '{"text": "Elon Musk founded SpaceX in California."}'
```

Response:

```json
{
  "success": true,
  "entities": [
    { "word": "Elon Musk", "label": "PERSON", "score": 0.999 },
    { "word": "SpaceX", "label": "ORG", "score": 0.998 },
    { "word": "California", "label": "LOC", "score": 0.995 }
  ],
  "model_used": "dslim/bert-base-NER"
}
```
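#### πŸ”Ή Calling the API from Python

The same endpoints can also be called from Python. Below is a minimal sketch using the `requests` package (assumed to be installed separately; it is not a dependency of the project itself):

```python
import requests

# Assumes the API server is running locally (python -m src.main --mode api)
response = requests.post(
    "http://localhost:8000/sentiment",
    json={"text": "I absolutely love this project!"},
    timeout=60,  # the first call may be slow while the model loads
)
print(response.json())
```

Any HTTP client works the same way, since every endpoint is plain JSON over `POST`.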
"http://localhost:8000/sentiment/batch" \ -H "Content-Type: application/json" \ -d '{"texts": ["Great product!", "Terrible experience", "It was okay"]}' ``` Response: ```json { "success": true, "results": [ { "label": "POSITIVE", "score": 0.998 }, { "label": "NEGATIVE", "score": 0.995 }, { "label": "NEUTRAL", "score": 0.876 } ], "model_used": "distilbert-base-uncased-finetuned-sst-2-english" } ``` --- ## πŸ–₯️ CLI Examples #### πŸ”Ή Sentiment Analysis ```text πŸ’¬ Enter text: I absolutely love this project! β†’ Sentiment: POSITIVE (score: 0.998) ``` #### πŸ”Ή Fill‑Mask ```text πŸ’¬ Enter text: The capital of France is [MASK]. β†’ Predictions: 1) Paris score: 0.87 2) Lyon score: 0.04 3) London score: 0.02 ``` #### πŸ”Ή Text Generation ```text πŸ’¬ Prompt: Once upon a time β†’ Output: Once upon a time there was a young AI learning to code... ``` #### πŸ”Ή NER (Named Entity Recognition) ```text πŸ’¬ Enter text: Elon Musk founded SpaceX in California. β†’ Entities: - Elon Musk (PERSON) - SpaceX (ORG) - California (LOC) ``` #### πŸ”Ή Moderation ```text πŸ’¬ Enter text: I hate everything! β†’ Result: FLAGGED (toxic content detected) ``` --- ## 🧠 Architecture Overview The application supports dual-mode architecture: **CLI** (interactive) and **API** (REST server), both sharing the same pipeline layer: ### CLI Architecture ```text β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ InteractiveCLI β”‚ β”‚ (src/cli/base.py) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Command Layer β”‚ ← e.g. sentiment.py β”‚ (user commands) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Pipeline Layer β”‚ ← e.g. pipelines/sentiment.py β”‚ (ML logic) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Display Layer β”‚ ← cli/display.py β”‚ (format output) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` ### API Architecture ```text β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ FastAPI App β”‚ β”‚ (src/api/app.py) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Pydantic Models β”‚ ← api/models.py β”‚ (validation) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Pipeline Layer β”‚ ← e.g. pipelines/sentiment.py β”‚ (ML logic) β”‚ (shared with CLI) β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ JSON Response β”‚ ← automatic serialization β”‚ (HTTP output) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` ### Key Concepts | Layer | Description | | ------------ | -------------------------------------------------------------------------- | | **CLI** | Manages user input/output, help menus, and navigation between commands. | | **API** | FastAPI application serving HTTP endpoints with automatic documentation. | | **Command** | Encapsulates a single user-facing operation (e.g., run sentiment). | | **Pipeline** | Wraps Hugging Face's `transformers.pipeline()` to perform inference. | | **Models** | Pydantic schemas for request/response validation and serialization. 
---

## βš™οΈ Configuration

All configuration is centralized in `src/config/settings.py`. Example:

```python
class Config:
    DEFAULT_MODELS = {
        "sentiment": "distilbert-base-uncased-finetuned-sst-2-english",
        "fillmask": "bert-base-uncased",
        "textgen": "gpt2",
        "ner": "dslim/bert-base-NER",
        "moderation": "unitary/toxic-bert",
    }
    MAX_LENGTH = 512
    BATCH_SIZE = 8
```

You can easily modify the model names to experiment with different checkpoints.

---

## 🧩 Extending the Playground

To create a new experiment (e.g., keyword extraction):

### For CLI Support

1. **Duplicate** `src/pipelines/template.py` β†’ `src/pipelines/keywords.py`
   Implement the `run()` or `analyze()` logic using a new Hugging Face pipeline.

2. **Create a Command** in `src/commands/keywords.py` to interact with users.

3. **Register the command** inside `src/main.py`:

   ```python
   from src.commands.keywords import KeywordsCommand

   cli.register_command(KeywordsCommand())
   ```

### For API Support

4. **Add Pydantic models** in `src/api/models.py`:

   ```python
   from typing import List, Optional

   from pydantic import BaseModel


   class KeywordsRequest(BaseModel):
       text: str
       model: Optional[str] = None


   class KeywordsResponse(BaseModel):
       success: bool
       keywords: List[str]
       model_used: str
   ```

5. **Add an endpoint** in `src/api/app.py`:

   ```python
   @app.post("/keywords", response_model=KeywordsResponse)
   async def extract_keywords(request: KeywordsRequest):
       # Implementation using the KeywordsAnalyzer pipeline
       pass
   ```

6. **Update the configuration** in `Config.DEFAULT_MODELS`.

Both CLI and API will automatically share the same pipeline implementation!

---

## πŸ§ͺ Testing

You can use `pytest` for lightweight validation:

```bash
pip install pytest
pytest -q
```

Recommended structure:

```text
tests/
β”œβ”€β”€ test_sentiment.py
β”œβ”€β”€ test_textgen.py
└── ...
```

---

## 🧰 Troubleshooting

### General Issues

| Issue                        | Cause / Solution                              |
| ---------------------------- | --------------------------------------------- |
| **`transformers` not found** | Check virtual environment activation.         |
| **Torch fails to install**   | Install the CPU-only version from the PyTorch index. |
| **Models download slowly**   | Hugging Face caches them after the first run. |
| **Unicode / accents broken** | Ensure terminal encoding is UTF‑8.            |

### API-Specific Issues

| Issue                         | Cause / Solution                                        |
| ----------------------------- | -------------------------------------------------------- |
| **`FastAPI` not found**       | Install with `pip install fastapi "uvicorn[standard]"`.  |
| **Port already in use**       | Use `--port 8001` or kill the process on port 8000.     |
| **CORS errors in browser**    | Check `allow_origins` in `src/api/config.py`.            |
| **422 Validation Error**      | Check that the request body matches the Pydantic models. |
| **500 Internal Server Error** | Check model loading and pipeline initialization.         |
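For the CORS case in particular, the fix usually amounts to adjusting the allowed origins where the FastAPI middleware is registered. A sketch of the typical setup is shown below; the example origin is illustrative, and in this project the actual values belong in `src/api/config.py`.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI(title="AI Lab - Transformers Playground")

# Restrict allow_origins to the frontends that actually call the API.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # illustrative origin
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```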
### Quick API Health Check

```bash
# Test if the API is running
curl http://localhost:8000/health

# Test a basic endpoint
curl -X POST "http://localhost:8000/sentiment" \
  -H "Content-Type: application/json" \
  -d '{"text": "test"}'
```

---

## 🧭 Development Guidelines

- Keep **Command** classes lightweight β€” no ML logic inside them.
- Reuse the **Pipeline Template** for new experiments.
- Format outputs consistently via the `DisplayFormatter`.
- Document all new models or commands in `README.md` and `settings.py`.

---

## 🧱 Roadmap

- [ ] Add non-interactive CLI flags (`--text`, `--task`)
- [ ] Add multilingual model options
- [ ] Add automatic test coverage
- [ ] Add logging and profiling utilities
- [ ] Add export of results to JSON/CSV

---

## πŸ“œ License

This project is licensed under the [MIT License](./LICENSE) β€” feel free to use it, modify it, and share it!

---