🧠 AI Lab – Transformers CLI Playground
A pedagogical and technical project designed for AI practitioners and students to explore Hugging Face Transformers through an interactive Command-Line Interface (CLI) or a REST API.
This playground provides ready-to-use NLP pipelines — including Sentiment Analysis, Named Entity Recognition, Text Generation, Fill-Mask, Question Answering (QA), Moderation, and more — in a modular, extensible, and educational codebase.
📑 Table of Contents
- 📚 Overview
- 🗂️ Project Structure
- ⚙️ Installation
- ▶️ Usage
- 📡 API Endpoints
- 🖥️ CLI Examples
- 🧠 Architecture Overview
- ⚙️ Configuration
- 🧩 Extending the Playground
- 🧰 Troubleshooting
- 🧭 Development Guidelines
- 🧱 Roadmap
- 📜 License
📚 Overview
The AI Lab – Transformers CLI Playground enables users to explore multiple NLP tasks directly from the terminal or via HTTP APIs.
Each task (sentiment, NER, text generation, etc.) is implemented as a Command Module that communicates with a Pipeline Module powered by Hugging Face’s transformers library.
The project demonstrates clean ML code architecture with strict separation between:
- Configuration
- Pipelines
- CLI logic
- Display formatting
It’s a great educational resource for learning how to structure ML applications professionally.
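The command/pipeline split described above can be sketched in a few lines. The names here (`SentimentPipeline`, `SentimentCommand`) are illustrative placeholders, not the project's actual classes, and the Hugging Face call is stubbed so the snippet runs without downloading a model:

```python
class SentimentPipeline:
    """Wraps the ML logic; in the real project this layer would call
    transformers.pipeline("sentiment-analysis")."""

    def __init__(self, model_fn):
        # The model is injected so the command layer stays ML-free.
        self._model_fn = model_fn

    def run(self, text: str) -> dict:
        return self._model_fn(text)


class SentimentCommand:
    """User-facing command: delegates to the pipeline, formats the result."""

    def __init__(self, pipeline: SentimentPipeline):
        self._pipeline = pipeline

    def execute(self, text: str) -> str:
        result = self._pipeline.run(text)
        return f"Sentiment: {result['label']} (score: {result['score']:.3f})"


# Stubbed model so the example runs without network access or weights.
fake_model = lambda text: {"label": "POSITIVE", "score": 0.998}
command = SentimentCommand(SentimentPipeline(fake_model))
print(command.execute("I absolutely love this project!"))
```

Because the command never touches the model directly, swapping the stub for a real Hugging Face pipeline changes nothing in the CLI layer.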
🗂️ Project Structure
```
src/
├── main.py              # CLI entry point
│
├── cli/
│   ├── base.py          # CLICommand base class & interactive shell
│   └── display.py       # Console formatting utilities (colors, tables, results)
│
├── commands/            # User-facing commands wrapping pipeline logic
│   ├── sentiment.py     # Sentiment analysis command
│   ├── fillmask.py      # Masked token prediction
│   ├── textgen.py       # Text generation
│   ├── ner.py           # Named Entity Recognition
│   ├── qa.py            # Question Answering (extractive)
│   └── moderation.py    # Content moderation / toxicity detection
│
├── pipelines/           # ML logic based on Hugging Face pipelines
│   ├── template.py      # Blueprint for creating new pipelines
│   ├── sentiment.py
│   ├── fillmask.py
│   ├── textgen.py
│   ├── ner.py
│   ├── qa.py
│   └── moderation.py
│
├── api/
│   ├── app.py           # FastAPI app and endpoints
│   ├── models.py        # Pydantic schemas
│   └── config.py        # API configuration
│
└── config/
    └── settings.py      # Global configuration (models, params)
```
⚙️ Installation
🧾 Option 1 – Poetry (Recommended)
Poetry is the primary dependency manager for this project.

```shell
poetry shell
poetry install
```

This installs all dependencies defined in `pyproject.toml` (including `transformers`, `torch`, and `fastapi`).
Run the app:

```shell
# CLI mode
poetry run python src/main.py --mode cli

# API mode
poetry run python src/main.py --mode api
```
📦 Option 2 – Pip + requirements.txt
If you prefer manual dependency management:

```shell
python -m venv .venv
source .venv/bin/activate    # Linux/macOS
.venv\Scripts\Activate.ps1   # Windows
pip install -r requirements.txt
```
▶️ Usage
🖥️ CLI Mode
Run the interactive CLI:

```shell
python -m src.main --mode cli
```
Interactive menu:

```
Welcome to AI Lab - Transformers CLI Playground

Available commands:
  • sentiment   – Analyze the sentiment of a text
  • fillmask    – Predict masked words in a sentence
  • textgen     – Generate text from a prompt
  • ner         – Extract named entities from text
  • qa          – Answer questions from a context
  • moderation  – Detect toxic or unsafe content
```
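Under the hood, a shell like this typically maps command names to handlers. A minimal sketch of that dispatch pattern (the function and registry names are assumptions, not the project's actual API, which lives in `src/cli/base.py`):

```python
# Minimal command-dispatch sketch; handlers stand in for real command objects.
def run_command(name: str, text: str, registry: dict) -> str:
    handler = registry.get(name)
    if handler is None:
        return f"Unknown command: {name}"
    return handler(text)


# Stub handler standing in for the real sentiment command.
registry = {"sentiment": lambda text: "Sentiment: POSITIVE (score: 0.998)"}

print(run_command("sentiment", "I love it!", registry))
print(run_command("summarize", "some text", registry))  # not registered
```

Registering a new command is then just adding one entry to the registry, which is what makes the playground easy to extend.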
🌐 API Mode
Run the FastAPI server:

```shell
python -m src.main --mode api

# Custom config
python -m src.main --mode api --host 0.0.0.0 --port 8000 --reload
```
API Docs:
- Swagger → http://localhost:8000/docs
- ReDoc → http://localhost:8000/redoc
- OpenAPI → http://localhost:8000/openapi.json
📡 API Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health check and API info |
| GET | `/health` | Detailed health status |
Individual Processing
| Method | Endpoint | Description |
|---|---|---|
| POST | `/sentiment` | Analyze text sentiment |
| POST | `/fillmask` | Predict masked words |
| POST | `/textgen` | Generate text |
| POST | `/ner` | Extract named entities |
| POST | `/qa` | Question answering |
| POST | `/moderation` | Content moderation |
Batch Processing
| Method | Endpoint | Description |
|---|---|---|
| POST | `/sentiment/batch` | Process multiple texts |
| POST | `/fillmask/batch` | Fill multiple masked texts |
| POST | `/textgen/batch` | Generate from multiple prompts |
| POST | `/ner/batch` | Extract entities in batch |
| POST | `/qa/batch` | Answer questions in batch |
| POST | `/moderation/batch` | Moderate multiple texts |
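The individual endpoints accept a JSON body. Here is a client sketch using only the Python standard library; note that the payload field name (`"text"`) is an assumption about the Pydantic schema, not confirmed by this README:

```python
import json
import urllib.request

# Build (but don't send) a POST request to the /sentiment endpoint.
payload = json.dumps({"text": "I absolutely love this project!"}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8000/sentiment",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the server running:
#   result = json.load(urllib.request.urlopen(req))
print(req.get_method(), req.full_url)
```

The exact request/response shapes are documented interactively at `/docs` once the server is up.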
🖥️ CLI Examples
🔹 Sentiment Analysis
```
💬 Enter text: I absolutely love this project!
→ Sentiment: POSITIVE (score: 0.998)
```
🔹 Fill-Mask
```
💬 Enter text: The capital of France is [MASK].
→ Predictions:
   1) Paris   score: 0.87
   2) Lyon    score: 0.04
```
🔹 Text Generation
```
💬 Prompt: Once upon a time
→ Output: Once upon a time there was a young AI learning to code...
```
🔹 NER
```
💬 Enter text: Elon Musk founded SpaceX in California.
→ Entities:
   - Elon Musk (PERSON)
   - SpaceX (ORG)
   - California (LOC)
```
🔹 QA (Question Answering)
```
💬 Enter question: What is the capital of France?
💬 Enter context: France is a country in Europe. Its capital is Paris.
→ Answer: Paris
```

(The QA pipeline is extractive, so it returns a span from the context rather than a full sentence.)
🔹 Moderation
```
💬 Enter text: I hate everything!
→ Result: FLAGGED (toxic content detected)
```
🧠 Architecture Overview
Both CLI and API share the same pipeline layer, ensuring code reusability and consistency.
CLI Architecture
InteractiveCLI → Command Layer → Pipeline Layer → Display Layer
API Architecture
FastAPI App → Pydantic Models → Pipeline Layer → JSON Response
| Layer | Description |
|---|---|
| CLI | Manages user input/output and navigation. |
| API | Exposes endpoints with automatic OpenAPI docs. |
| Command | Encapsulates user-facing operations. |
| Pipeline | Wraps Hugging Face’s pipelines. |
| Models | Validates requests/responses. |
| Display | Formats console output. |
⚙️ Configuration
All configuration is centralized in `src/config/settings.py`:

```python
class Config:
    DEFAULT_MODELS = {
        "sentiment": "distilbert-base-uncased-finetuned-sst-2-english",
        "fillmask": "bert-base-uncased",
        "textgen": "gpt2",
        "ner": "dslim/bert-base-NER",
        "qa": "distilbert-base-cased-distilled-squad",
        "moderation": "unitary/toxic-bert",
    }
    MAX_LENGTH = 512
    BATCH_SIZE = 8
```
🧩 Extending the Playground
To add a new NLP experiment (e.g., keyword extraction):

1. Duplicate `src/pipelines/template.py` → `src/pipelines/keywords.py`
2. Create a command: `src/commands/keywords.py`
3. Register it in `src/main.py`
4. Add Pydantic models and an API endpoint
5. Update `Config.DEFAULT_MODELS`
Both CLI and API will automatically share this logic.
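As a sketch, the new `src/pipelines/keywords.py` might start like this. The class shape is an assumption about what `template.py` prescribes, and the extraction logic is a deliberately naive stand-in for a real model call:

```python
class KeywordsPipeline:
    """Hypothetical pipeline skeleton for keyword extraction."""

    task = "keywords"

    def __init__(self, model_name: str = "placeholder-keyword-model"):
        # In the real project this default would come from Config.DEFAULT_MODELS.
        self.model_name = model_name

    def run(self, text: str) -> list[str]:
        # A real implementation would call a Hugging Face pipeline here;
        # this naive fallback just keeps title-cased words.
        return [w.strip(".,") for w in text.split() if w.istitle()]


print(KeywordsPipeline().run("Paris is the capital of France."))
```

Once the command wraps this pipeline and the endpoint is registered, both interfaces pick it up without further changes.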
🧰 Troubleshooting
| Issue | Solution |
|---|---|
| `transformers` not found | Activate your virtual environment. |
| Torch install fails | Use the CPU-only wheel. |
| Models download slowly | Models are cached after the first use. |
| Encoding issues | Ensure your terminal uses UTF-8. |
API Issues
| Issue | Solution |
|---|---|
| FastAPI missing | `pip install fastapi uvicorn[standard]` |
| Port already in use | Change it with `--port 8001` |
| CORS error | Edit `allow_origins` in `api/config.py` |
| Validation error (422) | Check the request body |
| 500 error | Verify model loading |
🧭 Development Guidelines
- Keep command classes lightweight (no ML logic inside)
- Use the pipeline template for new tasks
- Format all outputs via `DisplayFormatter`
- Document new commands and models
🧱 Roadmap
- Non-interactive CLI flags (`--text`, `--task`)
- Multilingual models
- Test coverage
- Logging & profiling
- Export to JSON/CSV
📜 License
Licensed under the MIT License.
You are free to use, modify, and distribute this project.
✨ End of Documentation
The AI Lab – Transformers CLI Playground: built for learning, experimenting, and sharing NLP excellence.