Compare commits

...

7 Commits

26 changed files with 6061 additions and 479 deletions

1
.gitignore vendored
View File

@@ -22,6 +22,7 @@ var/
*.egg-info/
.installed.cfg
*.egg
.github/
# Virtual environments
venv/

178
POSTMAN_GUIDE.md Normal file
View File

@@ -0,0 +1,178 @@
# 📮 Postman Collection - AI Lab API
## 📋 Overview
This complete Postman collection covers all AI Lab API endpoints, with ready-to-use examples for testing every NLP pipeline.
## 📁 Included Files
- **`AI_Lab_API.postman_collection.json`** - Main collection with all endpoints
- **`AI_Lab_API.postman_environment.json`** - Environment with configurable variables
- **`POSTMAN_GUIDE.md`** - This usage guide
## 🚀 Installation and Setup
### 1. Import into Postman
1. Open Postman
2. Click **Import** (top-left button)
3. Select **Upload Files**
4. Import both files:
   - `AI_Lab_API.postman_collection.json`
   - `AI_Lab_API.postman_environment.json`
### 2. Configure the environment
1. Click the **Settings** icon (⚙️) at the top right
2. Select **"AI Lab API Environment"**
3. Edit `base_url` if needed (default: `http://localhost:8000`)
### 3. Start the API
Before using Postman, make sure the API is running:
```bash
# From the project directory
python -m src.main --mode api
# or
poetry run python src/main.py --mode api --host 0.0.0.0 --port 8000
```
## 📊 Collection Structure
### 🏠 Core Endpoints
- **Root** - General API information
- **Health Check** - API status and loaded pipelines
### 💭 Sentiment Analysis
- **Analyze Sentiment - Positive** - Test with positive text
- **Analyze Sentiment - Negative** - Test with negative text
- **Analyze Sentiment - Custom Model** - Test with a custom model
- **Batch Sentiment Analysis** - Batch processing
### 🏷️ Named Entity Recognition
- **Extract Entities - People & Organizations** - Person/organization entities
- **Extract Entities - Geographic** - Geographic entities
- **Batch NER Processing** - Batch processing
### ❓ Question Answering
- **Simple Q&A** - Simple questions
- **Technical Q&A** - Technical questions
### 🎭 Fill Mask
- **Fill Simple Mask** - Simple masks
- **Fill Technical Mask** - Technical masks
- **Batch Fill Mask** - Batch processing
### 🛡️ Content Moderation
- **Check Safe Content** - Safe content
- **Check Potentially Toxic Content** - Potentially toxic content
- **Batch Content Moderation** - Batch processing
### ✍️ Text Generation
- **Generate Creative Text** - Creative generation
- **Generate Technical Text** - Technical generation
- **Batch Text Generation** - Batch processing
### 🧪 Testing & Examples
- **Complete Pipeline Test** - End-to-end test
- **Error Handling Test - Empty Text** - Error handling (empty text)
- **Error Handling Test - Invalid Model** - Error handling (invalid model)
## 🔧 Advanced Usage
### Available environment variables
| Variable          | Description          | Default value           |
| ----------------- | -------------------- | ----------------------- |
| `base_url`        | Base URL of the API  | `http://localhost:8000` |
| `api_version`     | API version          | `1.0.0`                 |
| `timeout`         | Request timeout (ms) | `30000`                 |
| `default_*_model` | Default models       | See environment         |
### Customizing models
You can test different models by editing the `model_name` field in the request body:
```json
{
  "text": "Your text here",
  "model_name": "cardiffnlp/twitter-roberta-base-sentiment-latest"
}
```
### Automatic tests
Every request includes automatic tests:
- ✅ Response time < 30 seconds
- ✅ Content-Type header present
- ✅ Automatic logs in the console
## 📈 Usage Examples
### 1. Quick API test
1. Run **"Health Check"** to verify that the API is up
2. Try **"Analyze Sentiment - Positive"** as a first test
### 2. Full pipeline test
1. Start with a simple test (e.g., positive sentiment)
2. Test with a custom model
3. Test batch processing
4. Test error handling
### 3. Performance benchmark
1. Use **"Batch Text Generation"** with several prompts (see the snippet below)
2. Watch response times in the Tests tab
3. Adjust the timeout if needed
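Outside Postman, the same benchmark can be scripted. A minimal sketch with Python's `requests` library (it assumes the API is running on the default `base_url`):
```python
# Rough latency check for batch generation (sketch; not part of the collection)
import time
import requests

prompts = ["In the future, AI will", "The best way to learn programming is"]
start = time.time()
resp = requests.post(
    "http://localhost:8000/textgen/batch",
    json={"texts": prompts},
    timeout=30,
)
print(f"HTTP {resp.status_code} in {time.time() - start:.2f}s")
```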
## 🐛 Troubleshooting
### API not reachable
- Check that the API is running on the right port
- Edit `base_url` in the environment if needed
### 422 errors (Validation Error)
- Check the JSON format of the request body
- Make sure all required fields are present
### 503 errors (Service Unavailable)
- Pipeline not loaded - check the API logs
- Restart the API if needed
### Timeouts
- Increase the `timeout` value in the environment
- Some models can be slow to load the first time
## 🎯 Best Practices
1. **Always start with Health Check** to verify the API state
2. **Use the environment** to centralize configuration
3. **Check the logs** in the Postman console when debugging
4. **Test incrementally**: simple → custom model → batch → error cases
5. **Document your tests** by adding descriptions to requests
## 🔗 Useful Links
- **Swagger documentation**: http://localhost:8000/docs (when the API is running)
- **ReDoc documentation**: http://localhost:8000/redoc
- **OpenAPI schema**: http://localhost:8000/openapi.json
---
**Happy Testing! 🚀**

333
README.md
View File

@@ -1,16 +1,54 @@
# 🧠 AI Lab Transformers CLI Playground
> A **pedagogical and technical project** designed for AI practitioners and students to experiment with Hugging Face Transformers through an **interactive Command-Line Interface (CLI)**.
> This playground provides ready-to-use NLP pipelines (Sentiment Analysis, Named Entity Recognition, Text Generation, Fill-Mask, Moderation, etc.) in a modular, extensible, and educational codebase.
> A **pedagogical and technical project** designed for AI practitioners and students to explore **Hugging Face Transformers** through an **interactive Command-Line Interface (CLI)** or a **REST API**.
> This playground provides ready-to-use NLP pipelines — including **Sentiment Analysis**, **Named Entity Recognition**, **Text Generation**, **Fill-Mask**, **Question Answering (QA)**, **Moderation**, and more — in a modular, extensible, and educational codebase.
---
<p align="center">
<img src="https://img.shields.io/badge/Python-3.13-blue.svg" alt="Python"/>
<img src="https://img.shields.io/badge/Built_with-Poetry-purple.svg" alt="Poetry"/>
<img src="https://img.shields.io/badge/🤗-Transformers-orange.svg" alt="Transformers"/>
<img src="https://img.shields.io/badge/License-MIT-green.svg" alt="License"/>
</p>
---
## 📑 Table of Contents
- [📚 Overview](#-overview)
- [🗂️ Project Structure](#-project-structure)
- [⚙️ Installation](#-installation)
- [🧾 Option 1 – Poetry (Recommended)](#-option-1--poetry-recommended)
- [📦 Option 2 – Pip + Requirements](#-option-2--pip--requirements)
- [▶️ Usage](#-usage)
- [🖥️ CLI Mode](#-cli-mode)
- [🌐 API Mode](#-api-mode)
- [📡 API Endpoints](#-api-endpoints)
- [🖥️ CLI Examples](#-cli-examples)
- [🧠 Architecture Overview](#-architecture-overview)
- [⚙️ Configuration](#-configuration)
- [🧩 Extending the Playground](#-extending-the-playground)
- [🧰 Troubleshooting](#-troubleshooting)
- [🧭 Development Guidelines](#-development-guidelines)
- [🧱 Roadmap](#-roadmap)
- [📜 License](#-license)
---
## 📚 Overview
The **AI Lab Transformers CLI Playground** allows you to explore multiple natural language processing tasks directly from the terminal.
Each task (e.g., sentiment, NER, text generation) is implemented as a **Command Module**, which interacts with a **Pipeline Module** built on top of the `transformers` library.
The **AI Lab Transformers CLI Playground** enables users to explore **multiple NLP tasks directly from the terminal or via HTTP APIs**.
Each task (sentiment, NER, text generation, etc.) is implemented as a **Command Module** that communicates with a **Pipeline Module** powered by Hugging Face's `transformers` library.
The lab is intentionally structured to demonstrate **clean software design for ML codebases** — with strict separation between configuration, pipelines, CLI logic, and display formatting.
The project demonstrates **clean ML code architecture** with strict separation between:
- Configuration
- Pipelines
- CLI logic
- Display formatting
It's a great educational resource for learning **how to structure ML applications** professionally.
---
@@ -18,77 +56,74 @@ The lab is intentionally structured to demonstrate **clean software design for M
```text
src/
├── __init__.py
├── main.py # CLI entry point
├── cli/
│ ├── __init__.py
│ ├── base.py # CLICommand base class & interactive shell handler
│ └── display.py # Console formatting utilities (tables, colors, results)
│ ├── base.py # CLICommand base class & interactive shell
│ └── display.py # Console formatting utilities (colors, tables, results)
├── commands/ # User-facing commands wrapping pipeline logic
│ ├── __init__.py
│ ├── sentiment.py # Sentiment analysis command
│ ├── fillmask.py # Masked token prediction command
│ ├── textgen.py # Text generation command
│ ├── ner.py # Named Entity Recognition command
│ └── moderation.py # Toxicity / content moderation command
│ ├── fillmask.py # Masked token prediction
│ ├── textgen.py # Text generation
│ ├── ner.py # Named Entity Recognition
│ ├── qa.py # Question Answering (extractive)
│ └── moderation.py # Content moderation / toxicity detection
├── pipelines/ # Machine learning logic (Hugging Face Transformers)
│ ├── __init__.py
├── pipelines/ # ML logic based on Hugging Face pipelines
│ ├── template.py # Blueprint for creating new pipelines
│ ├── sentiment.py
│ ├── fillmask.py
│ ├── textgen.py
│ ├── ner.py
│ ├── qa.py
│ └── moderation.py
├── api/
│ ├── app.py # FastAPI app and endpoints
│ ├── models.py # Pydantic schemas
│ └── config.py # API configuration
└── config/
├── __init__.py
└── settings.py # Global configuration (default models, parameters)
└── settings.py # Global configuration (models, params)
```
---
## ⚙️ Installation
### 🧾 Option 1 – Using Poetry (Recommended)
### 🧾 Option 1 – Poetry (Recommended)
> Poetry is used as the main dependency manager.
> Poetry is the main dependency manager for this project.
```bash
# 1. Create and activate a new virtual environment
poetry shell
# 2. Install dependencies
poetry install
```
This will automatically install all dependencies declared in `pyproject.toml`, including **transformers** and **torch**.
This installs all dependencies defined in `pyproject.toml` (including `transformers`, `torch`, and `fastapi`).
To run the CLI inside the Poetry environment:
Run the app:
```bash
poetry run python src/main.py
# CLI mode
poetry run python src/main.py --mode cli
# API mode
poetry run python src/main.py --mode api
```
---
### 📦 Option 2 – Using pip and requirements.txt
### 📦 Option 2 – Pip + requirements.txt
If you prefer using `requirements.txt` manually:
If you prefer manual dependency management:
```bash
# 1. Create a virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\Activate.ps1 # Windows
# 2. Activate it
# Linux/macOS
source .venv/bin/activate
# Windows PowerShell
.venv\Scripts\Activate.ps1
# 3. Install dependencies
pip install -r requirements.txt
```
@@ -96,15 +131,15 @@ pip install -r requirements.txt
## ▶️ Usage
Once installed, launch the CLI with:
### 🖥️ CLI Mode
Run the interactive CLI:
```bash
python -m src.main
# or, if using Poetry
poetry run python src/main.py
python -m src.main --mode cli
```
You'll see an interactive menu listing the available commands:
Interactive menu:
```
Welcome to AI Lab - Transformers CLI Playground
@@ -113,36 +148,89 @@ Available commands:
• fillmask Predict masked words in a sentence
• textgen Generate text from a prompt
• ner Extract named entities from text
• qa Answer questions from a context
• moderation Detect toxic or unsafe content
```
### Example Sessions
---
#### 🔹 Sentiment Analysis
### 🌐 API Mode
Run the FastAPI server:
```bash
python -m src.main --mode api
# Custom config
python -m src.main --mode api --host 0.0.0.0 --port 8000 --reload
```
API Docs:
- **Swagger** → http://localhost:8000/docs
- **ReDoc** → http://localhost:8000/redoc
- **OpenAPI** → http://localhost:8000/openapi.json
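For a quick scripted check that the server is up, a minimal sketch (the response keys follow the `/health` endpoint in `src/api/app.py`):
```python
import requests

# Expect e.g. {"status": "healthy", "pipelines_loaded": 6, "available_pipelines": [...]}
print(requests.get("http://localhost:8000/health", timeout=5).json())
```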
---
## 📡 API Endpoints
### Core Endpoints
| Method | Endpoint | Description |
| ------ | --------- | ------------------------- |
| `GET` | `/` | Health check and API info |
| `GET` | `/health` | Detailed health status |
### Individual Processing
| Method | Endpoint | Description |
| ------ | ------------- | ---------------------- |
| `POST` | `/sentiment` | Analyze text sentiment |
| `POST` | `/fillmask` | Predict masked words |
| `POST` | `/textgen` | Generate text |
| `POST` | `/ner` | Extract named entities |
| `POST` | `/qa` | Question answering |
| `POST` | `/moderation` | Content moderation |
### Batch Processing
| Method | Endpoint | Description |
| ------ | ------------------- | -------------------------- |
| `POST` | `/sentiment/batch` | Process multiple texts |
| `POST` | `/fillmask/batch` | Fill multiple masked texts |
| `POST` | `/textgen/batch` | Generate from prompts |
| `POST` | `/ner/batch` | Extract entities in batch |
| `POST` | `/qa/batch` | Answer questions in batch |
| `POST` | `/moderation/batch` | Moderate multiple texts |
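As a quick sanity check of these routes, here is a hedged sketch using Python's `requests`; the request fields (`text`, `texts`, optional `model_name`) follow the Pydantic models in `src/api/models.py`:
```python
import requests

BASE = "http://localhost:8000"

# Single text (TextRequest)
r = requests.post(f"{BASE}/sentiment", json={"text": "I love this project!"})
print(r.json())  # e.g. {'success': True, 'sentiment': 'POSITIVE', 'confidence': 0.99, ...}

# Batch (TextListRequest)
r = requests.post(f"{BASE}/sentiment/batch", json={"texts": ["Great!", "Awful."]})
print(r.json()["processed_count"])  # 2
```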
---
## 🖥️ CLI Examples
### 🔹 Sentiment Analysis
```text
💬 Enter text: I absolutely love this project!
→ Sentiment: POSITIVE (score: 0.998)
```
#### 🔹 FillMask
### 🔹 Fill-Mask
```text
💬 Enter text: The capital of France is [MASK].
→ Predictions:
1) Paris score: 0.87
2) Lyon score: 0.04
3) London score: 0.02
```
#### 🔹 Text Generation
### 🔹 Text Generation
```text
💬 Prompt: Once upon a time
→ Output: Once upon a time there was a young AI learning to code...
```
#### 🔹 NER (Named Entity Recognition)
### 🔹 NER
```text
💬 Enter text: Elon Musk founded SpaceX in California.
@@ -152,7 +240,15 @@ Available commands:
- California (LOC)
```
#### 🔹 Moderation
### 🔹 QA (Question Answering)
```text
💬 Enter question: What is the capital of France?
💬 Enter context: France is a country in Europe. Its capital is Paris.
→ Answer: Paris
```
### 🔹 Moderation
```text
💬 Enter text: I hate everything!
@@ -163,50 +259,34 @@ Available commands:
## 🧠 Architecture Overview
The internal structure follows a clean **Command ↔ Pipeline ↔ Display** pattern:
Both CLI and API share the **same pipeline layer**, ensuring code reusability and consistency.
### CLI Architecture
```text
┌──────────────────────┐
│ InteractiveCLI │
│ (src/cli/base.py) │
└──────────┬───────────┘
┌─────────────────┐
│ Command Layer │ ← e.g. sentiment.py
│ (user commands) │
└───────┬─────────┘
┌─────────────────┐
│ Pipeline Layer │ ← e.g. pipelines/sentiment.py
│ (ML logic) │
└───────┬─────────┘
┌─────────────────┐
│ Display Layer │ ← cli/display.py
│ (format output) │
└─────────────────┘
InteractiveCLI → Command Layer → Pipeline Layer → Display Layer
```
### Key Concepts
### API Architecture
```text
FastAPI App → Pydantic Models → Pipeline Layer → JSON Response
```
| Layer | Description |
| ------------ | -------------------------------------------------------------------------- |
| **CLI** | Manages user input/output, help menus, and navigation between commands. |
| **Command** | Encapsulates a single user-facing operation (e.g., run sentiment). |
| **Pipeline** | Wraps Hugging Face's `transformers.pipeline()` to perform inference.       |
| **Display** | Handles clean console rendering (colored output, tables, JSON formatting). |
| **Config** | Centralizes model names, limits, and global constants. |
| ------------ | ---------------------------------------------- |
| **CLI** | Manages user input/output and navigation. |
| **API** | Exposes endpoints with automatic OpenAPI docs. |
| **Command** | Encapsulates user-facing operations. |
| **Pipeline** | Wraps Hugging Face's pipelines.                |
| **Models** | Validates requests/responses. |
| **Display** | Formats console output. |
---
## ⚙️ Configuration
All configuration is centralized in `src/config/settings.py`.
Example:
All configuration is centralized in `src/config/settings.py`:
```python
class Config:
@@ -215,88 +295,75 @@ class Config:
"fillmask": "bert-base-uncased",
"textgen": "gpt2",
"ner": "dslim/bert-base-NER",
"moderation":"unitary/toxic-bert"
"qa": "distilbert-base-cased-distilled-squad",
"moderation":"unitary/toxic-bert",
}
MAX_LENGTH = 512
BATCH_SIZE = 8
```
You can easily modify model names to experiment with different checkpoints.
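For example, to try another checkpoint without editing `settings.py` (a sketch; it assumes pipeline constructors accept a model name, as `src/api/app.py` does with `SentimentAnalyzer(request.model_name)`):
```python
from src.config.settings import Config
from src.pipelines.sentiment import SentimentAnalyzer

print(Config.DEFAULT_MODELS["sentiment"])  # checkpoint used by default

# Load an alternative checkpoint just for this run
analyzer = SentimentAnalyzer("cardiffnlp/twitter-roberta-base-sentiment-latest")
print(analyzer.analyze("I love this project!"))
```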
---
## 🧩 Extending the Playground
To create a new experiment (e.g., keyword extraction):
To add a new NLP experiment (e.g., keyword extraction):
1. **Duplicate** `src/pipelines/template.py` → `src/pipelines/keywords.py`
Implement the `run()` or `analyze()` logic using a new Hugging Face pipeline.
1. Duplicate `src/pipelines/template.py` → `src/pipelines/keywords.py` (see the sketch after these steps)
2. Create a command: `src/commands/keywords.py`
3. Register it in `src/main.py`
4. Add Pydantic models and API endpoint
5. Update `Config.DEFAULT_MODELS`
2. **Create a Command** in `src/commands/keywords.py` to interact with users.
3. **Register the command** inside `src/main.py`:
```python
from src.commands.keywords import KeywordsCommand
cli.register_command(KeywordsCommand())
```
4. Optionally, add a model name in `Config.DEFAULT_MODELS`.
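A minimal sketch of step 1 (a hypothetical `KeywordExtractor`; the exact `template.py` interface may differ, and the checkpoint name is only an example):
```python
# src/pipelines/keywords.py - hypothetical sketch modeled on the existing pipelines
from transformers import pipeline


class KeywordExtractor:
    def __init__(self, model_name: str = "ml6team/keyphrase-extraction-kbir-inspec"):
        # Keyphrase extraction can be framed as token classification
        self.pipe = pipeline(
            "token-classification", model=model_name, aggregation_strategy="simple"
        )

    def extract(self, text: str) -> dict:
        try:
            preds = self.pipe(text)
            keywords = [{"keyword": p["word"], "score": float(p["score"])} for p in preds]
            return {"original_text": text, "keywords": keywords}
        except Exception as e:
            # Mirror the {"error": ...} convention of the other pipelines
            return {"error": str(e)}
```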
---
## 🧪 Testing
You can use `pytest` for lightweight validation:
```bash
pip install pytest
pytest -q
```
Recommended structure:
```
tests/
├── test_sentiment.py
├── test_textgen.py
└── ...
```
Both CLI and API will automatically share this logic.
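A minimal test sketch (it assumes `SentimentAnalyzer.analyze()` returns a dict with `sentiment` and `confidence` keys, as the API layer in `src/api/app.py` suggests):
```python
# tests/test_sentiment.py
from src.pipelines.sentiment import SentimentAnalyzer


def test_positive_sentiment():
    result = SentimentAnalyzer().analyze("I absolutely love this project!")
    assert "error" not in result
    assert result["sentiment"] in {"POSITIVE", "NEGATIVE"}
    assert 0.0 <= result["confidence"] <= 1.0
```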
---
## 🧰 Troubleshooting
| Issue | Cause / Solution |
| ---------------------------- | -------------------------------------------- |
| **`transformers` not found** | Check virtual environment activation. |
| **Torch fails to install** | Install CPU-only version from PyTorch index. |
| **Models download slowly** | Hugging Face caches them after first run. |
| **Unicode / accents broken** | Ensure terminal encoding is UTF-8.           |
| Issue | Solution |
| ------------------------ | ----------------------- |
| `transformers` not found | Activate your venv. |
| Torch install fails | Use CPU-only wheel. |
| Models download slowly | Cached after first use. |
| Encoding issues | Ensure UTF-8 terminal. |
### API Issues
| Issue | Solution |
| -------------------- | --------------------------------------- |
| `FastAPI` missing | `pip install fastapi uvicorn[standard]` |
| Port in use | Change with `--port 8001` |
| CORS error | Edit `allow_origins` in `api/config.py` |
| Validation error 422 | Check request body |
| 500 error | Verify model loading |
---
## 🧭 Development Guidelines
- Keep **Command** classes lightweight — no ML logic inside them.
- Reuse the **Pipeline Template** for new experiments.
- Format outputs consistently via the `DisplayFormatter`.
- Document all new models or commands in `README.md` and `settings.py`.
- Keep command classes lightweight (no ML inside)
- Use the pipeline template for new tasks
- Format all outputs via `DisplayFormatter`
- Document new commands and models
---
## 🧱 Roadmap
- [ ] Add non-interactive CLI flags (`--text`, `--task`)
- [ ] Add multilingual model options
- [ ] Add automatic test coverage
- [ ] Add logging and profiling utilities
- [ ] Add export to JSON/CSV results
- [ ] Non-interactive CLI flags (`--text`, `--task`)
- [ ] Multilingual models
- [ ] Test coverage
- [ ] Logging & profiling
- [ ] Export to JSON/CSV
---
## 📜 License
This project is licensed under the [MIT License](./LICENSE) — feel free to use it, modify it, and share it!
Licensed under the [MIT License](./LICENSE).
You are free to use, modify, and distribute this project.
---
**End of Documentation**
_The AI Lab Transformers CLI Playground: built for learning, experimenting, and sharing NLP excellence._

68
demo_api.sh Executable file
View File

@@ -0,0 +1,68 @@
#!/bin/bash
echo "🚀 AI Lab API Demo"
echo "=================="
echo ""
# Configuration
API_BASE="http://127.0.0.1:8000"
echo "📊 Health Check:"
curl -s -X GET "$API_BASE/health" | python3 -m json.tool
echo ""
echo "😊 Sentiment Analysis:"
curl -s -X POST "$API_BASE/sentiment" \
-H "Content-Type: application/json" \
-d '{"text": "This API is absolutely amazing! I love it!"}' | python3 -m json.tool
echo ""
echo "🔍 Named Entity Recognition:"
curl -s -X POST "$API_BASE/ner" \
-H "Content-Type: application/json" \
-d '{"text": "Apple Inc. was founded by Steve Jobs in Cupertino, California."}' | python3 -m json.tool
echo ""
echo "❓ Question Answering:"
curl -s -X POST "$API_BASE/qa" \
-H "Content-Type: application/json" \
-d '{
"question": "Who founded Apple?",
"context": "Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976."
}' | python3 -m json.tool
echo ""
echo "<22> Fill Mask:"
curl -s -X POST "$API_BASE/fillmask" \
-H "Content-Type: application/json" \
-d '{"text": "The capital of France is [MASK]."}' | python3 -m json.tool
echo ""
echo "🛡️ Content Moderation:"
curl -s -X POST "$API_BASE/moderation" \
-H "Content-Type: application/json" \
-d '{"text": "This is a completely normal and safe text."}' | python3 -m json.tool
echo ""
echo "✍️ Text Generation:"
curl -s -X POST "$API_BASE/textgen" \
-H "Content-Type: application/json" \
-d '{"text": "Once upon a time, in a distant galaxy"}' | python3 -m json.tool
echo ""
echo "📦 Batch Sentiment Analysis:"
curl -s -X POST "$API_BASE/sentiment/batch" \
-H "Content-Type: application/json" \
-d '{
"texts": [
"I love this!",
"This is terrible.",
"Neutral statement here."
]
}' | python3 -m json.tool
echo ""
echo "<22>🏥 Final Health Check:"
curl -s -X GET "$API_BASE/health" | python3 -m json.tool
echo ""
echo "✅ Demo completed! Check the API documentation at: $API_BASE/docs"

1069
poetry.lock generated

File diff suppressed because it is too large Load Diff

613
AI_Lab_API.postman_collection.json Normal file
View File

@@ -0,0 +1,613 @@
{
"info": {
"_postman_id": "ai-lab-api-collection",
"name": "AI Lab API - Complete Collection",
"description": "Complete Postman collection for AI Lab API with all endpoints for NLP pipelines using transformers",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
"_exporter_id": "ai-lab"
},
"variable": [
{
"key": "base_url",
"value": "http://localhost:8000",
"type": "string",
"description": "Base URL for AI Lab API"
}
],
"item": [
{
"name": "🏠 Core Endpoints",
"item": [
{
"name": "Root - API Information",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "{{base_url}}/",
"host": ["{{base_url}}"],
"path": [""]
},
"description": "Get API information and available endpoints"
},
"response": []
},
{
"name": "Health Check",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "{{base_url}}/health",
"host": ["{{base_url}}"],
"path": ["health"]
},
"description": "Check API health status and loaded pipelines"
},
"response": []
}
],
"description": "Core API endpoints for health check and information"
},
{
"name": "💭 Sentiment Analysis",
"item": [
{
"name": "Analyze Sentiment - Positive",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"I absolutely love this project! It's amazing and well-designed.\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Analyze sentiment of positive text"
},
"response": []
},
{
"name": "Analyze Sentiment - Negative",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"This is terrible and I hate it completely.\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Analyze sentiment of negative text"
},
"response": []
},
{
"name": "Analyze Sentiment - Custom Model",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"This product is okay, nothing special.\",\n \"model_name\": \"cardiffnlp/twitter-roberta-base-sentiment-latest\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Analyze sentiment using custom model"
},
"response": []
},
{
"name": "Batch Sentiment Analysis",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"texts\": [\n \"I love this!\",\n \"This is terrible.\",\n \"It's okay, nothing special.\",\n \"Amazing product, highly recommended!\",\n \"Worst experience ever.\"\n ]\n}"
},
"url": {
"raw": "{{base_url}}/sentiment/batch",
"host": ["{{base_url}}"],
"path": ["sentiment", "batch"]
},
"description": "Analyze sentiment for multiple texts"
},
"response": []
}
],
"description": "Sentiment analysis endpoints for analyzing emotional tone"
},
{
"name": "🏷️ Named Entity Recognition",
"item": [
{
"name": "Extract Entities - People & Organizations",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"Elon Musk is the CEO of Tesla and SpaceX. He was born in South Africa and now lives in California.\"\n}"
},
"url": {
"raw": "{{base_url}}/ner",
"host": ["{{base_url}}"],
"path": ["ner"]
},
"description": "Extract named entities from text"
},
"response": []
},
{
"name": "Extract Entities - Geographic",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"The meeting will be held in Paris, France on Monday. We'll then travel to London, United Kingdom.\"\n}"
},
"url": {
"raw": "{{base_url}}/ner",
"host": ["{{base_url}}"],
"path": ["ner"]
},
"description": "Extract geographic entities"
},
"response": []
},
{
"name": "Batch NER Processing",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"texts\": [\n \"Apple Inc. is headquartered in Cupertino, California.\",\n \"Microsoft was founded by Bill Gates and Paul Allen.\",\n \"The conference will be in Tokyo, Japan next month.\"\n ]\n}"
},
"url": {
"raw": "{{base_url}}/ner/batch",
"host": ["{{base_url}}"],
"path": ["ner", "batch"]
},
"description": "Extract entities from multiple texts"
},
"response": []
}
],
"description": "Named Entity Recognition endpoints for extracting people, places, organizations"
},
{
"name": "❓ Question Answering",
"item": [
{
"name": "Simple Q&A",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"question\": \"What is the capital of France?\",\n \"context\": \"France is a country in Europe. Paris is the capital and largest city of France. The city is known for the Eiffel Tower and the Louvre Museum.\"\n}"
},
"url": {
"raw": "{{base_url}}/qa",
"host": ["{{base_url}}"],
"path": ["qa"]
},
"description": "Answer questions based on context"
},
"response": []
},
{
"name": "Technical Q&A",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"question\": \"What programming language is mentioned?\",\n \"context\": \"FastAPI is a modern, fast web framework for building APIs with Python 3.7+. It provides automatic interactive API documentation and is built on top of Starlette and Pydantic.\"\n}"
},
"url": {
"raw": "{{base_url}}/qa",
"host": ["{{base_url}}"],
"path": ["qa"]
},
"description": "Answer technical questions"
},
"response": []
}
],
"description": "Question Answering endpoints for extracting answers from context"
},
{
"name": "🎭 Fill Mask",
"item": [
{
"name": "Fill Simple Mask",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"The capital of France is [MASK].\"\n}"
},
"url": {
"raw": "{{base_url}}/fillmask",
"host": ["{{base_url}}"],
"path": ["fillmask"]
},
"description": "Predict masked words in sentences"
},
"response": []
},
{
"name": "Fill Technical Mask",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"Python is a popular [MASK] language for machine learning.\"\n}"
},
"url": {
"raw": "{{base_url}}/fillmask",
"host": ["{{base_url}}"],
"path": ["fillmask"]
},
"description": "Fill technical context masks"
},
"response": []
},
{
"name": "Batch Fill Mask",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"texts\": [\n \"The weather today is [MASK].\",\n \"I like to eat [MASK] for breakfast.\",\n \"The best programming language is [MASK].\"\n ]\n}"
},
"url": {
"raw": "{{base_url}}/fillmask/batch",
"host": ["{{base_url}}"],
"path": ["fillmask", "batch"]
},
"description": "Fill masks in multiple texts"
},
"response": []
}
],
"description": "Fill Mask endpoints for predicting masked words"
},
{
"name": "🛡️ Content Moderation",
"item": [
{
"name": "Check Safe Content",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"This is a wonderful day and I'm feeling great!\"\n}"
},
"url": {
"raw": "{{base_url}}/moderation",
"host": ["{{base_url}}"],
"path": ["moderation"]
},
"description": "Check safe, non-toxic content"
},
"response": []
},
{
"name": "Check Potentially Toxic Content",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"I hate everything and everyone around me!\"\n}"
},
"url": {
"raw": "{{base_url}}/moderation",
"host": ["{{base_url}}"],
"path": ["moderation"]
},
"description": "Check potentially toxic content"
},
"response": []
},
{
"name": "Batch Content Moderation",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"texts\": [\n \"Have a great day!\",\n \"I'm so angry right now!\",\n \"Thank you for your help.\",\n \"This is completely stupid!\"\n ]\n}"
},
"url": {
"raw": "{{base_url}}/moderation/batch",
"host": ["{{base_url}}"],
"path": ["moderation", "batch"]
},
"description": "Moderate multiple texts for toxicity"
},
"response": []
}
],
"description": "Content Moderation endpoints for detecting toxic or harmful content"
},
{
"name": "✍️ Text Generation",
"item": [
{
"name": "Generate Creative Text",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"Once upon a time in a magical forest\"\n}"
},
"url": {
"raw": "{{base_url}}/textgen",
"host": ["{{base_url}}"],
"path": ["textgen"]
},
"description": "Generate creative text from prompt"
},
"response": []
},
{
"name": "Generate Technical Text",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"FastAPI is a modern Python web framework that\"\n}"
},
"url": {
"raw": "{{base_url}}/textgen",
"host": ["{{base_url}}"],
"path": ["textgen"]
},
"description": "Generate technical documentation text"
},
"response": []
},
{
"name": "Batch Text Generation",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"texts\": [\n \"In the future, AI will\",\n \"The best way to learn programming is\",\n \"Climate change is\"\n ]\n}"
},
"url": {
"raw": "{{base_url}}/textgen/batch",
"host": ["{{base_url}}"],
"path": ["textgen", "batch"]
},
"description": "Generate text from multiple prompts"
},
"response": []
}
],
"description": "Text Generation endpoints for creating text from prompts"
},
{
"name": "🧪 Testing & Examples",
"item": [
{
"name": "Complete Pipeline Test",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"AI Lab is an amazing project for learning NLP!\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Test with project-related text"
},
"response": []
},
{
"name": "Error Handling Test - Empty Text",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Test error handling with empty text"
},
"response": []
},
{
"name": "Error Handling Test - Invalid Model",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"text\": \"Test text\",\n \"model_name\": \"non-existent-model\"\n}"
},
"url": {
"raw": "{{base_url}}/sentiment",
"host": ["{{base_url}}"],
"path": ["sentiment"]
},
"description": "Test error handling with invalid model"
},
"response": []
}
],
"description": "Testing endpoints and error handling examples"
}
],
"event": [
{
"listen": "prerequest",
"script": {
"type": "text/javascript",
"exec": [
"// Pre-request script for all requests",
"console.log('Making request to: ' + pm.request.url);",
"",
"// Add timestamp to request",
"pm.globals.set('request_timestamp', new Date().toISOString());"
]
}
},
{
"listen": "test",
"script": {
"type": "text/javascript",
"exec": [
"// Common tests for all requests",
"pm.test('Response time is less than 30 seconds', function () {",
" pm.expect(pm.response.responseTime).to.be.below(30000);",
"});",
"",
"pm.test('Response has Content-Type header', function () {",
" pm.expect(pm.response.headers.get('Content-Type')).to.include('application/json');",
"});",
"",
"// Log response for debugging",
"console.log('Response status:', pm.response.status);",
"console.log('Response time:', pm.response.responseTime + 'ms');"
]
}
}
]
}

63
AI_Lab_API.postman_environment.json Normal file
View File

@@ -0,0 +1,63 @@
{
"id": "ai-lab-api-environment",
"name": "AI Lab API Environment",
"values": [
{
"key": "base_url",
"value": "http://localhost:8000",
"type": "default",
"description": "Base URL for AI Lab API (default: localhost)"
},
{
"key": "api_version",
"value": "1.0.0",
"type": "default",
"description": "API version"
},
{
"key": "timeout",
"value": "30000",
"type": "default",
"description": "Request timeout in milliseconds"
},
{
"key": "default_sentiment_model",
"value": "distilbert-base-uncased-finetuned-sst-2-english",
"type": "default",
"description": "Default sentiment analysis model"
},
{
"key": "default_ner_model",
"value": "dslim/bert-base-NER",
"type": "default",
"description": "Default NER model"
},
{
"key": "default_qa_model",
"value": "distilbert-base-cased-distilled-squad",
"type": "default",
"description": "Default Q&A model"
},
{
"key": "default_fillmask_model",
"value": "bert-base-uncased",
"type": "default",
"description": "Default fill mask model"
},
{
"key": "default_textgen_model",
"value": "gpt2",
"type": "default",
"description": "Default text generation model"
},
{
"key": "default_moderation_model",
"value": "unitary/toxic-bert",
"type": "default",
"description": "Default content moderation model"
}
],
"_postman_variable_scope": "environment",
"_postman_exported_at": "2024-10-12T10:00:00.000Z",
"_postman_exported_using": "Postman/10.18.0"
}

pyproject.toml
View File

@@ -18,6 +18,9 @@ transformers = "^4.30.0"
tokenizers = "^0.13.0"
numpy = "^1.24.0"
accelerate = "^0.20.0"
fastapi = "^0.104.0"
uvicorn = { extras = ["standard"], version = "^0.24.0" }
pydantic = "^2.5.0"
[tool.poetry.scripts]
ai-lab = "src.main:main"

7
src/api/__init__.py Normal file
View File

@@ -0,0 +1,7 @@
"""
API module for AI Lab
"""
from .models import *
from .app import app
__all__ = ["app"]

494
src/api/app.py Normal file
View File

@@ -0,0 +1,494 @@
"""
FastAPI application for AI Lab
"""
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager
from typing import Dict, Any
import logging
from .models import (
TextRequest, TextListRequest, QARequest, FillMaskRequest,
SentimentResponse, NERResponse, QAResponse, FillMaskResponse,
ModerationResponse, TextGenResponse, BatchResponse
)
# Global pipeline instances
pipelines: Dict[str, Any] = {}
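# Keys ("sentiment", "ner", "qa", ...) map to loaded pipeline wrappers; endpoints
# check membership here and return 503 while a pipeline is missing or not yet loaded.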
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Manage application lifespan - load models on startup"""
global pipelines
# Load all pipelines on startup
try:
logging.info("Loading AI pipelines...")
# Import here to avoid circular imports
from src.pipelines.sentiment import SentimentAnalyzer
from src.pipelines.ner import NamedEntityRecognizer
from src.pipelines.qa import QuestionAnsweringSystem
from src.pipelines.fillmask import FillMaskAnalyzer
from src.pipelines.moderation import ContentModerator
from src.pipelines.textgen import TextGenerator
pipelines["sentiment"] = SentimentAnalyzer()
pipelines["ner"] = NamedEntityRecognizer()
pipelines["qa"] = QuestionAnsweringSystem()
pipelines["fillmask"] = FillMaskAnalyzer()
pipelines["moderation"] = ContentModerator()
pipelines["textgen"] = TextGenerator()
logging.info("All pipelines loaded successfully!")
except Exception as e:
logging.error(f"Error loading pipelines: {e}")
# Don't raise, just log - allows API to start without all pipelines
yield
# Cleanup on shutdown
pipelines.clear()
logging.info("Pipelines cleaned up")
# Create FastAPI app
app = FastAPI(
title="AI Lab API",
description="API for various AI/ML pipelines using transformers",
version="1.0.0",
lifespan=lifespan,
swagger_ui_parameters={
"syntaxHighlight.theme": "obsidian",
"tryItOutEnabled": True,
"requestSnippetsEnabled": True,
"persistAuthorization": True,
"displayRequestDuration": True,
"defaultModelRendering": "model"
}
)
# Add CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Configure appropriately for production
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/")
async def root():
"""Root endpoint"""
return {
"message": "Welcome to AI Lab API",
"version": "1.0.0",
"available_endpoints": [
"/sentiment",
"/ner",
"/qa",
"/fillmask",
"/moderation",
"/textgen",
"/sentiment/batch",
"/ner/batch",
"/fillmask/batch",
"/moderation/batch",
"/textgen/batch",
"/health",
"/docs"
]
}
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {
"status": "healthy",
"pipelines_loaded": len(pipelines),
"available_pipelines": list(pipelines.keys())
}
@app.post("/sentiment", response_model=SentimentResponse)
async def analyze_sentiment(request: TextRequest):
"""Analyze sentiment of a text"""
try:
if "sentiment" not in pipelines:
raise HTTPException(status_code=503, detail="Sentiment pipeline not available")
# Use custom model if provided
if request.model_name:
from src.pipelines.sentiment import SentimentAnalyzer
analyzer = SentimentAnalyzer(request.model_name)
result = analyzer.analyze(request.text)
else:
result = pipelines["sentiment"].analyze(request.text)
if "error" in result:
return SentimentResponse(success=False, text=request.text, message=result["error"])
return SentimentResponse(
success=True,
text=result["text"],
sentiment=result["sentiment"],
confidence=result["confidence"]
)
    except HTTPException:
        # Propagate explicit HTTP errors (e.g. 503) instead of masking them as 500
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/ner", response_model=NERResponse)
async def extract_entities(request: TextRequest):
"""Extract named entities from text"""
try:
logging.info(f"NER request for text: {request.text[:50]}...")
if "ner" not in pipelines:
raise HTTPException(status_code=503, detail="NER pipeline not available")
try:
if request.model_name:
from src.pipelines.ner import NamedEntityRecognizer
ner = NamedEntityRecognizer(request.model_name)
result = ner.recognize(request.text)
else:
result = pipelines["ner"].recognize(request.text)
except Exception as pipeline_error:
logging.error(f"Pipeline error: {str(pipeline_error)}")
return NERResponse(success=False, text=request.text, message=f"Pipeline error: {str(pipeline_error)}")
logging.info(f"NER result keys: {list(result.keys())}")
if "error" in result:
logging.error(f"NER error: {result['error']}")
return NERResponse(success=False, text=request.text, message=result["error"])
# Validate result structure
if "original_text" not in result:
logging.error(f"Missing 'original_text' in result: {result}")
return NERResponse(success=False, text=request.text, message="Invalid NER result format")
if "entities" not in result:
logging.error(f"Missing 'entities' in result: {result}")
return NERResponse(success=False, text=request.text, message="Invalid NER result format")
return NERResponse(
success=True,
text=result["original_text"],
entities=result["entities"]
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/qa", response_model=QAResponse)
async def answer_question(request: QARequest):
"""Answer a question based on context"""
try:
if "qa" not in pipelines:
raise HTTPException(status_code=503, detail="QA pipeline not available")
if request.model_name:
from src.pipelines.qa import QuestionAnsweringSystem
qa = QuestionAnsweringSystem(request.model_name)
result = qa.answer(request.question, request.context)
else:
result = pipelines["qa"].answer(request.question, request.context)
if "error" in result:
return QAResponse(
success=False,
question=request.question,
context=request.context,
message=result["error"]
)
return QAResponse(
success=True,
question=result["question"],
context=result["context"],
answer=result["answer"],
confidence=result["confidence"]
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/fillmask", response_model=FillMaskResponse)
async def fill_mask(request: FillMaskRequest):
"""Fill masked words in text"""
try:
if "fillmask" not in pipelines:
raise HTTPException(status_code=503, detail="Fill-mask pipeline not available")
if request.model_name:
from src.pipelines.fillmask import FillMaskAnalyzer
fillmask = FillMaskAnalyzer(request.model_name)
result = fillmask.predict(request.text)
else:
result = pipelines["fillmask"].predict(request.text)
if "error" in result:
return FillMaskResponse(success=False, text=request.text, message=result["error"])
return FillMaskResponse(
success=True,
text=result["original_text"],
predictions=result["predictions"]
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/moderation", response_model=ModerationResponse)
async def moderate_content(request: TextRequest):
"""Moderate content for inappropriate material"""
try:
if "moderation" not in pipelines:
raise HTTPException(status_code=503, detail="Moderation pipeline not available")
if request.model_name:
from src.pipelines.moderation import ContentModerator
moderation = ContentModerator(request.model_name)
result = moderation.moderate(request.text)
else:
result = pipelines["moderation"].moderate(request.text)
if "error" in result:
return ModerationResponse(success=False, text=request.text, message=result["error"])
# Map the result fields correctly
flagged = result.get("is_modified", False) or result.get("toxic_score", 0.0) > 0.5
categories = {
"toxic_score": result.get("toxic_score", 0.0),
"is_modified": result.get("is_modified", False),
"restored_text": result.get("moderated_text", request.text),
"words_replaced": result.get("words_replaced", 0)
}
return ModerationResponse(
success=True,
text=result["original_text"],
flagged=flagged,
categories=categories
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/textgen", response_model=TextGenResponse)
async def generate_text(request: TextRequest):
"""Generate text from a prompt"""
try:
if "textgen" not in pipelines:
raise HTTPException(status_code=503, detail="Text generation pipeline not available")
logging.info(f"Generating text for prompt: {request.text[:50]}...")
if request.model_name:
from src.pipelines.textgen import TextGenerator
textgen = TextGenerator(request.model_name)
result = textgen.generate(request.text)
else:
result = pipelines["textgen"].generate(request.text)
logging.info(f"Generation result keys: {list(result.keys())}")
if "error" in result:
logging.error(f"Generation error: {result['error']}")
return TextGenResponse(success=False, prompt=request.text, message=result["error"])
# Extract the generated text from the first generation
generated_text = ""
if "generations" in result and len(result["generations"]) > 0:
# Get the continuation (text after the prompt) from the first generation
generated_text = result["generations"][0].get("continuation", "")
logging.info(f"Extracted generated text: {generated_text[:100]}...")
else:
logging.warning("No generations found in result")
return TextGenResponse(
success=True,
prompt=result["prompt"],
generated_text=result["prompt"] + " " + generated_text
)
    except HTTPException:
        raise
    except Exception as e:
        logging.error(f"TextGen endpoint error: {str(e)}", exc_info=True)
        raise HTTPException(status_code=500, detail=str(e))
# Batch processing endpoints
@app.post("/sentiment/batch", response_model=BatchResponse)
async def analyze_sentiment_batch(request: TextListRequest):
"""Analyze sentiment for multiple texts"""
try:
if "sentiment" not in pipelines:
raise HTTPException(status_code=503, detail="Sentiment pipeline not available")
analyzer = pipelines["sentiment"]
if request.model_name:
from src.pipelines.sentiment import SentimentAnalyzer
analyzer = SentimentAnalyzer(request.model_name)
results = []
failed_count = 0
for text in request.texts:
try:
result = analyzer.analyze(text)
if "error" in result:
failed_count += 1
results.append(result)
except Exception as e:
failed_count += 1
results.append({"text": text, "error": str(e)})
return BatchResponse(
success=True,
results=results,
processed_count=len(request.texts),
failed_count=failed_count
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/ner/batch", response_model=BatchResponse)
async def extract_entities_batch(request: TextListRequest):
"""Extract entities from multiple texts"""
try:
if "ner" not in pipelines:
raise HTTPException(status_code=503, detail="NER pipeline not available")
ner = pipelines["ner"]
if request.model_name:
from src.pipelines.ner import NamedEntityRecognizer
ner = NamedEntityRecognizer(request.model_name)
results = []
failed_count = 0
for text in request.texts:
try:
result = ner.recognize(text)
if "error" in result:
failed_count += 1
results.append(result)
except Exception as e:
failed_count += 1
results.append({"text": text, "error": str(e)})
return BatchResponse(
success=True,
results=results,
processed_count=len(request.texts),
failed_count=failed_count
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/fillmask/batch", response_model=BatchResponse)
async def fill_mask_batch(request: TextListRequest):
"""Fill masked words in multiple texts"""
try:
if "fillmask" not in pipelines:
raise HTTPException(status_code=503, detail="Fill-mask pipeline not available")
fillmask = pipelines["fillmask"]
if request.model_name:
from src.pipelines.fillmask import FillMaskAnalyzer
fillmask = FillMaskAnalyzer(request.model_name)
results = []
failed_count = 0
for text in request.texts:
try:
result = fillmask.predict(text)
if "error" in result:
failed_count += 1
results.append(result)
except Exception as e:
failed_count += 1
results.append({"text": text, "error": str(e)})
return BatchResponse(
success=True,
results=results,
processed_count=len(request.texts),
failed_count=failed_count
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/moderation/batch", response_model=BatchResponse)
async def moderate_content_batch(request: TextListRequest):
"""Moderate multiple texts for inappropriate content"""
try:
if "moderation" not in pipelines:
raise HTTPException(status_code=503, detail="Moderation pipeline not available")
moderation = pipelines["moderation"]
if request.model_name:
from src.pipelines.moderation import ContentModerator
moderation = ContentModerator(request.model_name)
results = []
failed_count = 0
for text in request.texts:
try:
result = moderation.moderate(text)
if "error" in result:
failed_count += 1
results.append(result)
except Exception as e:
failed_count += 1
results.append({"text": text, "error": str(e)})
return BatchResponse(
success=True,
results=results,
processed_count=len(request.texts),
failed_count=failed_count
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/textgen/batch", response_model=BatchResponse)
async def generate_text_batch(request: TextListRequest):
"""Generate text from multiple prompts"""
try:
if "textgen" not in pipelines:
raise HTTPException(status_code=503, detail="Text generation pipeline not available")
textgen = pipelines["textgen"]
if request.model_name:
from src.pipelines.textgen import TextGenerator
textgen = TextGenerator(request.model_name)
results = []
failed_count = 0
for text in request.texts:
try:
result = textgen.generate(text)
if "error" in result:
failed_count += 1
results.append(result)
except Exception as e:
failed_count += 1
results.append({"text": text, "error": str(e)})
return BatchResponse(
success=True,
results=results,
processed_count=len(request.texts),
failed_count=failed_count
)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
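# Optional local entry point (a sketch; "src/main.py --mode api" is the documented way
# to serve the app). Lets "python -m src.api.app" run the API directly with uvicorn.
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="127.0.0.1", port=8000)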

46
src/api/config.py Normal file
View File

@@ -0,0 +1,46 @@
"""
API configuration settings
"""
from typing import Dict, Any
from src.config.settings import Config
class APIConfig:
"""Configuration for the FastAPI application"""
# Server settings
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 8000
# API settings
API_TITLE = "AI Lab API"
API_DESCRIPTION = "API for various AI/ML pipelines using transformers"
API_VERSION = "1.0.0"
# CORS settings
CORS_ORIGINS = ["*"] # Configure for production
CORS_METHODS = ["*"]
CORS_HEADERS = ["*"]
# Pipeline settings
MAX_TEXT_LENGTH = 10000
MAX_BATCH_SIZE = 100
@classmethod
def get_all_settings(cls) -> Dict[str, Any]:
"""Get all configuration settings"""
return {
"server": {
"default_host": cls.DEFAULT_HOST,
"default_port": cls.DEFAULT_PORT
},
"api": {
"title": cls.API_TITLE,
"description": cls.API_DESCRIPTION,
"version": cls.API_VERSION
},
"limits": {
"max_text_length": cls.MAX_TEXT_LENGTH,
"max_batch_size": cls.MAX_BATCH_SIZE
},
}
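# Example (sketch):
#   >>> APIConfig.get_all_settings()["limits"]["max_batch_size"]
#   100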

91
src/api/models.py Normal file
View File

@@ -0,0 +1,91 @@
"""
Pydantic models for API requests and responses
"""
from pydantic import BaseModel
from typing import List, Optional, Dict, Any
# Request models
class TextRequest(BaseModel):
"""Base request model for single text input"""
text: str
model_name: Optional[str] = None
class TextListRequest(BaseModel):
"""Request model for multiple texts"""
texts: List[str]
model_name: Optional[str] = None
class QARequest(BaseModel):
"""Request model for question answering"""
question: str
context: str
model_name: Optional[str] = None
class FillMaskRequest(BaseModel):
"""Request model for fill mask task"""
text: str
model_name: Optional[str] = None
# Response models
class BaseResponse(BaseModel):
"""Base response model"""
success: bool
message: Optional[str] = None
class SentimentResponse(BaseResponse):
"""Response model for sentiment analysis"""
text: str
sentiment: Optional[str] = None
confidence: Optional[float] = None
class NERResponse(BaseResponse):
"""Response model for Named Entity Recognition"""
text: str
entities: Optional[List[Dict[str, Any]]] = None
class QAResponse(BaseResponse):
"""Response model for Question Answering"""
question: str
context: str
answer: Optional[str] = None
confidence: Optional[float] = None
class FillMaskResponse(BaseResponse):
"""Response model for Fill Mask"""
text: str
predictions: Optional[List[Dict[str, Any]]] = None
class ModerationResponse(BaseResponse):
"""Response model for Content Moderation"""
text: str
flagged: Optional[bool] = None
categories: Optional[Dict[str, Any]] = None
class TextGenResponse(BaseResponse):
"""Response model for Text Generation"""
prompt: str
generated_text: Optional[str] = None
class BatchResponse(BaseResponse):
"""Response model for batch processing"""
results: List[Dict[str, Any]]
processed_count: int
failed_count: int
class ErrorResponse(BaseResponse):
"""Response model for errors"""
error: str
details: Optional[str] = None
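# Example (sketch): request models validate input and default optional fields.
#   >>> TextRequest(text="Hello").model_dump()
#   {'text': 'Hello', 'model_name': None}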

src/cli/display.py
View File

@@ -190,3 +190,78 @@ class DisplayFormatter:
output.append(f"{entity} ({count}x)")
return "\n".join(output)
@staticmethod
def format_qa_result(result: Dict[str, Any]) -> str:
"""Format Question Answering result for display"""
if "error" in result:
return f"{result['error']}"
output = []
output.append(f"❓ Question: {result['question']}")
# Confidence indicator
confidence = result['confidence']
confidence_emoji = "" if result['is_confident'] else "⚠️"
confidence_bar = "" * int(confidence * 10)
output.append(f"{confidence_emoji} Answer: {result['answer']}")
output.append(f"📊 Confidence: {result['confidence_level']} ({confidence:.1%}) {confidence_bar}")
if not result['is_confident']:
output.append("⚠️ Low confidence - answer might not be reliable")
output.append(f"\n📍 Position: characters {result['start_position']}-{result['end_position']}")
output.append(f"📄 Context with answer highlighted:")
output.append(f" {result['highlighted_context']}")
return "\n".join(output)
@staticmethod
def format_qa_context_analysis(analysis: Dict[str, Any]) -> str:
"""Format QA context analysis for display"""
if "error" in analysis:
return f"{analysis['error']}"
output = []
output.append("✅ Context set successfully!")
output.append(f"📊 Context Statistics:")
stats = analysis['context_stats']
output.append(f" • Words: {stats['word_count']}")
output.append(f" • Sentences: ~{stats['sentence_count']}")
output.append(f" • Characters: {stats['character_count']}")
if analysis['suggested_questions']:
output.append(f"\n💡 Suggested question types:")
for suggestion in analysis['suggested_questions']:
output.append(f"{suggestion}")
if analysis['tips']:
output.append(f"\n📝 Tips for good questions:")
for tip in analysis['tips']:
output.append(f"{tip}")
return "\n".join(output)
@staticmethod
def format_qa_multiple_result(result: Dict[str, Any]) -> str:
"""Format multiple QA results for display"""
if "error" in result:
return f"{result['error']}"
output = []
output.append(f"📊 Multiple Questions Analysis")
output.append("=" * 50)
output.append(f"Total Questions: {result['total_questions']}")
output.append(f"Successfully Processed: {result['processed_questions']}")
output.append(f"Confident Answers: {result['confident_answers']}")
output.append(f"Average Confidence: {result['average_confidence']:.1%}")
output.append(f"\n📋 Results:")
for qa_result in result['results']:
confidence_emoji = "" if qa_result['is_confident'] else "⚠️"
output.append(f"\n{qa_result['question_number']}. {qa_result['question']}")
output.append(f" {confidence_emoji} {qa_result['answer']} ({qa_result['confidence']:.1%})")
return "\n".join(output)

src/commands/__init__.py
View File

@@ -6,5 +6,6 @@ from .fillmask import FillMaskCommand
from .textgen import TextGenCommand
from .moderation import ModerationCommand
from .ner import NERCommand
from .qa import QACommand
__all__ = ['SentimentCommand', 'FillMaskCommand', 'TextGenCommand', 'ModerationCommand', 'NERCommand']
__all__ = ['SentimentCommand', 'FillMaskCommand', 'TextGenCommand', 'ModerationCommand', 'NERCommand', 'QACommand']

214
src/commands/qa.py Normal file
View File

@@ -0,0 +1,214 @@
from src.cli.base import CLICommand
from src.cli.display import DisplayFormatter
from src.pipelines.qa import QuestionAnsweringSystem
class QACommand(CLICommand):
"""Interactive Question Answering command"""
def __init__(self):
self.qa_system = None
self.current_context = None
self.session_questions = []
@property
def name(self) -> str:
return "qa"
@property
def description(self) -> str:
return "Question Answering - Ask questions about a given text"
def _initialize_qa_system(self):
"""Lazy initialization of the QA system"""
if self.qa_system is None:
print("🔄 Loading Question Answering model...")
self.qa_system = QuestionAnsweringSystem()
DisplayFormatter.show_success("QA model loaded!")
def _show_instructions(self):
"""Show usage instructions and examples"""
print("\n❓ Question Answering System")
print("Ask questions about a text context and get precise answers.")
print("\n📝 How it works:")
print(" 1. First, provide a context (text containing information)")
print(" 2. Then ask questions about that context")
print(" 3. The system extracts answers directly from the text")
print("\n💡 Example context:")
print(" 'Albert Einstein was born in 1879 in Germany. He developed the theory of relativity.'")
print("💡 Example questions:")
print(" - When was Einstein born?")
print(" - Where was Einstein born?")
print(" - What theory did Einstein develop?")
print("\n🎛️ Commands:")
print(" 'back' - Return to main menu")
print(" 'help' - Show these instructions")
print(" 'context' - Set new context")
print(" 'multi' - Ask multiple questions at once")
print(" 'session' - Review session history")
print(" 'settings' - Adjust confidence threshold")
print("-" * 70)
def _set_context(self):
"""Allow user to set or change the context"""
print("\n📄 Set Context")
print("Enter the text that will serve as context for your questions.")
print("You can enter multiple lines. Type 'done' when finished.")
print("-" * 50)
lines = []
while True:
line = input("📝 ").strip()
if line.lower() == 'done':
break
if line:
lines.append(line)
if not lines:
DisplayFormatter.show_warning("No context provided")
return False
self.current_context = " ".join(lines)
# Analyze context
analysis = self.qa_system.interactive_qa(self.current_context)
if "error" in analysis:
DisplayFormatter.show_error(analysis["error"])
return False
formatted_analysis = DisplayFormatter.format_qa_context_analysis(analysis)
print(formatted_analysis)
return True
def _ask_single_question(self):
"""Ask a single question about the current context"""
if not self.current_context:
DisplayFormatter.show_warning("Please set a context first using 'context' command")
return
question = input("\n❓ Your question: ").strip()
if not question:
DisplayFormatter.show_warning("Please enter a question")
return
DisplayFormatter.show_loading("Finding answer...")
result = self.qa_system.answer(question, self.current_context)
if "error" not in result:
self.session_questions.append(result)
formatted_result = DisplayFormatter.format_qa_result(result)
print(formatted_result)
def _multi_question_mode(self):
"""Allow asking multiple questions at once"""
if not self.current_context:
DisplayFormatter.show_warning("Please set a context first using 'context' command")
return
print("\n❓ Multiple Questions Mode")
print("Enter your questions one by one. Type 'done' when finished.")
print("-" * 50)
questions = []
while True:
question = input(f"Question #{len(questions)+1}: ").strip()
if question.lower() == 'done':
break
if question:
questions.append(question)
if not questions:
DisplayFormatter.show_warning("No questions provided")
return
DisplayFormatter.show_loading(f"Processing {len(questions)} questions...")
result = self.qa_system.answer_multiple(questions, self.current_context)
if "error" not in result:
self.session_questions.extend(result["results"])
formatted_result = DisplayFormatter.format_qa_multiple_result(result)
print(formatted_result)
def _show_session_history(self):
"""Show the history of questions asked in this session"""
if not self.session_questions:
DisplayFormatter.show_warning("No questions asked in this session yet")
return
print(f"\n📚 Session History ({len(self.session_questions)} questions)")
print("=" * 60)
for i, qa in enumerate(self.session_questions, 1):
confidence_emoji = "" if qa["is_confident"] else "⚠️"
print(f"\n{i}. {qa['question']}")
print(f" {confidence_emoji} {qa['answer']} (confidence: {qa['confidence']:.1%})")
def _adjust_settings(self):
"""Allow user to adjust QA settings"""
current_threshold = self.qa_system.confidence_threshold
print(f"\n⚙️ Current Settings:")
print(f"Confidence threshold: {current_threshold:.2f}")
print("\nLower threshold = more answers accepted (less strict)")
print("Higher threshold = fewer answers accepted (more strict)")
try:
new_threshold = input(f"Enter new threshold (0.0-1.0, current: {current_threshold}): ").strip()
if new_threshold:
threshold = float(new_threshold)
self.qa_system.set_confidence_threshold(threshold)
DisplayFormatter.show_success(f"Threshold set to {threshold:.2f}")
except ValueError:
DisplayFormatter.show_error("Invalid threshold value")
def run(self):
"""Run interactive Question Answering"""
self._initialize_qa_system()
self._show_instructions()
while True:
if self.current_context:
context_preview = (self.current_context[:50] + "...") if len(self.current_context) > 50 else self.current_context
prompt = f"\n💬 [{context_preview}] Ask a question: "
else:
prompt = "\n💬 Enter command or set context first: "
user_input = input(prompt).strip()
if user_input.lower() == 'back':
break
elif user_input.lower() == 'help':
self._show_instructions()
continue
elif user_input.lower() == 'context':
self._set_context()
continue
elif user_input.lower() == 'multi':
self._multi_question_mode()
continue
elif user_input.lower() == 'session':
self._show_session_history()
continue
elif user_input.lower() == 'settings':
self._adjust_settings()
continue
if not user_input:
DisplayFormatter.show_warning("Please enter a question or command")
continue
# If we have a context and user input is not a command, treat it as a question
if self.current_context:
DisplayFormatter.show_loading("Finding answer...")
result = self.qa_system.answer(user_input, self.current_context)
if "error" not in result:
self.session_questions.append(result)
formatted_result = DisplayFormatter.format_qa_result(result)
print(formatted_result)
else:
DisplayFormatter.show_warning("Please set a context first using 'context' command")

src/config.py
View File

@ -3,6 +3,7 @@ Global project configuration
"""
from pathlib import Path
from typing import Dict, Any
import torch
class Config:
@ -14,11 +15,12 @@ class Config:
# Default models
DEFAULT_MODELS = {
"sentiment": "cardiffnlp/twitter-roberta-base-sentiment-latest",
"fillmask": "distilbert-base-uncased",
"sentiment": "distilbert-base-uncased-finetuned-sst-2-english",
"fillmask": "bert-base-uncased",
"textgen": "gpt2",
"moderation": "unitary/toxic-bert",
"ner": "dbmdz/bert-large-cased-finetuned-conll03-english",
"ner": "dslim/bert-base-NER",
"moderation":"unitary/toxic-bert",
"qa": "distilbert-base-cased-distilled-squad",
}
# Interface
@ -28,6 +30,7 @@ class Config:
# Performance
MAX_BATCH_SIZE = 32
DEFAULT_MAX_LENGTH = 512
USE_GPU = torch.cuda.is_available() # Auto-detect GPU availability
@classmethod
def get_model(cls, pipeline_name: str) -> str:
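
Net effect of the config hunk: lighter default checkpoints, a new `qa` entry, and GPU auto-detection computed once at import time. A usage sketch, assuming `get_model` simply indexes `DEFAULT_MODELS` (its body is truncated above):

```python
from src.config import Config

print(Config.get_model("qa"))  # "distilbert-base-cased-distilled-squad"
print(Config.USE_GPU)          # True only when torch detects a CUDA device
```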

src/main.py
View File

@ -1,8 +1,9 @@
#!/usr/bin/env python3
"""
CLI entry point for AI Lab
Entry point for AI Lab - supports both CLI and API modes
"""
import sys
import argparse
from pathlib import Path
# Add parent directory to PYTHONPATH
@ -13,13 +14,14 @@ from src.commands import (
FillMaskCommand,
ModerationCommand,
NERCommand,
QACommand,
SentimentCommand,
TextGenCommand,
)
def main():
"""Main CLI function"""
def run_cli():
"""Run the CLI interface"""
try:
# Create CLI interface
cli = InteractiveCLI()
@ -31,6 +33,7 @@ def main():
TextGenCommand,
ModerationCommand,
NERCommand,
QACommand,
]
for command in commands_to_register:
cli.register_command(command())
@ -39,11 +42,100 @@ def main():
cli.run()
except KeyboardInterrupt:
print("\n👋 Stopping program")
print("\n👋 Stopping CLI")
except Exception as e:
print(f"Error: {e}")
print(f"CLI Error: {e}")
sys.exit(1)
def run_api(host: str = "127.0.0.1", port: int = 8000, reload: bool = False):
"""Run the FastAPI server"""
try:
import uvicorn
print(f"🚀 Starting AI Lab API server...")
print(f"📡 Server will be available at: http://{host}:{port}")
print(f"📚 API documentation: http://{host}:{port}/docs")
print(f"🔄 Reload mode: {'enabled' if reload else 'disabled'}")
# Load the main FastAPI application
try:
from src.api.app import app
app_module = "src.api.app:app"
print("📊 Loading AI Lab API with all pipelines")
except ImportError as e:
print(f"❌ Error: Could not load API application: {e}")
print("<EFBFBD> Make sure FastAPI dependencies are installed:")
print(" poetry add fastapi uvicorn[standard] pydantic")
sys.exit(1)
uvicorn.run(
app_module,
host=host,
port=port,
reload=reload,
log_level="info"
)
except ImportError:
print("❌ FastAPI dependencies not installed. Please run: pip install fastapi uvicorn")
print("Or with poetry: poetry add fastapi uvicorn[standard]")
sys.exit(1)
except Exception as e:
print(f"❌ API Error: {e}")
sys.exit(1)
def main():
"""Main entry point with argument parsing"""
parser = argparse.ArgumentParser(
description="AI Lab - CLI and API for AI/ML pipelines",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
%(prog)s # Run CLI interface (default)
%(prog)s --mode cli # Run CLI interface explicitly
%(prog)s --mode api # Run API server
%(prog)s --mode api --port 8080 # Run API server on port 8080
%(prog)s --mode api --reload # Run API server with auto-reload
"""
)
parser.add_argument(
"--mode",
choices=["cli", "api"],
default="cli",
help="Choose between CLI or API mode (default: cli)"
)
# API specific arguments
parser.add_argument(
"--host",
default="127.0.0.1",
help="API server host (default: 127.0.0.1)"
)
parser.add_argument(
"--port",
type=int,
default=8000,
help="API server port (default: 8000)"
)
parser.add_argument(
"--reload",
action="store_true",
help="Enable auto-reload for API development"
)
args = parser.parse_args()
if args.mode == "cli":
print("🖥️ Starting CLI mode...")
run_cli()
elif args.mode == "api":
print("🌐 Starting API mode...")
run_api(host=args.host, port=args.port, reload=args.reload)
if __name__ == "__main__":
main()

src/pipelines/__init__.py
View File

@ -6,6 +6,7 @@ from .fillmask import FillMaskAnalyzer
from .textgen import TextGenerator
from .moderation import ContentModerator
from .ner import NamedEntityRecognizer
from .qa import QuestionAnsweringSystem
from .template import TemplatePipeline
__all__ = ['SentimentAnalyzer', 'FillMaskAnalyzer', 'TextGenerator', 'ContentModerator', 'NamedEntityRecognizer', 'TemplatePipeline']
__all__ = ['SentimentAnalyzer', 'FillMaskAnalyzer', 'TextGenerator', 'ContentModerator', 'NamedEntityRecognizer', 'QuestionAnsweringSystem', 'TemplatePipeline']

src/pipelines/fillmask.py
View File

@ -46,7 +46,7 @@ class FillMaskAnalyzer:
mask_predictions = [
{
"token": pred["token_str"],
"score": round(pred["score"], 4),
"score": round(float(pred["score"]), 4),
"sequence": pred["sequence"]
}
for pred in mask_results
@ -66,7 +66,7 @@ class FillMaskAnalyzer:
predictions = [
{
"token": pred["token_str"],
"score": round(pred["score"], 4),
"score": round(float(pred["score"]), 4),
"sequence": pred["sequence"]
}
for pred in results

src/pipelines/moderation.py
View File

@ -70,7 +70,7 @@ class ContentModerator:
"original_text": text,
"moderated_text": text,
"is_modified": False,
"toxic_score": toxic_score,
"toxic_score": float(toxic_score),
"words_replaced": 0
}
@ -81,8 +81,8 @@ class ContentModerator:
"original_text": text,
"moderated_text": moderated_text,
"is_modified": True,
"toxic_score": toxic_score,
"words_replaced": words_replaced
"toxic_score": float(toxic_score),
"words_replaced": int(words_replaced)
}
except Exception as e:

src/pipelines/ner.py
View File

@ -58,9 +58,9 @@ class NamedEntityRecognizer:
processed_entity = {
"text": entity["word"],
"label": entity_type,
"confidence": round(entity["score"], 4),
"start": entity["start"],
"end": entity["end"],
"confidence": round(float(entity["score"]), 4),
"start": int(entity["start"]),
"end": int(entity["end"]),
"emoji": self.entity_colors.get(entity_type, "🏷️")
}
@ -70,7 +70,7 @@ class NamedEntityRecognizer:
if entity_type not in entity_stats:
entity_stats[entity_type] = {"count": 0, "entities": []}
entity_stats[entity_type]["count"] += 1
entity_stats[entity_type]["entities"].append(entity["word"])
entity_stats[entity_type]["entities"].append(str(entity["word"]))
# Create highlighted text
highlighted_text = self._highlight_entities(text, filtered_entities)
@ -81,7 +81,7 @@ class NamedEntityRecognizer:
"entities": filtered_entities,
"entity_stats": entity_stats,
"total_entities": len(filtered_entities),
"confidence_threshold": confidence_threshold
"confidence_threshold": float(confidence_threshold)
}
except Exception as e:
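
The `float()`/`int()`/`str()` casts in the three hunks above all fix the same bug: transformers pipelines return numpy scalars, which Python's stdlib JSON encoder (and hence the API layer) rejects. A minimal repro, assuming numpy is installed:

```python
import json
import numpy as np

score = np.float32(0.9871)
try:
    json.dumps({"score": score})
except TypeError as err:
    print(err)  # Object of type float32 is not JSON serializable

print(json.dumps({"score": float(score)}))  # fine once cast to a builtin float
```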

266
src/pipelines/qa.py Normal file
View File

@ -0,0 +1,266 @@
from transformers import pipeline
from typing import Dict, List, Optional, Tuple
from src.config import Config
import re
class QuestionAnsweringSystem:
"""Question Answering system using transformers"""
def __init__(self, model_name: Optional[str] = None):
"""
Initialize the question-answering pipeline
Args:
model_name: Name of the model to use (optional)
"""
self.model_name = model_name or Config.get_model("qa")
print(f"Loading Question Answering model: {self.model_name}")
self.pipeline = pipeline("question-answering", model=self.model_name)
print("QA model loaded successfully!")
# Default confidence threshold
self.confidence_threshold = 0.1
def answer(self, question: str, context: str, max_answer_len: int = 50) -> Dict:
"""
Answer a question based on the given context
Args:
question: Question to answer
context: Context text containing the answer
max_answer_len: Maximum length of the answer
Returns:
Dictionary with answer, score, and position information
"""
if not question.strip():
return {"error": "Empty question"}
if not context.strip():
return {"error": "Empty context"}
try:
result = self.pipeline(
question=question,
context=context,
max_answer_len=max_answer_len
)
confidence_level = self._get_confidence_level(result["score"])
highlighted_context = self._highlight_answer_in_context(
context, result["answer"], result["start"], result["end"]
)
return {
"question": question,
"context": context,
"answer": result["answer"],
"confidence": round(result["score"], 4),
"confidence_level": confidence_level,
"start_position": result["start"],
"end_position": result["end"],
"highlighted_context": highlighted_context,
"is_confident": result["score"] >= self.confidence_threshold
}
except Exception as e:
return {"error": f"QA processing error: {str(e)}"}
def _get_confidence_level(self, score: float) -> str:
"""
Convert numerical score to confidence level
Args:
score: Confidence score (0-1)
Returns:
Confidence level description
"""
if score >= 0.8:
return "Very High"
elif score >= 0.6:
return "High"
elif score >= 0.4:
return "Medium"
elif score >= 0.2:
return "Low"
else:
return "Very Low"
def _highlight_answer_in_context(self, context: str, answer: str, start: int, end: int) -> str:
"""
Highlight the answer within the context
Args:
context: Original context
answer: Extracted answer
start: Start position of answer
end: End position of answer
Returns:
Context with highlighted answer
"""
if start < 0 or end > len(context):
return context
before = context[:start]
highlighted_answer = f"**{answer}**"
after = context[end:]
return before + highlighted_answer + after
def answer_multiple(self, questions: List[str], context: str, max_answer_len: int = 50) -> Dict:
"""
Answer multiple questions for the same context
Args:
questions: List of questions to answer
context: Context text
max_answer_len: Maximum length of answers
Returns:
Dictionary with all answers and summary statistics
"""
if not questions:
return {"error": "No questions provided"}
if not context.strip():
return {"error": "Empty context"}
results = []
confident_answers = 0
total_confidence = 0
for i, question in enumerate(questions, 1):
result = self.answer(question, context, max_answer_len)
if "error" not in result:
results.append({
"question_number": i,
**result
})
if result["is_confident"]:
confident_answers += 1
total_confidence += result["confidence"]
if not results:
return {"error": "No valid questions processed"}
average_confidence = total_confidence / len(results) if results else 0
return {
"context": context,
"total_questions": len(questions),
"processed_questions": len(results),
"confident_answers": confident_answers,
"average_confidence": round(average_confidence, 4),
"confidence_threshold": self.confidence_threshold,
"results": results
}
def interactive_qa(self, context: str) -> Dict:
"""
Prepare context for interactive Q&A session
Args:
context: Context text for questions
Returns:
Context analysis and preparation info
"""
if not context.strip():
return {"error": "Empty context"}
# Basic context analysis
word_count = len(context.split())
sentence_count = len([s for s in context.split('.') if s.strip()])
char_count = len(context)
# Suggest question types based on content
suggested_questions = self._generate_question_suggestions(context)
return {
"context": context,
"context_stats": {
"word_count": word_count,
"sentence_count": sentence_count,
"character_count": char_count
},
"suggested_questions": suggested_questions,
"tips": [
"Ask specific questions about facts mentioned in the text",
"Use question words: Who, What, When, Where, Why, How",
"Keep questions clear and focused",
"The answer should be present in the provided context"
]
}
def _generate_question_suggestions(self, context: str) -> List[str]:
"""
Generate suggested questions based on context analysis
Args:
context: Context text
Returns:
List of suggested question templates
"""
suggestions = []
# Check for common patterns and suggest relevant questions
if re.search(r'\b\d{4}\b', context): # Years
suggestions.append("When did [event] happen?")
if re.search(r'\b[A-Z][a-z]+ [A-Z][a-z]+\b', context): # Names
suggestions.append("Who is [person name]?")
if re.search(r'\b(founded|created|established|built)\b', context, re.IGNORECASE):
suggestions.append("Who founded/created [organization]?")
if re.search(r'\b(located|situated|based)\b', context, re.IGNORECASE):
suggestions.append("Where is [place/organization] located?")
if re.search(r'\b(because|due to|reason)\b', context, re.IGNORECASE):
suggestions.append("Why did [event] happen?")
if re.search(r'\b(how|method|process)\b', context, re.IGNORECASE):
suggestions.append("How does [process] work?")
if not suggestions:
suggestions = [
"What is the main topic of this text?",
"Who are the key people mentioned?",
"What important events are described?"
]
return suggestions[:5] # Limit to 5 suggestions
def set_confidence_threshold(self, threshold: float):
"""
Set the confidence threshold for answers
Args:
threshold: Threshold between 0 and 1
"""
if 0 <= threshold <= 1:
self.confidence_threshold = threshold
else:
raise ValueError("Threshold must be between 0 and 1")
def answer_batch(self, qa_pairs: List[Tuple[str, str]], max_answer_len: int = 50) -> List[Dict]:
"""
Process multiple question-context pairs
Args:
qa_pairs: List of (question, context) tuples
max_answer_len: Maximum length of answers
Returns:
List of QA results
"""
return [
self.answer(question, context, max_answer_len)
for question, context in qa_pairs
]
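
A hedged usage sketch for the new pipeline (the model downloads on first use; printed values are illustrative):

```python
from src.pipelines.qa import QuestionAnsweringSystem

qa = QuestionAnsweringSystem()  # defaults to Config.get_model("qa")
context = ("Albert Einstein was born in 1879 in Germany. "
           "He developed the theory of relativity.")

single = qa.answer("When was Einstein born?", context)
print(single["answer"], single["confidence_level"])  # e.g. 1879 Very High

qa.set_confidence_threshold(0.3)  # stricter: fewer answers count as confident
multi = qa.answer_multiple(
    ["Where was Einstein born?", "What theory did he develop?"], context
)
print(multi["confident_answers"], "of", multi["processed_questions"])
```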

src/pipelines/textgen.py
View File

@ -15,17 +15,29 @@ class TextGenerator:
"""
self.model_name = model_name or Config.get_model("textgen")
print(f"Loading text generation model: {self.model_name}")
self.pipeline = pipeline("text-generation", model=self.model_name)
# Initialize pipeline with proper device configuration
self.pipeline = pipeline(
"text-generation",
model=self.model_name,
device=0 if Config.USE_GPU else -1,
torch_dtype="auto"
)
# Set pad token if not available
if self.pipeline.tokenizer.pad_token is None:
self.pipeline.tokenizer.pad_token = self.pipeline.tokenizer.eos_token
print("Model loaded successfully!")
def generate(self, prompt: str, max_length: int = 100, num_return_sequences: int = 1,
def generate(self, prompt: str, max_new_tokens: int = 100, num_return_sequences: int = 1,
temperature: float = 1.0, do_sample: bool = True) -> Dict:
"""
Generate text from a prompt
Args:
prompt: Input text prompt
max_length: Maximum length of generated text
max_new_tokens: Maximum number of new tokens to generate
num_return_sequences: Number of sequences to generate
temperature: Sampling temperature (higher = more random)
do_sample: Whether to use sampling
@ -39,11 +51,12 @@ class TextGenerator:
try:
results = self.pipeline(
prompt,
max_length=max_length,
max_new_tokens=max_new_tokens,
num_return_sequences=num_return_sequences,
temperature=temperature,
do_sample=do_sample,
pad_token_id=self.pipeline.tokenizer.eos_token_id
pad_token_id=self.pipeline.tokenizer.eos_token_id,
return_full_text=True
)
generations = [
@ -57,7 +70,7 @@ class TextGenerator:
return {
"prompt": prompt,
"parameters": {
"max_length": max_length,
"max_new_tokens": max_new_tokens,
"num_sequences": num_return_sequences,
"temperature": temperature,
"do_sample": do_sample

358
ui/index.html Normal file
View File

@ -0,0 +1,358 @@
<!DOCTYPE html>
<html lang="fr">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Lab - Test Interface</title>
<link rel="stylesheet" href="style.css">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
</head>
<body>
<div class="app">
<!-- Header -->
<header class="header">
<div class="container">
<div class="header-content">
<div class="logo">
<h1>🧠 AI Lab</h1>
<p>Test interface for the NLP API</p>
</div>
<div class="api-status" id="apiStatus">
<div class="status-indicator offline"></div>
<span>API Disconnected</span>
</div>
</div>
</div>
</header>
<!-- Main Content -->
<main class="main">
<div class="container">
<!-- API Configuration -->
<section class="config-section">
<div class="config-card">
<h3>⚙️ API Configuration</h3>
<div class="config-form">
<div class="input-group">
<label for="apiUrl">API URL</label>
<input type="url" id="apiUrl" value="http://localhost:8000"
placeholder="http://localhost:8000">
</div>
<button class="btn btn-secondary" onclick="checkApiStatus()">
<span class="btn-icon">🔄</span>
Test connection
</button>
</div>
</div>
</section>
<!-- Navigation -->
<nav class="nav-section">
<div class="nav-tabs">
<button class="nav-tab active" data-tab="sentiment">
<span class="tab-icon">💭</span>
Sentiment
</button>
<button class="nav-tab" data-tab="ner">
<span class="tab-icon">🏷️</span>
NER
</button>
<button class="nav-tab" data-tab="qa">
<span class="tab-icon"></span>
Q&A
</button>
<button class="nav-tab" data-tab="fillmask">
<span class="tab-icon">🎭</span>
Fill Mask
</button>
<button class="nav-tab" data-tab="moderation">
<span class="tab-icon">🛡️</span>
Moderation
</button>
<button class="nav-tab" data-tab="textgen">
<span class="tab-icon">✍️</span>
Generation
</button>
<button class="nav-tab" data-tab="batch">
<span class="tab-icon">📦</span>
Batch
</button>
</div>
</nav>
<!-- Tab Contents -->
<div class="tab-contents">
<!-- Sentiment Analysis -->
<div class="tab-content active" id="sentiment">
<div class="content-card">
<div class="card-header">
<h2>💭 Sentiment Analysis</h2>
<p>Analyze the emotion and tone of a text</p>
</div>
<form class="form" onsubmit="analyzeSentiment(event)">
<div class="form-group">
<label for="sentimentText">Text to analyze</label>
<textarea id="sentimentText" rows="4" placeholder="Enter your text here..."
required></textarea>
</div>
<div class="form-group">
<label for="sentimentModel">Model (optional)</label>
<select id="sentimentModel">
<option value="">Default model</option>
<option value="cardiffnlp/twitter-roberta-base-sentiment-latest">Twitter RoBERTa
</option>
<option value="distilbert-base-uncased-finetuned-sst-2-english">DistilBERT SST-2
</option>
</select>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('sentiment')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon">🔍</span>
Analyze
</button>
</div>
</form>
<div class="result-container" id="sentimentResult"></div>
</div>
</div>
<!-- Named Entity Recognition -->
<div class="tab-content" id="ner">
<div class="content-card">
<div class="card-header">
<h2>🏷️ Named Entity Recognition</h2>
<p>Identify people, places, and organizations</p>
</div>
<form class="form" onsubmit="analyzeNER(event)">
<div class="form-group">
<label for="nerText">Text to analyze</label>
<textarea id="nerText" rows="4" placeholder="Enter your text here..."
required></textarea>
</div>
<div class="form-group">
<label for="nerModel">Model (optional)</label>
<select id="nerModel">
<option value="">Default model</option>
<option value="dbmdz/bert-large-cased-finetuned-conll03-english">BERT Large
CoNLL03</option>
<option value="dslim/bert-base-NER">BERT Base NER</option>
</select>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('ner')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon">🔍</span>
Analyze
</button>
</div>
</form>
<div class="result-container" id="nerResult"></div>
</div>
</div>
<!-- Question Answering -->
<div class="tab-content" id="qa">
<div class="content-card">
<div class="card-header">
<h2>❓ Question Answering</h2>
<p>Get answers based on a context</p>
</div>
<form class="form" onsubmit="answerQuestion(event)">
<div class="form-row">
<div class="form-group">
<label for="qaQuestion">Question</label>
<input type="text" id="qaQuestion" placeholder="Ask your question..." required>
</div>
<div class="form-group">
<label for="qaModel">Model (optional)</label>
<select id="qaModel">
<option value="">Default model</option>
<option value="distilbert-base-cased-distilled-squad">DistilBERT SQuAD
</option>
</select>
</div>
</div>
<div class="form-group">
<label for="qaContext">Context</label>
<textarea id="qaContext" rows="4"
placeholder="Enter the context to answer the question..." required></textarea>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('qa')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon">🔍</span>
Answer
</button>
</div>
</form>
<div class="result-container" id="qaResult"></div>
</div>
</div>
<!-- Fill Mask -->
<div class="tab-content" id="fillmask">
<div class="content-card">
<div class="card-header">
<h2>🎭 Fill Mask</h2>
<p>Predict missing words with [MASK]</p>
</div>
<form class="form" onsubmit="fillMask(event)">
<div class="form-group">
<label for="fillmaskText">Text with [MASK]</label>
<textarea id="fillmaskText" rows="4" placeholder="Enter your text with [MASK]..."
required></textarea>
</div>
<div class="form-group">
<label for="fillmaskModel">Model (optional)</label>
<select id="fillmaskModel">
<option value="">Default model</option>
<option value="bert-base-uncased">BERT Base</option>
<option value="distilbert-base-uncased">DistilBERT</option>
</select>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('fillmask')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon">🔍</span>
Fill
</button>
</div>
</form>
<div class="result-container" id="fillmaskResult"></div>
</div>
</div>
<!-- Content Moderation -->
<div class="tab-content" id="moderation">
<div class="content-card">
<div class="card-header">
<h2>🛡️ Content Moderation</h2>
<p>Detect toxic or inappropriate content</p>
</div>
<form class="form" onsubmit="moderateContent(event)">
<div class="form-group">
<label for="moderationText">Text to moderate</label>
<textarea id="moderationText" rows="4" placeholder="Enter your text here..."
required></textarea>
</div>
<div class="form-group">
<label for="moderationModel">Model (optional)</label>
<select id="moderationModel">
<option value="">Default model</option>
<option value="unitary/toxic-bert">Toxic BERT</option>
</select>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('moderation')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon">🔍</span>
Moderate
</button>
</div>
</form>
<div class="result-container" id="moderationResult"></div>
</div>
</div>
<!-- Text Generation -->
<div class="tab-content" id="textgen">
<div class="content-card">
<div class="card-header">
<h2>✍️ Text Generation</h2>
<p>Generate creative text from a prompt</p>
</div>
<form class="form" onsubmit="generateText(event)">
<div class="form-group">
<label for="textgenPrompt">Prompt</label>
<textarea id="textgenPrompt" rows="4" placeholder="Enter your prompt..."
required></textarea>
</div>
<div class="form-group">
<label for="textgenModel">Model (optional)</label>
<select id="textgenModel">
<option value="">Default model</option>
<option value="gpt2">GPT-2</option>
<option value="gpt2-medium">GPT-2 Medium</option>
</select>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('textgen')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="submit" class="btn btn-primary">
<span class="btn-icon"></span>
Générer
</button>
</div>
</form>
<div class="result-container" id="textgenResult"></div>
</div>
</div>
<!-- Batch Processing -->
<div class="tab-content" id="batch">
<div class="content-card">
<div class="card-header">
<h2>📦 Batch Processing</h2>
<p>Analyze multiple texts simultaneously</p>
</div>
<div class="form">
<div class="form-group">
<label for="batchType">Analysis type</label>
<select id="batchType">
<option value="sentiment">Sentiment</option>
<option value="ner">NER</option>
<option value="fillmask">Fill Mask</option>
<option value="moderation">Moderation</option>
<option value="textgen">Generation</option>
</select>
</div>
<div class="form-group">
<label for="batchTexts">Texts (one per line)</label>
<textarea id="batchTexts" rows="6" placeholder="Enter your texts, one per line..."
required></textarea>
</div>
<div class="form-actions">
<button type="button" class="btn btn-secondary" onclick="loadExample('batch')">
<span class="btn-icon">💡</span>
Example
</button>
<button type="button" class="btn btn-primary" onclick="processBatch()">
<span class="btn-icon">🚀</span>
Process batch
</button>
</div>
</div>
<div class="result-container" id="batchResult"></div>
</div>
</div>
</div>
</div>
</main>
</div>
<script src="script.js"></script>
</body>
</html>
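
The suppressed `ui/script.js` drives these forms against the API; the same calls work outside the browser. A hedged sketch with `requests` — the `/qa` path and payload shape are assumptions mirroring the pipeline's `answer()` signature, since `src/api/app.py` is not part of this diff (check `http://localhost:8000/docs` for the real schema):

```python
import requests

BASE = "http://localhost:8000"  # matches the UI's default API URL

payload = {  # hypothetical payload; verify field names against /docs
    "question": "When was Einstein born?",
    "context": "Albert Einstein was born in 1879 in Germany.",
}
resp = requests.post(f"{BASE}/qa", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```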

1057
ui/script.js Normal file

File diff suppressed because it is too large

1436
ui/style.css Normal file

File diff suppressed because it is too large