---
title: Ece Intelligence Lab
emoji: 🚀
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.42.0
python_version: "3.10"
app_file: app.py
pinned: false
---

# 🧠 ECE Intelligence Lab: RAG AI Assistant

> RAG (Retrieval-Augmented Generation) chatbot for the **Intelligence Lab at ECE Paris**, France's first *Fab IA*. It answers questions about the lab's vision, AI models, infrastructure, partners, and teaching approach.

![Python](https://img.shields.io/badge/Python-3.10%2B-blue)
![LangChain](https://img.shields.io/badge/LangChain-0.2-green)
![HuggingFace](https://img.shields.io/badge/🤗_Hugging_Face-Mistral-yellow)
![FAISS](https://img.shields.io/badge/FAISS-Vector_Search-red)

🚀 **Live demo**: [huggingface.co/spaces/RayanMLK/ece-intelligence-lab](https://huggingface.co/spaces/RayanMLK/ece-intelligence-lab)

---

## 📖 What is RAG?

**RAG (Retrieval-Augmented Generation)** is a technique that reduces LLM hallucinations by grounding answers in real documents retrieved at query time.

```
User question
        │
        ▼
[Embeddings] ── semantic search ──► [FAISS Vector Store]
        │                                   │
        │                         top-k relevant chunks
        ▼                                   │
[Prompt Template] ◄─────────────────────────┘
        │
        ▼
[Mistral-7B via HF Inference API]
        │
        ▼
Answer + Sources
```
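
The flow in the diagram can be sketched in plain Python. This is a toy illustration under stated assumptions, not the project's code: a bag-of-words vector stands in for the sentence-transformers embeddings, a linear scan stands in for the FAISS index, and the sketch stops at the prompt instead of calling Mistral-7B. The names `embed`, `cosine`, `retrieve`, `build_prompt` and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base chunks; the real ones would come from data/documents/.
CHUNKS = [
    {"text": "The lab provides GPU servers for training models.", "source": "infrastructure.txt"},
    {"text": "Students work on projects with industry partners.", "source": "partners.txt"},
    {"text": "Courses cover machine learning and deep learning.", "source": "pedagogy.txt"},
]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[dict], top_k: int = 2) -> list[dict]:
    """Return the top_k chunks most similar to the question (linear scan, no index)."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c["text"])), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[dict]) -> str:
    """Fill a simple prompt template with the retrieved chunks and their sources."""
    context_block = "\n".join(f"[{c['source']}] {c['text']}" for c in context)
    return (
        "Answer using only the context below. Cite the sources you used.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )


question = "Which GPU servers does the lab provide?"
top_chunks = retrieve(question, CHUNKS, top_k=2)
prompt = build_prompt(question, top_chunks)
print(prompt)  # this string is what would be sent to the LLM
```

Keeping the source name next to each chunk in the prompt is one simple way to let the model return its sources along with the answer.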

---

## 🗂️ Project architecture

```
rag-chatbot/
│
├── app.py                  # Gradio interface (HF Spaces)
├── app_chainlit.py         # Chainlit interface (local, ChatGPT-style)
├── ingest.py               # Document ingestion script (run once)
│
├── src/
│   ├── document_loader.py  # Loads PDF/TXT/DOCX files and splits them into chunks
│   ├── vector_store.py     # Embeddings, FAISS index, and persistence
│   └── rag_chain.py        # RAG pipeline: retriever + prompt + LLM
│
├── data/
│   └── documents/          # 📂 ECE Intelligence Lab knowledge base
│
├── tests/
│   └── test_pipeline.py    # Unit tests (pytest)
│
├── .env.example            # Environment variable template
├── .gitignore
├── requirements.txt
└── README.md
```

**Each file has a single, clear responsibility.**
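
To illustrate the chunk-splitting step attributed to `src/document_loader.py`, here is a minimal sketch of fixed-size chunking with overlap. It shows the general technique only and is not the project's actual splitter; the function name `split_into_chunks` and the `chunk_size` and `overlap` values are hypothetical.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_into_chunks(sample, chunk_size=10, overlap=3)
print(pieces)
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap repeats the tail of each chunk at the head of the next one, so a sentence cut at a boundary still appears whole in at least one chunk.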

---

## 🚀 Local installation

### 1. Clone and install

```bash
git clone https://github.com/Rayanmlk/rag-chatbot.git
cd rag-chatbot
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux / macOS
pip install -r requirements.txt
```

### 2. Configuration

```bash
cp .env.example .env
# Add your Hugging Face token to .env
# Get one at: https://huggingface.co/settings/tokens
```

### 3. Ingest the documents (run once)

```bash
python ingest.py
```

### 4. Launch the chatbot

```bash
# Chainlit interface (recommended for local use)
chainlit run app_chainlit.py
# → http://localhost:8000

# Gradio interface
python app.py
# → http://localhost:7860
```

---

## ⚙️ Configuration (.env)

| Variable | Default | Description |
|----------|---------|-------------|
| `HUGGINGFACE_API_TOKEN` | (none) | **Required.** HF token (Read access) |
| `LLM_MODEL_ID` | `mistralai/Mistral-7B-Instruct-v0.2` | LLM model |
| `EMBEDDING_MODEL_ID` | `sentence-transformers/all-MiniLM-L6-v2` | Embedding model |
| `RETRIEVER_TOP_K` | `4` | Chunks retrieved per query |
| `MAX_NEW_TOKENS` | `512` | Maximum number of generated tokens |
| `TEMPERATURE` | `0.3` | 0 = factual, 1 = creative |
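
A minimal sketch of how these variables could be read with the defaults from the table. How the project actually loads its `.env` file is not shown in this README, so the `Settings` dataclass and the `load_settings` helper are hypothetical names, and the sketch reads from a plain mapping (`os.environ` by default).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    hf_token: str
    llm_model_id: str
    embedding_model_id: str
    retriever_top_k: int
    max_new_tokens: int
    temperature: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    token = env.get("HUGGINGFACE_API_TOKEN", "").strip()
    if not token:
        # The token is the only variable without a default.
        raise RuntimeError("HUGGINGFACE_API_TOKEN is required")
    return Settings(
        hf_token=token,
        llm_model_id=env.get("LLM_MODEL_ID", "mistralai/Mistral-7B-Instruct-v0.2"),
        embedding_model_id=env.get("EMBEDDING_MODEL_ID", "sentence-transformers/all-MiniLM-L6-v2"),
        retriever_top_k=int(env.get("RETRIEVER_TOP_K", "4")),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "512")),
        temperature=float(env.get("TEMPERATURE", "0.3")),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"HUGGINGFACE_API_TOKEN": "hf_xxx", "TEMPERATURE": "0.1"})
print(settings.llm_model_id, settings.retriever_top_k, settings.temperature)
```

Failing fast on a missing token gives a clear error at startup instead of an authentication failure on the first request.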

---

## 🧪 Tests

```bash
pytest tests/ -v
```

---

## 🔧 Tech stack

| Library | Role |
|---------|------|
| **LangChain** | RAG pipeline orchestration (LCEL) |
| **Hugging Face** | LLM inference via API (Mistral-7B) |
| **sentence-transformers** | Local embedding model |
| **FAISS** | Fast vector search |
| **Gradio** | Web interface (HF Spaces) |
| **Chainlit** | Local ChatGPT-style interface |

---

## 📚 References

- [ECE Intelligence Lab](https://www.ece.fr/intelligence-lab/)
- [LangChain RAG](https://python.langchain.com/docs/use_cases/question_answering/)
- [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index)
- [Original RAG paper (Lewis et al., 2020)](https://arxiv.org/abs/2005.11401)

---

*Project by **Rayan MALKI**, M1 Data & AI student*