---
title: BAAI Vector Api
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
models:
  - Noblhyon/BAAI_Vector_Api
tags:
  - embeddings
  - sentence-similarity
  - multilingual
  - retrieval
  - bge-m3
  - flask
  - api
---

# BGE-M3 Vector API 🚀

A Flask-based REST API for the **BGE-M3** embedding model, featuring multi-functionality, multi-linguality, and multi-granularity text processing.

## 🌟 Features

### Multi-Functionality

- **Dense Retrieval**: Traditional single-vector embeddings
- **Sparse Retrieval**: Lexical matching similar to BM25
- **Multi-Vector**: ColBERT-style token-level representations

### Multi-Linguality

- Support for **100+ languages**
- Cross-lingual semantic understanding
- Unified multilingual representation

### Multi-Granularity

- Process text from short sentences to long documents
- Handle up to **8192 tokens** in a single input
- Consistent performance across different text lengths

## 🔧 API Endpoints

### Base Information

- `GET /` - API information and available endpoints
- `GET /health` - Health check endpoint

### Core Functionality

- `POST /embed` - Generate embeddings for text(s)
- `POST /similarity` - Compute similarity between text pairs
- `POST /search` - Search through documents using semantic similarity

## 📚 API Usage Examples

### 1. Generate Embeddings

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Hello world", "How are you?"],
    "return_dense": true,
    "return_sparse": false,
    "max_length": 512
  }'
```

**Response:**

```json
{
  "success": true,
  "num_texts": 2,
  "processing_time": 0.123,
  "dense_embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
  "dense_shape": [2, 1024]
}
```

### 2. Compute Similarity

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/similarity \
  -H "Content-Type: application/json" \
  -d '{
    "pairs": [["Hello world", "Hi there"], ["Cat", "Dog"]],
    "method": "all"
  }'
```

**Response:**

```json
{
  "success": true,
  "method": "all",
  "num_pairs": 2,
  "processing_time": 0.234,
  "scores": {
    "dense": [0.8234, 0.4567],
    "sparse": [0.1234, 0.0567],
    "colbert": [0.7890, 0.5432],
    "combined": [0.7456, 0.4123]
  }
}
```

### 3. Document Search

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "machine learning",
    "documents": [
      "Deep learning is a subset of machine learning",
      "Cats are cute animals",
      "Neural networks are used in AI"
    ],
    "top_k": 2
  }'
```

**Response:**

```json
{
  "success": true,
  "query": "machine learning",
  "num_documents": 3,
  "top_k": 2,
  "processing_time": 0.345,
  "results": [
    {
      "rank": 1,
      "document_index": 0,
      "document": "Deep learning is a subset of machine learning",
      "similarity_score": 0.8765
    },
    {
      "rank": 2,
      "document_index": 2,
      "document": "Neural networks are used in AI",
      "similarity_score": 0.6543
    }
  ]
}
```

## 🔧 Model Details

- **Model**: `Noblhyon/BAAI_Vector_Api`
- **Architecture**: XLM-RoBERTa based
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 8192 tokens
- **Languages**: 100+ supported

## 🚀 Python Client Example

```python
import requests

# API base URL
BASE_URL = "https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api"

def get_embeddings(texts):
    response = requests.post(
        f"{BASE_URL}/embed",
        json={
            "texts": texts,
            "return_dense": True,
            "max_length": 512
        }
    )
    return response.json()

def compute_similarity(text1, text2):
    response = requests.post(
        f"{BASE_URL}/similarity",
        json={
            "pairs": [[text1, text2]],
            "method": "all"
        }
    )
    return response.json()

def search_documents(query, documents, top_k=5):
    response = requests.post(
        f"{BASE_URL}/search",
        json={
            "query": query,
            "documents": documents,
            "top_k": top_k
        }
    )
    return response.json()

# Example usage
embeddings = get_embeddings(["Hello world", "How are you?"])
similarity = compute_similarity("Hello", "Hi")
search_results = search_documents("AI", ["Machine learning", "Cooking", "Neural networks"])
```

## 📊 Performance

BGE-M3 achieves state-of-the-art performance on various benchmarks:

- **MIRACL**: Multilingual retrieval
- **MKQA**: Cross-lingual question answering
- **MLDR**: Long document retrieval
- **NarrativeQA**: Long text understanding

## 📚 Citation

```bibtex
@misc{bge-m3,
  title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
  author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
  year={2024},
  eprint={2402.03216},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## 🔗 Links

- **Model Repository**: [Noblhyon/BAAI_Vector_Api](https://huggingface.co/Noblhyon/BAAI_Vector_Api)
- **Original Paper**: [BGE M3-Embedding](https://arxiv.org/pdf/2402.03216.pdf)
- **GitHub**: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding)

---

*Built with ❤️ using Flask and Docker*
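
## 🧮 Appendix: Combining Similarity Scores

The `/similarity` endpoint returns a `combined` score alongside the per-method scores, but its server-side weighting is not documented here. As a minimal sketch, a client could blend the three signals itself with a fixed weighting; the 0.4/0.2/0.4 dense/sparse/ColBERT split below is an assumption (a split sometimes used with BGE-M3), not this API's confirmed formula.

```python
# Sketch: recombining per-method similarity scores client-side.
# The 0.4 / 0.2 / 0.4 weights are an assumption, not the documented
# behaviour of this API's "combined" score.

DENSE_W, SPARSE_W, COLBERT_W = 0.4, 0.2, 0.4

def combine_scores(dense, sparse, colbert):
    """Weighted sum of dense, sparse, and ColBERT scores for each pair."""
    return [
        DENSE_W * d + SPARSE_W * s + COLBERT_W * c
        for d, s, c in zip(dense, sparse, colbert)
    ]

# Per-method scores taken from the example /similarity response
scores = combine_scores(
    dense=[0.8234, 0.4567],
    sparse=[0.1234, 0.0567],
    colbert=[0.7890, 0.5432],
)
```

Tuning these weights toward dense or sparse scores trades semantic matching against exact lexical matching for a given workload.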