---
title: BAAI Vector Api
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
models:
  - Noblhyon/BAAI_Vector_Api
tags:
  - embeddings
  - sentence-similarity
  - multilingual
  - retrieval
  - bge-m3
  - flask
  - api
---

# BGE-M3 Vector API 🚀

A Flask-based REST API for the **BGE-M3** embedding model, featuring multi-functionality, multi-linguality, and multi-granularity text processing.

## 🌟 Features

### Multi-Functionality

- **Dense Retrieval**: Traditional single-vector embeddings
- **Sparse Retrieval**: Lexical matching similar to BM25
- **Multi-Vector**: ColBERT-style token-level representations

### Multi-Linguality

- Support for **100+ languages**
- Cross-lingual semantic understanding
- Unified multilingual representation

### Multi-Granularity

- Process text from short sentences to long documents
- Handle up to **8192 tokens** in a single input
- Consistent performance across different text lengths

## 🔧 API Endpoints

### Base Information

- `GET /` - API information and available endpoints
- `GET /health` - Health check endpoint

### Core Functionality

- `POST /embed` - Generate embeddings for text(s)
- `POST /similarity` - Compute similarity between text pairs
- `POST /search` - Search through documents using semantic similarity

## 📚 API Usage Examples

### 1. Generate Embeddings

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Hello world", "How are you?"],
    "return_dense": true,
    "return_sparse": false,
    "max_length": 512
  }'
```

**Response:**

```json
{
  "success": true,
  "num_texts": 2,
  "processing_time": 0.123,
  "dense_embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
  "dense_shape": [2, 1024]
}
```

### 2. Compute Similarity

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/similarity \
  -H "Content-Type: application/json" \
  -d '{
    "pairs": [["Hello world", "Hi there"], ["Cat", "Dog"]],
    "method": "all"
  }'
```

**Response:**

```json
{
  "success": true,
  "method": "all",
  "num_pairs": 2,
  "processing_time": 0.234,
  "scores": {
    "dense": [0.8234, 0.4567],
    "sparse": [0.1234, 0.0567],
    "colbert": [0.7890, 0.5432],
    "combined": [0.7456, 0.4123]
  }
}
```

### 3. Document Search

```bash
curl -X POST https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "machine learning",
    "documents": [
      "Deep learning is a subset of machine learning",
      "Cats are cute animals",
      "Neural networks are used in AI"
    ],
    "top_k": 2
  }'
```

**Response:**

```json
{
  "success": true,
  "query": "machine learning",
  "num_documents": 3,
  "top_k": 2,
  "processing_time": 0.345,
  "results": [
    {
      "rank": 1,
      "document_index": 0,
      "document": "Deep learning is a subset of machine learning",
      "similarity_score": 0.8765
    },
    {
      "rank": 2,
      "document_index": 2,
      "document": "Neural networks are used in AI",
      "similarity_score": 0.6543
    }
  ]
}
```

## 🔧 Model Details

- **Model**: `Noblhyon/BAAI_Vector_Api`
- **Architecture**: XLM-RoBERTa based
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 8192 tokens
- **Languages**: 100+ supported

## 🚀 Python Client Example

```python
import requests

# API base URL
BASE_URL = "https://huggingface.co/spaces/Noblhyon/BAAI_Vector_Api"

def get_embeddings(texts):
    response = requests.post(
        f"{BASE_URL}/embed",
        json={
            "texts": texts,
            "return_dense": True,
            "max_length": 512
        }
    )
    return response.json()

def compute_similarity(text1, text2):
    response = requests.post(
        f"{BASE_URL}/similarity",
        json={
            "pairs": [[text1, text2]],
            "method": "all"
        }
    )
    return response.json()

def search_documents(query, documents, top_k=5):
    response = requests.post(
        f"{BASE_URL}/search",
        json={
            "query": query,
            "documents": documents,
            "top_k": top_k
        }
    )
    return response.json()

# Example usage
embeddings = get_embeddings(["Hello world", "How are you?"])
similarity = compute_similarity("Hello", "Hi")
search_results = search_documents("AI", ["Machine learning", "Cooking", "Neural networks"])
```

## 📊 Performance

BGE-M3 achieves state-of-the-art performance on various benchmarks:

- **MIRACL**: Multilingual retrieval
- **MKQA**: Cross-lingual question answering
- **MLDR**: Long document retrieval
- **NarrativeQA**: Long text understanding

## 📚 Citation

```bibtex
@misc{bge-m3,
  title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
  author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
  year={2024},
  eprint={2402.03216},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## 🔗 Links

- **Model Repository**: [Noblhyon/BAAI_Vector_Api](https://huggingface.co/Noblhyon/BAAI_Vector_Api)
- **Original Paper**: [BGE M3-Embedding](https://arxiv.org/pdf/2402.03216.pdf)
- **GitHub**: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding)

---

*Built with ❤️ using Flask and Docker*
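
## 🧮 Appendix: Combining Similarity Scores

The `/similarity` endpoint returns a `combined` score alongside the per-method scores, but its server-side weighting is not documented here. As a minimal sketch, a client could blend the three signals itself with a fixed weighting; the 0.4/0.2/0.4 dense/sparse/ColBERT split below is an assumption (a split sometimes used with BGE-M3), not this API's confirmed formula.

```python
# Sketch: recombining per-method similarity scores client-side.
# The 0.4 / 0.2 / 0.4 weights are an assumption, not the documented
# behaviour of this API's "combined" score.

DENSE_W, SPARSE_W, COLBERT_W = 0.4, 0.2, 0.4

def combine_scores(dense, sparse, colbert):
    """Weighted sum of dense, sparse, and ColBERT scores for each pair."""
    return [
        DENSE_W * d + SPARSE_W * s + COLBERT_W * c
        for d, s, c in zip(dense, sparse, colbert)
    ]

# Per-method scores taken from the example /similarity response
scores = combine_scores(
    dense=[0.8234, 0.4567],
    sparse=[0.1234, 0.0567],
    colbert=[0.7890, 0.5432],
)
```

Tuning these weights toward dense or sparse scores trades semantic matching against exact lexical matching for a given workload.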