SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model fine-tuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Deposium Benchmark Results (2026-02-25)

ONNX INT8 version available: tss-deposium/bge-m3-matryoshka-1024d-onnx-int8 — CPU-optimized, ~571 MB, same quality.
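
For CPU deployments, recent sentence-transformers releases can load ONNX weights directly. A minimal sketch, assuming the INT8 repository ships a sentence-transformers-compatible ONNX export (requires pip install sentence-transformers[onnx]):

from sentence_transformers import SentenceTransformer

# Assumption: the INT8 repo follows the standard sentence-transformers
# ONNX layout, so the onnx backend can pick up the exported model.
model = SentenceTransformer(
    "tss-deposium/bge-m3-matryoshka-1024d-onnx-int8",
    backend="onnx",
    device="cpu",
)
print(model.encode("Bonjour").shape)  # (1024,)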

Why Matryoshka?

This model was fine-tuned with MatryoshkaLoss to allow safe dimension truncation. Instead of training separate models for each dimension, a single model produces embeddings where the first N dimensions form a valid, high-quality embedding space.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tss-deposium/bge-m3-matryoshka-1024d")
embedding = model.encode("Bonjour")   # [1024] full resolution
embedding_512 = embedding[:512]       # [512] still good: -5.4% quality, -50% storage

Benchmark: Discrimination by Dimension

Tested on 4 semantic pairs (FR/EN cross-lingual) + 2 negative pairs. Discrimination = avg_positive_similarity - avg_negative_similarity (higher = better).

| Model | Dim | Discrim. | vs M2V baseline | Storage (1M vectors) |
|-------|-----|----------|-----------------|----------------------|
| m2v-bge-m3-1024d (GPU) | 1024 | 0.312 | baseline | 4 GB |
| gpahal/bge-m3-onnx-int8 | 1024 | 0.377 | +20.8% | 4 GB |
| This model | 1024 | 0.403 | +29.2% | 4 GB |
| This model | 768 | 0.397 | +27.2% | 3 GB |
| This model | 512 | 0.381 | +22.1% | 2 GB |
| This model | 256 | anomalous | — | 1 GB |

Key finding: at 512D, this model still outperforms the full-resolution bge-m3-onnx-int8 at 1024D (+1.1%), with half the storage.
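
For reference, here is a minimal sketch of how a discrimination score of this kind can be computed. The pair lists are hypothetical placeholders, not the actual benchmark pairs, and the helper name avg_similarity is ours:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tss-deposium/bge-m3-matryoshka-1024d")

# Hypothetical stand-ins for the FR/EN positive and negative pairs.
positive_pairs = [("couple de serrage maximal", "maximum tightening torque")]
negative_pairs = [("couple de serrage maximal", "quarterly revenue grew 12%")]

def avg_similarity(pairs, dim):
    sims = []
    for text_a, text_b in pairs:
        emb_a, emb_b = model.encode([text_a, text_b])[:, :dim]  # Matryoshka truncation
        emb_a /= np.linalg.norm(emb_a)  # renormalize after truncation
        emb_b /= np.linalg.norm(emb_b)
        sims.append(float(emb_a @ emb_b))
    return float(np.mean(sims))

for dim in (1024, 768, 512, 256):
    # discrimination = avg positive similarity - avg negative similarity
    print(dim, avg_similarity(positive_pairs, dim) - avg_similarity(negative_pairs, dim))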

Per-Pair Cosine Similarity

| Pair | onnx-1024 | matr-1024 | matr-768 | matr-512 | matr-256 |
|------|-----------|-----------|----------|----------|----------|
| couple_serrage (FR/EN) | 0.351 | 0.284 | 0.321 | 0.371 | 0.197 |
| fogg_depart (FR/FR) | 0.837 | 0.866 | 0.873 | 0.878 | 0.825 |
| revenue_q2 (EN/FR) | 0.645 | 0.686 | 0.697 | 0.711 | 0.658 |
| moteur_spec (FR/EN) | 0.946 | 0.954 | 0.954 | 0.955 | 0.950 |

Recommended Dimensions

  • 1024D: Maximum quality, use when storage is not a concern
  • 768D: Safe (-1.4%), -25% storage. Good default.
  • 512D: Best cost/quality ratio for cloud CPU. Still beats bge-m3-onnx at 1024D (see the truncation sketch below).
  • 256D: Not recommended — loses cross-lingual understanding.
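
Rather than slicing embeddings manually, sentence-transformers can apply the truncation at encode time via the truncate_dim constructor argument. A minimal sketch for a 512D deployment; cosine similarity is scale-invariant, so truncation alone is fine there, but renormalize truncated vectors before feeding a dot-product index:

from sentence_transformers import SentenceTransformer

# truncate_dim (available since sentence-transformers v2.7) truncates every
# embedding to the first 512 Matryoshka dimensions.
model = SentenceTransformer("tss-deposium/bge-m3-matryoshka-1024d", truncate_dim=512)

embeddings = model.encode(["Bonjour", "Hello"])
print(embeddings.shape)                          # (2, 512)
print(model.similarity(embeddings, embeddings))  # cosine, unaffected by vector scale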

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Parameters: ~0.6B (F32)
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
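
For readers who want to skip the sentence-transformers wrapper, here is a minimal sketch of the same pipeline in plain transformers, assuming the repository exposes standard XLM-RoBERTa weights: encode, take the CLS token (module 1 above), then L2-normalize (module 2):

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "tss-deposium/bge-m3-matryoshka-1024d"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

batch = tokenizer(["Bonjour"], return_tensors="pt", truncation=True, max_length=8192)
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state
cls = hidden[:, 0]                                        # CLS-token pooling
embeddings = torch.nn.functional.normalize(cls, dim=-1)   # Normalize()
print(embeddings.shape)  # torch.Size([1, 1024])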

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tss-deposium/bge-m3-matryoshka-1024d")
# Run inference
sentences = [
    'Four hikers are walking up stairs on a small hill.',
    'The people are outside.',
    "Four people are shown in a gritty basement setting with blue walls and a white door on the ceiling; two of the people wear black t-shirts with a skull-and-crossbones and the words' starve poverty'.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6442, 0.2925],
#         [0.6442, 1.0000, 0.2358],
#         [0.2925, 0.2358, 1.0000]])

Training Details

Training Dataset

Unnamed Dataset

  • Size: 672,676 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:

    |         | anchor                                             | positive                                           |
    |---------|----------------------------------------------------|----------------------------------------------------|
    | type    | string                                             | string                                             |
    | details | min: 5 tokens, mean: 19.75 tokens, max: 254 tokens | min: 3 tokens, mean: 34.59 tokens, max: 499 tokens |

  • Samples:

    | anchor | positive |
    |--------|----------|
    | As the brown dog looks the other way, a large black and white dog plays with a smaller black dog. | Three dogs are together somewhere. |
    | Two young men in black punk rock clothing are sitting on the floor at a playground. | two men are sitting in a playground |
    | The man is wearing a shirt. | A man wearing a black t-shirt is playing seven string bass a stage. |
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
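
For context, these parameters correspond to wrapping the in-batch-negatives ranking loss in MatryoshkaLoss. A minimal sketch of an equivalent setup (the training loop itself is omitted):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-m3")

# Apply the ranking loss at every Matryoshka dimension with equal weight,
# matching the parameters listed above.
base_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[1024, 768, 512, 256],
    matryoshka_weights=[1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all dimensions at each step
)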
    

Evaluation Dataset

Unnamed Dataset

  • Size: 35,405 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:

    |         | anchor                                             | positive                                           |
    |---------|----------------------------------------------------|----------------------------------------------------|
    | type    | string                                             | string                                             |
    | details | min: 5 tokens, mean: 19.06 tokens, max: 122 tokens | min: 4 tokens, mean: 36.56 tokens, max: 477 tokens |

  • Samples:

    | anchor | positive |
    |--------|----------|
    | A woman in a red dress and a man in a white suite are engaging in a ballet performance on a purple lit stage. | A man and woman are dancing. |
    | A man wearing reflective sunglasses in a crowd. | A man is wearing sunshades. |
    | Man with red shoes, white shirt and gray pants climbing. | Man is climbing. |
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • dataloader_pin_memory: False
  • gradient_checkpointing: True
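
A minimal sketch of how these non-default values could map onto a v3+ SentenceTransformerTrainingArguments configuration; output_dir is a placeholder:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="bge-m3-matryoshka-1024d",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
    dataloader_pin_memory=False,
    gradient_checkpointing=True,
)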

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: False
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0048 100 1.2365 -
0.0095 200 1.1962 -
0.0143 300 0.8874 -
0.0190 400 0.907 -
0.0238 500 0.8883 0.7493
0.0285 600 0.7945 -
0.0333 700 0.7591 -
0.0381 800 0.7265 -
0.0428 900 0.782 -
0.0476 1000 0.6898 0.6750
0.0523 1100 0.7668 -
0.0571 1200 0.839 -
0.0618 1300 0.6697 -
0.0666 1400 0.7014 -
0.0714 1500 0.741 0.6327
0.0761 1600 0.7033 -
0.0809 1700 0.7595 -
0.0856 1800 0.6845 -
0.0904 1900 0.7591 -
0.0951 2000 0.7153 0.6154
0.0999 2100 0.6861 -
0.1047 2200 0.7151 -
0.1094 2300 0.5997 -
0.1142 2400 0.633 -
0.1189 2500 0.6318 0.6044
0.1237 2600 0.6289 -
0.1284 2700 0.631 -
0.1332 2800 0.6986 -
0.1380 2900 0.6594 -
0.1427 3000 0.6295 0.5899
0.1475 3100 0.6494 -
0.1522 3200 0.6393 -
0.1570 3300 0.5855 -
0.1617 3400 0.6057 -
0.1665 3500 0.6009 0.5922
0.1712 3600 0.6307 -
0.1760 3700 0.6032 -
0.1808 3800 0.5862 -
0.1855 3900 0.6514 -
0.1903 4000 0.5814 0.5605
0.1950 4100 0.7021 -
0.1998 4200 0.5975 -
0.2045 4300 0.6037 -
0.2093 4400 0.5936 -
0.2141 4500 0.6214 0.5786
0.2188 4600 0.6136 -
0.2236 4700 0.5722 -
0.2283 4800 0.6056 -
0.2331 4900 0.5931 -
0.2378 5000 0.666 0.5665
0.2426 5100 0.5996 -
0.2474 5200 0.6105 -
0.2521 5300 0.6273 -
0.2569 5400 0.6868 -
0.2616 5500 0.5339 0.5799
0.2664 5600 0.6471 -
0.2711 5700 0.5705 -
0.2759 5800 0.6521 -
0.2807 5900 0.6084 -
0.2854 6000 0.616 0.5630
0.2902 6100 0.6128 -
0.2949 6200 0.5838 -
0.2997 6300 0.5029 -
0.3044 6400 0.623 -
0.3092 6500 0.5841 0.5566
0.3140 6600 0.5746 -
0.3187 6700 0.5202 -
0.3235 6800 0.5921 -
0.3282 6900 0.5642 -
0.3330 7000 0.6183 0.5364
0.3377 7100 0.5632 -
0.3425 7200 0.5062 -
0.3473 7300 0.4998 -
0.3520 7400 0.5703 -
0.3568 7500 0.5544 0.5469
0.3615 7600 0.5461 -
0.3663 7700 0.5716 -
0.3710 7800 0.5733 -
0.3758 7900 0.5549 -
0.3806 8000 0.5658 0.5347
0.3853 8100 0.5841 -
0.3901 8200 0.5051 -
0.3948 8300 0.4764 -
0.3996 8400 0.626 -
0.4043 8500 0.5284 0.5159
0.4091 8600 0.5733 -
0.4139 8700 0.5064 -
0.4186 8800 0.5758 -
0.4234 8900 0.5735 -
0.4281 9000 0.5811 0.4957
0.4329 9100 0.4942 -
0.4376 9200 0.5554 -
0.4424 9300 0.5678 -
0.4472 9400 0.529 -
0.4519 9500 0.4851 0.4828
0.4567 9600 0.4621 -
0.4614 9700 0.5172 -
0.4662 9800 0.4862 -
0.4709 9900 0.4796 -
0.4757 10000 0.4548 0.4830
0.4804 10100 0.492 -
0.4852 10200 0.4963 -
0.4900 10300 0.4963 -
0.4947 10400 0.4664 -
0.4995 10500 0.4786 0.4889
0.5042 10600 0.4631 -
0.5090 10700 0.4932 -
0.5137 10800 0.5028 -
0.5185 10900 0.4905 -
0.5233 11000 0.4544 0.4653
0.5280 11100 0.4681 -
0.5328 11200 0.5148 -
0.5375 11300 0.4606 -
0.5423 11400 0.4743 -
0.5470 11500 0.4904 0.4522
0.5518 11600 0.526 -
0.5566 11700 0.4677 -
0.5613 11800 0.4964 -
0.5661 11900 0.5397 -
0.5708 12000 0.5114 0.4529
0.5756 12100 0.4969 -
0.5803 12200 0.4959 -
0.5851 12300 0.4258 -
0.5899 12400 0.4875 -
0.5946 12500 0.4807 0.4374
0.5994 12600 0.4994 -
0.6041 12700 0.3952 -
0.6089 12800 0.4229 -
0.6136 12900 0.466 -
0.6184 13000 0.4637 0.4240
0.6232 13100 0.4637 -
0.6279 13200 0.4177 -
0.6327 13300 0.4338 -
0.6374 13400 0.4296 -
0.6422 13500 0.4401 0.4241
0.6469 13600 0.4643 -
0.6517 13700 0.3955 -
0.6565 13800 0.4819 -
0.6612 13900 0.4793 -
0.6660 14000 0.458 0.4321
0.6707 14100 0.4382 -
0.6755 14200 0.4201 -
0.6802 14300 0.4567 -
0.6850 14400 0.4488 -
0.6898 14500 0.399 0.4185
0.6945 14600 0.3928 -
0.6993 14700 0.4477 -
0.7040 14800 0.4592 -
0.7088 14900 0.393 -
0.7135 15000 0.4394 0.4024
0.7183 15100 0.4117 -
0.7231 15200 0.3872 -
0.7278 15300 0.4194 -
0.7326 15400 0.3954 -
0.7373 15500 0.4439 0.3979
0.7421 15600 0.3534 -
0.7468 15700 0.4407 -
0.7516 15800 0.4586 -
0.7564 15900 0.3718 -
0.7611 16000 0.449 0.3999
0.7659 16100 0.4213 -
0.7706 16200 0.4192 -
0.7754 16300 0.4121 -
0.7801 16400 0.3409 -
0.7849 16500 0.388 0.3905
0.7896 16600 0.3648 -
0.7944 16700 0.4352 -
0.7992 16800 0.424 -
0.8039 16900 0.4363 -
0.8087 17000 0.426 0.3969
0.8134 17100 0.5142 -
0.8182 17200 0.3944 -
0.8229 17300 0.4604 -
0.8277 17400 0.3765 -
0.8325 17500 0.4707 0.3765
0.8372 17600 0.3848 -
0.8420 17700 0.3869 -
0.8467 17800 0.4391 -
0.8515 17900 0.4037 -
0.8562 18000 0.3394 0.3758
0.8610 18100 0.3987 -
0.8658 18200 0.3238 -
0.8705 18300 0.4504 -
0.8753 18400 0.4041 -
0.8800 18500 0.3812 0.3778
0.8848 18600 0.3602 -
0.8895 18700 0.3782 -
0.8943 18800 0.3781 -
0.8991 18900 0.4069 -
0.9038 19000 0.3682 0.3691
0.9086 19100 0.4038 -
0.9133 19200 0.3652 -
0.9181 19300 0.3383 -
0.9228 19400 0.4312 -
0.9276 19500 0.3837 0.3660
0.9324 19600 0.3733 -
0.9371 19700 0.3542 -
0.9419 19800 0.406 -
0.9466 19900 0.3632 -
0.9514 20000 0.3984 0.3783
0.9561 20100 0.3984 -
0.9609 20200 0.38 -
0.9657 20300 0.388 -
0.9704 20400 0.3766 -
0.9752 20500 0.3298 0.3498
0.9799 20600 0.3308 -
0.9847 20700 0.3884 -
0.9894 20800 0.3674 -
0.9942 20900 0.4107 -
0.9990 21000 0.3739 0.3513
1.0037 21100 0.3398 -
1.0085 21200 0.3711 -
1.0132 21300 0.265 -
1.0180 21400 0.3464 -
1.0227 21500 0.3265 0.3463
1.0275 21600 0.274 -
1.0323 21700 0.3063 -
1.0370 21800 0.2679 -
1.0418 21900 0.3099 -
1.0465 22000 0.35 0.3533
1.0513 22100 0.3021 -
1.0560 22200 0.3505 -
1.0608 22300 0.2589 -
1.0656 22400 0.3791 -
1.0703 22500 0.3113 0.3460
1.0751 22600 0.3624 -
1.0798 22700 0.3676 -
1.0846 22800 0.3194 -
1.0893 22900 0.343 -
1.0941 23000 0.3446 0.3370
1.0988 23100 0.4403 -
1.1036 23200 0.2646 -
1.1084 23300 0.3115 -
1.1131 23400 0.3024 -
1.1179 23500 0.3613 0.3407
1.1226 23600 0.306 -
1.1274 23700 0.298 -
1.1321 23800 0.3751 -
1.1369 23900 0.288 -
1.1417 24000 0.2877 0.3472
1.1464 24100 0.2986 -
1.1512 24200 0.3018 -
1.1559 24300 0.3603 -
1.1607 24400 0.3413 -
1.1654 24500 0.3171 0.3266
1.1702 24600 0.3096 -
1.1750 24700 0.348 -
1.1797 24800 0.3971 -
1.1845 24900 0.3127 -
1.1892 25000 0.3162 0.3256
1.1940 25100 0.3035 -
1.1987 25200 0.2711 -
1.2035 25300 0.2615 -
1.2083 25400 0.301 -
1.2130 25500 0.3146 0.3297
1.2178 25600 0.3389 -
1.2225 25700 0.3027 -
1.2273 25800 0.329 -
1.2320 25900 0.3478 -
1.2368 26000 0.2924 0.3179
1.2416 26100 0.331 -
1.2463 26200 0.3109 -
1.2511 26300 0.3033 -
1.2558 26400 0.2905 -
1.2606 26500 0.2989 0.3228
1.2653 26600 0.3156 -
1.2701 26700 0.3124 -
1.2749 26800 0.3052 -
1.2796 26900 0.272 -
1.2844 27000 0.3114 0.3213
1.2891 27100 0.3205 -
1.2939 27200 0.279 -
1.2986 27300 0.2678 -
1.3034 27400 0.2663 -
1.3082 27500 0.2885 0.3159
1.3129 27600 0.2889 -
1.3177 27700 0.327 -
1.3224 27800 0.3169 -
1.3272 27900 0.3398 -
1.3319 28000 0.2835 0.3097
1.3367 28100 0.3434 -
1.3415 28200 0.2885 -
1.3462 28300 0.3164 -
1.3510 28400 0.3618 -
1.3557 28500 0.26 0.3106
1.3605 28600 0.2671 -
1.3652 28700 0.2745 -
1.3700 28800 0.2531 -
1.3748 28900 0.2954 -
1.3795 29000 0.2679 0.3106
1.3843 29100 0.3344 -
1.3890 29200 0.3315 -
1.3938 29300 0.2603 -
1.3985 29400 0.2822 -
1.4033 29500 0.3416 0.3012
1.4080 29600 0.3274 -
1.4128 29700 0.3179 -
1.4176 29800 0.2861 -
1.4223 29900 0.2574 -
1.4271 30000 0.2261 0.3072
1.4318 30100 0.318 -
1.4366 30200 0.2942 -
1.4413 30300 0.2831 -
1.4461 30400 0.2801 -
1.4509 30500 0.2433 0.2977
1.4556 30600 0.2805 -
1.4604 30700 0.2909 -
1.4651 30800 0.2996 -
1.4699 30900 0.2945 -
1.4746 31000 0.2686 0.2890
1.4794 31100 0.2498 -
1.4842 31200 0.3214 -
1.4889 31300 0.281 -
1.4937 31400 0.25 -
1.4984 31500 0.2648 0.2862
1.5032 31600 0.297 -
1.5079 31700 0.298 -
1.5127 31800 0.2675 -
1.5175 31900 0.268 -
1.5222 32000 0.2662 0.2827
1.5270 32100 0.2227 -
1.5317 32200 0.2764 -
1.5365 32300 0.2499 -
1.5412 32400 0.2789 -
1.5460 32500 0.2522 0.2806
1.5508 32600 0.3053 -
1.5555 32700 0.2367 -
1.5603 32800 0.3354 -
1.5650 32900 0.2504 -
1.5698 33000 0.2766 0.2782
1.5745 33100 0.2338 -
1.5793 33200 0.2539 -
1.5841 33300 0.3004 -
1.5888 33400 0.2705 -
1.5936 33500 0.2613 0.2759
1.5983 33600 0.2618 -
1.6031 33700 0.2459 -
1.6078 33800 0.2349 -
1.6126 33900 0.3481 -
1.6174 34000 0.2243 0.2793
1.6221 34100 0.2564 -
1.6269 34200 0.2643 -
1.6316 34300 0.356 -
1.6364 34400 0.2273 -
1.6411 34500 0.2577 0.2724
1.6459 34600 0.2958 -
1.6507 34700 0.2778 -
1.6554 34800 0.2608 -
1.6602 34900 0.2667 -
1.6649 35000 0.2377 0.2736
1.6697 35100 0.2787 -
1.6744 35200 0.2698 -
1.6792 35300 0.2505 -
1.6840 35400 0.2794 -
1.6887 35500 0.2382 0.2656
1.6935 35600 0.2373 -
1.6982 35700 0.2303 -
1.7030 35800 0.1983 -
1.7077 35900 0.29 -
1.7125 36000 0.2608 0.2707
1.7172 36100 0.2978 -
1.7220 36200 0.3158 -
1.7268 36300 0.2679 -
1.7315 36400 0.245 -
1.7363 36500 0.2423 0.2768
1.7410 36600 0.2301 -
1.7458 36700 0.2189 -
1.7505 36800 0.2335 -
1.7553 36900 0.2773 -
1.7601 37000 0.2448 0.2723
1.7648 37100 0.2404 -
1.7696 37200 0.2733 -
1.7743 37300 0.2075 -
1.7791 37400 0.2489 -
1.7838 37500 0.2678 0.2564
1.7886 37600 0.2473 -
1.7934 37700 0.2401 -
1.7981 37800 0.2334 -
1.8029 37900 0.2712 -
1.8076 38000 0.2631 0.2622
1.8124 38100 0.2375 -
1.8171 38200 0.2644 -
1.8219 38300 0.2028 -
1.8267 38400 0.2653 -
1.8314 38500 0.2161 0.2590
1.8362 38600 0.2494 -
1.8409 38700 0.2457 -
1.8457 38800 0.2316 -
1.8504 38900 0.1991 -
1.8552 39000 0.2342 0.2565
1.8600 39100 0.2326 -
1.8647 39200 0.25 -
1.8695 39300 0.237 -
1.8742 39400 0.2329 -
1.8790 39500 0.2613 0.2566
1.8837 39600 0.2363 -
1.8885 39700 0.2362 -
1.8933 39800 0.2354 -
1.8980 39900 0.2374 -
1.9028 40000 0.2586 0.2545
1.9075 40100 0.2231 -
1.9123 40200 0.2653 -
1.9170 40300 0.2537 -
1.9218 40400 0.206 -
1.9266 40500 0.2342 0.2506
1.9313 40600 0.2343 -
1.9361 40700 0.2034 -
1.9408 40800 0.2383 -
1.9456 40900 0.2805 -
1.9503 41000 0.2499 0.2474
1.9551 41100 0.3116 -
1.9599 41200 0.2522 -
1.9646 41300 0.2264 -
1.9694 41400 0.2398 -
1.9741 41500 0.2239 0.2487
1.9789 41600 0.2299 -
1.9836 41700 0.2262 -
1.9884 41800 0.2522 -
1.9932 41900 0.2332 -
1.9979 42000 0.2221 0.2487
2.0027 42100 0.245 -
2.0074 42200 0.1898 -
2.0122 42300 0.2015 -
2.0169 42400 0.2135 -
2.0217 42500 0.2153 0.2395
2.0264 42600 0.1568 -
2.0312 42700 0.2178 -
2.0360 42800 0.1757 -
2.0407 42900 0.239 -
2.0455 43000 0.1538 0.2430
2.0502 43100 0.1727 -
2.0550 43200 0.153 -
2.0597 43300 0.1773 -
2.0645 43400 0.1752 -
2.0693 43500 0.1586 0.2416
2.0740 43600 0.2497 -
2.0788 43700 0.217 -
2.0835 43800 0.2227 -
2.0883 43900 0.1811 -
2.0930 44000 0.2125 0.2422
2.0978 44100 0.2005 -
2.1026 44200 0.1776 -
2.1073 44300 0.186 -
2.1121 44400 0.2546 -
2.1168 44500 0.1598 0.2377
2.1216 44600 0.2231 -
2.1263 44700 0.1524 -
2.1311 44800 0.1786 -
2.1359 44900 0.1788 -
2.1406 45000 0.2073 0.2372
2.1454 45100 0.1347 -
2.1501 45200 0.1523 -
2.1549 45300 0.2168 -
2.1596 45400 0.1498 -
2.1644 45500 0.2213 0.2299
2.1692 45600 0.1809 -
2.1739 45700 0.1969 -
2.1787 45800 0.2001 -
2.1834 45900 0.2014 -
2.1882 46000 0.1711 0.2328
2.1929 46100 0.2257 -
2.1977 46200 0.1634 -
2.2025 46300 0.1698 -
2.2072 46400 0.1837 -
2.2120 46500 0.1665 0.2330
2.2167 46600 0.197 -
2.2215 46700 0.1567 -
2.2262 46800 0.1762 -
2.2310 46900 0.1646 -
2.2358 47000 0.2108 0.2322
2.2405 47100 0.2234 -
2.2453 47200 0.2163 -
2.2500 47300 0.188 -
2.2548 47400 0.1846 -
2.2595 47500 0.1794 0.2273
2.2643 47600 0.2637 -
2.2691 47700 0.1596 -
2.2738 47800 0.1676 -
2.2786 47900 0.2099 -
2.2833 48000 0.2002 0.2298
2.2881 48100 0.153 -
2.2928 48200 0.2079 -
2.2976 48300 0.2117 -
2.3023 48400 0.2472 -
2.3071 48500 0.1786 0.2223
2.3119 48600 0.1416 -
2.3166 48700 0.1869 -
2.3214 48800 0.185 -
2.3261 48900 0.1763 -
2.3309 49000 0.1533 0.2245
2.3356 49100 0.1856 -
2.3404 49200 0.2195 -
2.3452 49300 0.1748 -
2.3499 49400 0.1773 -
2.3547 49500 0.1546 0.2228
2.3594 49600 0.1543 -
2.3642 49700 0.208 -
2.3689 49800 0.1735 -
2.3737 49900 0.1463 -
2.3785 50000 0.2065 0.2225
2.3832 50100 0.1651 -
2.3880 50200 0.2091 -
2.3927 50300 0.1427 -
2.3975 50400 0.2033 -
2.4022 50500 0.1541 0.2206
2.4070 50600 0.1508 -
2.4118 50700 0.1693 -
2.4165 50800 0.2133 -
2.4213 50900 0.1709 -
2.4260 51000 0.1339 0.2209
2.4308 51100 0.1961 -
2.4355 51200 0.1569 -
2.4403 51300 0.1595 -
2.4451 51400 0.2285 -
2.4498 51500 0.1765 0.2175
2.4546 51600 0.1913 -
2.4593 51700 0.2017 -
2.4641 51800 0.158 -
2.4688 51900 0.2082 -
2.4736 52000 0.244 0.2116
2.4784 52100 0.1674 -
2.4831 52200 0.192 -
2.4879 52300 0.1793 -
2.4926 52400 0.1776 -
2.4974 52500 0.1644 0.2125
2.5021 52600 0.1668 -
2.5069 52700 0.2223 -
2.5117 52800 0.1969 -
2.5164 52900 0.2236 -
2.5212 53000 0.1869 0.2099
2.5259 53100 0.1664 -
2.5307 53200 0.1799 -
2.5354 53300 0.177 -
2.5402 53400 0.1515 -
2.5450 53500 0.1993 0.2111
2.5497 53600 0.163 -
2.5545 53700 0.1992 -
2.5592 53800 0.1932 -
2.5640 53900 0.1957 -
2.5687 54000 0.1464 0.2107
2.5735 54100 0.1961 -
2.5783 54200 0.2057 -
2.5830 54300 0.1703 -
2.5878 54400 0.1883 -
2.5925 54500 0.2052 0.2103
2.5973 54600 0.1601 -
2.6020 54700 0.1901 -
2.6068 54800 0.162 -
2.6115 54900 0.1765 -
2.6163 55000 0.1397 0.2103
2.6211 55100 0.1881 -
2.6258 55200 0.1562 -
2.6306 55300 0.1752 -
2.6353 55400 0.2074 -
2.6401 55500 0.1504 0.2098
2.6448 55600 0.1816 -
2.6496 55700 0.1811 -
2.6544 55800 0.1881 -
2.6591 55900 0.2019 -
2.6639 56000 0.2076 0.2097
2.6686 56100 0.2108 -
2.6734 56200 0.2011 -
2.6781 56300 0.1642 -
2.6829 56400 0.2325 -
2.6877 56500 0.1844 0.2069
2.6924 56600 0.1617 -
2.6972 56700 0.1693 -
2.7019 56800 0.1617 -
2.7067 56900 0.197 -
2.7114 57000 0.2182 0.2066
2.7162 57100 0.1724 -
2.7210 57200 0.1773 -
2.7257 57300 0.1532 -
2.7305 57400 0.2125 -
2.7352 57500 0.1384 0.2056
2.7400 57600 0.1366 -
2.7447 57700 0.1943 -
2.7495 57800 0.1869 -
2.7543 57900 0.1785 -
2.7590 58000 0.1752 0.2059
2.7638 58100 0.1643 -
2.7685 58200 0.2154 -
2.7733 58300 0.2041 -
2.7780 58400 0.1911 -
2.7828 58500 0.1547 0.2060
2.7876 58600 0.1314 -
2.7923 58700 0.1906 -
2.7971 58800 0.226 -
2.8018 58900 0.1612 -
2.8066 59000 0.1823 0.2045
2.8113 59100 0.1688 -
2.8161 59200 0.1754 -
2.8209 59300 0.1451 -
2.8256 59400 0.1564 -
2.8304 59500 0.2103 0.2042
2.8351 59600 0.1653 -
2.8399 59700 0.1812 -
2.8446 59800 0.1992 -
2.8494 59900 0.1727 -
2.8542 60000 0.1489 0.2036
2.8589 60100 0.2228 -
2.8637 60200 0.1926 -
2.8684 60300 0.2053 -
2.8732 60400 0.1613 -
2.8779 60500 0.1553 0.2027
2.8827 60600 0.1684 -
2.8875 60700 0.1974 -
2.8922 60800 0.1759 -
2.8970 60900 0.1824 -
2.9017 61000 0.1449 0.2020
2.9065 61100 0.1558 -
2.9112 61200 0.1811 -
2.9160 61300 0.2124 -
2.9207 61400 0.1776 -
2.9255 61500 0.1921 0.2009
2.9303 61600 0.2143 -
2.9350 61700 0.2309 -
2.9398 61800 0.1468 -
2.9445 61900 0.134 -
2.9493 62000 0.1477 0.2009
2.9540 62100 0.1731 -
2.9588 62200 0.1427 -
2.9636 62300 0.1554 -
2.9683 62400 0.1566 -
2.9731 62500 0.1616 0.2011
2.9778 62600 0.1648 -
2.9826 62700 0.2204 -
2.9873 62800 0.149 -
2.9921 62900 0.2051 -
2.9969 63000 0.151 0.2008
  • The saved checkpoint corresponds to the lowest validation loss (0.2008 at step 63000), since load_best_model_at_end is enabled.

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.3
  • Transformers: 4.57.6
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}