SentenceTransformer based on BSC-LT/MrBERT-es

This is a sentence-transformers model finetuned from BSC-LT/MrBERT-es. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BSC-LT/MrBERT-es
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
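The Pooling and Normalize modules above can be reproduced by hand: mean pooling averages the token embeddings under the attention mask, and the result is L2-normalized to unit length. A minimal sketch with dummy tensors (the shapes are illustrative; this is not the model's actual code path):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens per sequence
    return summed / counts

# Dummy batch: 2 sequences, 4 token positions, 768-dim token embeddings
tokens = torch.randn(2, 4, 768)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])  # padding positions are 0

pooled = mean_pool(tokens, mask)                                 # (2, 768)
normalized = torch.nn.functional.normalize(pooled, p=2, dim=1)   # unit-length vectors
print(normalized.norm(dim=1))  # both norms are 1.0
```

Because of the final Normalize step, cosine similarity between two output embeddings reduces to a plain dot product.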

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("erickfmm/mrbert-es-sbert-ft")
# Run inference
sentences = [
    'Historia La botánica moderna Significado de la botánica como ciencia Los distintos grupos de vegetales participan de manera fundamental en los ciclos de la biosfera.',
    'El COPINH exige a las autoridades judiciales y fiscales proceder judicialmente contra los alcaldes municipales, altos funcionarios de SERNA, y contra las empresas y demás sectores involucrados en esta agresión contra el pueblo lenca.',
    'Durante la transpiración, el sudor elimina el calor del cuerpo humano por evaporación.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.2126, 0.2099],
#         [0.2126, 1.0000, 0.0278],
#         [0.2099, 0.0278, 1.0000]])
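Since the embeddings are unit-normalized, semantic search over a corpus is a single matrix multiply followed by a top-k selection. A hedged sketch using placeholder embeddings in place of `model.encode(...)` output (the corpus and query here are random stand-ins, not model outputs):

```python
import torch
import torch.nn.functional as F

# Placeholder embeddings standing in for model.encode(...) output
corpus = F.normalize(torch.randn(5, 768), dim=1)  # 5 "documents"
query = F.normalize(torch.randn(1, 768), dim=1)   # 1 "query"

# On unit vectors, cosine similarity == dot product
scores = query @ corpus.T            # (1, 5) similarity scores in [-1, 1]
top = torch.topk(scores, k=3, dim=1) # indices and scores of the best 3 matches
print(top.indices, top.values)
```

For real corpora, replace the random tensors with `model.encode(corpus_sentences)` and `model.encode([query_sentence])`.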

Evaluation

Metrics

Semantic Similarity

Metric           Value
pearson_cosine   0.4611
spearman_cosine  0.2749
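The pearson_cosine and spearman_cosine metrics correlate the model's cosine similarity scores with the gold similarity labels. A minimal sketch of how such scores are computed, using made-up cosine scores and labels (not the actual evaluation data):

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical cosine similarities predicted by a model, and gold labels
cosine_scores = [0.9, 0.1, 0.5, 0.7, 0.3]
gold_labels = [1.0, 0.0, 0.4, 0.8, 0.1]

pearson = pearsonr(cosine_scores, gold_labels)[0]    # linear correlation
spearman = spearmanr(cosine_scores, gold_labels)[0]  # rank correlation
print(round(pearson, 4), round(spearman, 4))
# spearman is exactly 1.0 here because the two rankings agree
```

Spearman is the headline STS metric because it only cares about ranking, not about the scale of the scores.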

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,175,405 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

             sentence_0    sentence_1    label
    type     string        string        float
    min      5 tokens      5 tokens      -0.75
    mean     37.17 tokens  38.26 tokens  0.17
    max      290 tokens    375 tokens    1.0
  • Samples:
      • sentence_0: Los ahorros de la jubilación podrán usarse para este fin.
        sentence_1: Sony Ericsson W8 además de todo eso presenta una pantalla táctil de tipo HVGA de 320 x 480 píxeles y la pantalla posee 16.777.216 colores.
        label: 0.2533760964870453
      • sentence_0: Programas de desarrollo en el cerebelo La transición célula progenitora a neurona madura, implica una serie de cambios morfológicos y moleculares altamente regulada espacial y temporalmente.
        sentence_1: Dos ejemplos en los que el principio de exclusión relaciona la materia con la ocupación del espacio son las estrellas enanas blancas y las estrellas de neutrones, que se analizan más adelante.
        label: 0.1902337223291397
      • sentence_0: Bolsa inmobiliaria online en Distrito Federal df, inmuebles en venta y renta, casas, departamentos, locales, terrenos, inmobiliarias, desarrollos, anunciar inmuebles.
        sentence_1: Otros prefieren hablar de "régimen" o "sistema feudal", para diferenciarlo sutilmente del feudalismo estricto, o de síntesis feudal, para marcar el hecho de que sobreviven en ella rasgos de la antigüedad clásica mezclados con contribuciones germánicas, implicando tanto a instituciones como a elementos productivos, y significó la especificidad del feudalismo europeo occidental como formación económico social frente a otras también feudales, con consecuencias trascendentales en el futuro devenir histórico.
        label: 0.21721388399600983
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
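CosineSimilarityLoss computes the cosine similarity between the two sentence embeddings and applies MSELoss against the float label. A minimal re-implementation on dummy embeddings to show the math (illustrative only; the real class lives in `sentence_transformers.losses` and operates on model outputs):

```python
import torch
import torch.nn.functional as F

def cosine_similarity_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                           labels: torch.Tensor) -> torch.Tensor:
    # Cosine similarity per pair, then mean squared error against the gold label
    cos = F.cosine_similarity(emb_a, emb_b, dim=1)
    return F.mse_loss(cos, labels)

# Dummy pair embeddings and labels roughly in the dataset's range [-0.75, 1.0]
a = torch.randn(4, 768)
b = torch.randn(4, 768)
labels = torch.tensor([0.25, 0.19, 0.22, 0.9])

loss = cosine_similarity_loss(a, b, labels)
print(loss.item())
```

The loss is zero exactly when each pair's cosine similarity matches its label, which is why float-labeled pairs (rather than class labels) are required.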
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • max_grad_norm: 2.0
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
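The non-default values above can be expressed with the Sentence Transformers trainer API. A hedged configuration sketch, assuming a `train_dataset` with sentence_0/sentence_1/label columns (dataset loading omitted; `output_dir` is a hypothetical path):

```python
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("BSC-LT/MrBERT-es")
loss = CosineSimilarityLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="mrbert-es-sbert-ft",            # hypothetical output path
    eval_strategy="steps",
    max_grad_norm=2.0,
    num_train_epochs=10,
    multi_dataset_batch_sampler="round_robin",
    per_device_train_batch_size=8,              # default value, shown for clarity
)

# trainer = SentenceTransformerTrainer(
#     model=model, args=args, train_dataset=train_dataset, loss=loss,
# )
# trainer.train()
```

All remaining hyperparameters in the expanded list below are the Transformers defaults.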

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 2.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Click to expand
Epoch Step Training Loss sts_eval_spearman_cosine
3.9714 583500 0.0253 0.2725
3.9748 584000 0.0274 0.2733
3.9782 584500 0.0279 0.2711
3.9816 585000 0.0248 0.2708
3.9850 585500 0.0264 0.2676
3.9884 586000 0.0267 0.2713
3.9918 586500 0.0276 0.2703
3.9952 587000 0.0273 0.2674
3.9986 587500 0.0278 0.2688
4.0 587704 - 0.2672
4.0020 588000 0.0259 0.2675
4.0054 588500 0.0257 0.2697
4.0088 589000 0.0268 0.2694
4.0122 589500 0.0256 0.2706
4.0156 590000 0.0254 0.2706
4.0190 590500 0.0263 0.2695
4.0224 591000 0.0274 0.2691
4.0258 591500 0.0255 0.2712
4.0292 592000 0.0253 0.2696
4.0326 592500 0.025 0.2692
4.0360 593000 0.0263 0.2679
4.0394 593500 0.028 0.2689
4.0429 594000 0.0275 0.2696
4.0463 594500 0.0268 0.2699
4.0497 595000 0.025 0.2686
4.0531 595500 0.0277 0.2683
4.0565 596000 0.0276 0.2690
4.0599 596500 0.0242 0.2686
4.0633 597000 0.0264 0.2691
4.0667 597500 0.0273 0.2681
4.0701 598000 0.0269 0.2693
4.0735 598500 0.0274 0.2698
4.0769 599000 0.0252 0.2704
4.0803 599500 0.0268 0.2708
4.0837 600000 0.0259 0.2696
4.0871 600500 0.0277 0.2689
4.0905 601000 0.0262 0.2663
4.0939 601500 0.0266 0.2697
4.0973 602000 0.0269 0.2700
4.1007 602500 0.0253 0.2673
4.1041 603000 0.0281 0.2684
4.1075 603500 0.0263 0.2687
4.1109 604000 0.028 0.2677
4.1143 604500 0.0277 0.2701
4.1177 605000 0.0273 0.2686
4.1211 605500 0.0253 0.2681
4.1245 606000 0.0264 0.2694
4.1279 606500 0.0281 0.2706
4.1313 607000 0.0262 0.2714
4.1347 607500 0.0265 0.2673
4.1381 608000 0.0254 0.2685
4.1415 608500 0.0279 0.2674
4.1449 609000 0.0284 0.2692
4.1483 609500 0.0283 0.2680
4.1517 610000 0.0277 0.2673
4.1552 610500 0.0264 0.2692
4.1586 611000 0.0261 0.2687
4.1620 611500 0.0273 0.2697
4.1654 612000 0.027 0.2697
4.1688 612500 0.0274 0.2696
4.1722 613000 0.0273 0.2698
4.1756 613500 0.0255 0.2659
4.1790 614000 0.0274 0.2660
4.1824 614500 0.0284 0.2666
4.1858 615000 0.0268 0.2680
4.1892 615500 0.0278 0.2674
4.1926 616000 0.0276 0.2684
4.1960 616500 0.026 0.2700
4.1994 617000 0.0266 0.2686
4.2028 617500 0.0266 0.2680
4.2062 618000 0.0277 0.2678
4.2096 618500 0.0291 0.2649
4.2130 619000 0.0281 0.2635
4.2164 619500 0.0291 0.2659
4.2198 620000 0.0281 0.2672
4.2232 620500 0.0282 0.2655
4.2266 621000 0.0287 0.2648
4.2300 621500 0.0285 0.2640
4.2334 622000 0.0282 0.2645
4.2368 622500 0.027 0.2674
4.2402 623000 0.0268 0.2669
4.2436 623500 0.0291 0.2663
4.2470 624000 0.0291 0.2645
4.2504 624500 0.0277 0.2677
4.2538 625000 0.0273 0.2631
4.2572 625500 0.0265 0.2653
4.2606 626000 0.0276 0.2665
4.2641 626500 0.027 0.2654
4.2675 627000 0.0271 0.2659
4.2709 627500 0.0279 0.2659
4.2743 628000 0.0274 0.2648
4.2777 628500 0.0263 0.2659
4.2811 629000 0.0279 0.2665
4.2845 629500 0.028 0.2677
4.2879 630000 0.0299 0.2701
4.2913 630500 0.0284 0.2688
4.2947 631000 0.0269 0.2683
4.2981 631500 0.0271 0.2689
4.3015 632000 0.0288 0.2680
4.3049 632500 0.0274 0.2674
4.3083 633000 0.0277 0.2675
4.3117 633500 0.0282 0.2671
4.3151 634000 0.0266 0.2658
4.3185 634500 0.0284 0.2648
4.3219 635000 0.0283 0.2637
4.3253 635500 0.0283 0.2647
4.3287 636000 0.0281 0.2641
4.3321 636500 0.0275 0.2620
4.3355 637000 0.0272 0.2630
4.3389 637500 0.0282 0.2642
4.3423 638000 0.0294 0.2664
4.3457 638500 0.0283 0.2639
4.3491 639000 0.0262 0.2663
4.3525 639500 0.0275 0.2671
4.3559 640000 0.0298 0.2669
4.3593 640500 0.0292 0.2693
4.3627 641000 0.0283 0.2673
4.3661 641500 0.027 0.2687
4.3695 642000 0.0278 0.2663
4.3729 642500 0.0301 0.2652
4.3764 643000 0.0275 0.2676
4.3798 643500 0.0292 0.2680
4.3832 644000 0.0266 0.2680
4.3866 644500 0.0283 0.2668
4.3900 645000 0.0303 0.2677
4.3934 645500 0.0299 0.2701
4.3968 646000 0.0284 0.2680
4.4002 646500 0.0272 0.2664
4.4036 647000 0.0297 0.2662
4.4070 647500 0.029 0.2661
4.4104 648000 0.0281 0.2678
4.4138 648500 0.0282 0.2683
4.4172 649000 0.0278 0.2699
4.4206 649500 0.0309 0.2684
4.4240 650000 0.0288 0.2693
4.4274 650500 0.0307 0.2697
4.4308 651000 0.0272 0.2722
4.4342 651500 0.0289 0.2726
4.4376 652000 0.0288 0.2716
4.4410 652500 0.0289 0.2729
4.4444 653000 0.0297 0.2699
4.4478 653500 0.0286 0.2724
4.4512 654000 0.0298 0.2702
4.4546 654500 0.0302 0.2738
4.4580 655000 0.0292 0.2713
4.4614 655500 0.0297 0.2712
4.4648 656000 0.0286 0.2705
4.4682 656500 0.0285 0.2735
4.4716 657000 0.0294 0.2733
4.4750 657500 0.0291 0.2722
4.4784 658000 0.0283 0.2708
4.4818 658500 0.028 0.2714
4.4853 659000 0.0298 0.2716
4.4887 659500 0.0275 0.2721
4.4921 660000 0.0314 0.2731
4.4955 660500 0.0292 0.2730
4.4989 661000 0.029 0.2749

Framework Versions

  • Python: 3.9.25
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.6
  • PyTorch: 2.6.0+cu118
  • Accelerate: 1.10.1
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model Information

  • Model: erickfmm/mrbert-es-sbert-ft
  • Base model: BSC-LT/MrBERT-es
  • Model size: ~0.1B parameters
  • Tensor type: F32 (Safetensors)