SentenceTransformer based on google-t5/t5-base

This is a sentence-transformers model finetuned from google-t5/t5-base on the all-nli dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google-t5/t5-base
  • Maximum Sequence Length: not set (`max_seq_length: None`; the base tokenizer's default limit applies)
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': None, 'do_lower_case': False, 'architecture': 'T5EncoderModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
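
The Pooling module above uses mean pooling (`pooling_mode_mean_tokens: True`): token embeddings from the T5 encoder are averaged, with padding positions excluded via the attention mask. A minimal NumPy sketch of that operation, using made-up token embeddings (the function name and toy data are illustrative, not from the library):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over the sequence, ignoring padding.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid div by zero
    return summed / counts

# Toy batch: two real tokens and one padding token that must be ignored.
tokens = np.array([[[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(tokens, mask))  # [[2. 4.]]
```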

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/t5-base-mrl-768-512-256-128-64")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Semantic Similarity

Metric            sts-dev   sts-test
pearson_cosine    0.837     0.8228
spearman_cosine   0.8415    0.8371
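
The Spearman scores above are rank correlations between the model's cosine similarities and human similarity judgments on STS sentence pairs. A self-contained sketch of that computation (toy scores in place of real model outputs; no tie handling):

```python
import numpy as np

def ranks(x: np.ndarray) -> np.ndarray:
    """Rank of each value (0 = smallest); assumes no ties."""
    order = np.argsort(x)
    r = np.empty_like(order)
    r[order] = np.arange(len(x))
    return r

def spearman(a: np.ndarray, b: np.ndarray) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    ra, rb = ranks(a).astype(float), ranks(b).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

# Toy example: model cosine scores vs. human gold scores for 5 pairs.
model_scores = np.array([0.91, 0.10, 0.55, 0.73, 0.30])
gold_scores  = np.array([4.8,  0.5,  2.9,  3.6,  1.4])
print(spearman(model_scores, gold_scores))  # 1.0 (identical ordering)
```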

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    Column    Type    Min       Mean          Max
    anchor    string  6 tokens  9.96 tokens   52 tokens
    positive  string  5 tokens  12.79 tokens  44 tokens
    negative  string  4 tokens  14.02 tokens  57 tokens
  • Samples:
    • anchor: "A person on a horse jumps over a broken down airplane."
      positive: "A person is outdoors, on a horse."
      negative: "A person is at a diner, ordering an omelette."
    • anchor: "Children smiling and waving at camera"
      positive: "There are children present"
      negative: "The kids are frowning"
    • anchor: "A boy is jumping on skateboard in the middle of a red bridge."
      positive: "The boy does a skateboarding trick."
      negative: "The boy skates down the sidewalk."
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
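
MatryoshkaLoss wraps MultipleNegativesRankingLoss (in-batch-negatives cross-entropy) and applies it at each truncated dimensionality, summing the weighted per-dimension losses. A rough NumPy sketch of that computation, with random stand-in embeddings and an assumed similarity scale of 20 (the library default for this loss); function names are illustrative, and the real implementation is PyTorch-based in `sentence_transformers.losses`:

```python
import numpy as np

def normalize(x):
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnrl(anchors, positives, scale=20.0):
    """Multiple negatives ranking loss: each anchor's positive is the
    matching row; all other in-batch positives serve as negatives."""
    scores = scale * (normalize(anchors) @ normalize(positives).T)  # (batch, batch)
    # Cross-entropy with the correct labels on the diagonal.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def matryoshka_loss(anchors, positives, dims=(768, 512, 256, 128, 64), weights=None):
    weights = weights or [1.0] * len(dims)
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for w, d in zip(weights, dims))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
noisy_positives = anchors + 0.1 * rng.normal(size=(8, 768))
loss = matryoshka_loss(anchors, noisy_positives)
print(loss)  # small: each anchor is closest to its own (noisy) positive
```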
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    Column    Type    Min       Mean          Max
    anchor    string  5 tokens  19.41 tokens  79 tokens
    positive  string  4 tokens  9.69 tokens   35 tokens
    negative  string  4 tokens  10.35 tokens  30 tokens
  • Samples:
    • anchor: "Two women are embracing while holding to go packages."
      positive: "Two woman are holding packages."
      negative: "The men are fighting outside a deli."
    • anchor: "Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink."
      positive: "Two kids in numbered jerseys wash their hands."
      negative: "Two kids in jackets walk to school."
    • anchor: "A man selling donuts to a customer during a world exhibition event held in the city of Angeles"
      positive: "A man selling donuts to a customer."
      negative: "A woman drinks her coffee in a small cafe."
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1
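
With `warmup_ratio: 0.1`, the warmup length is derived from the total number of optimizer steps. Given 557,850 training samples, batch size 32, and 15 epochs, a quick back-of-the-envelope check (assuming a single device and the Trainer's usual ceil rounding; the exact rounding may differ by a step):

```python
import math

num_samples = 557_850   # all-nli training split
batch_size = 32         # per_device_train_batch_size (single device assumed)
epochs = 15
warmup_ratio = 0.1

steps_per_epoch = math.ceil(num_samples / batch_size)   # dataloader_drop_last: False
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)  # 17433 261495 26150
```

This is consistent with the training log below, whose last recorded step is 261,000 at epoch ≈ 14.97.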

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.6510 -
0.0287 500 13.7054 7.7586 0.6724 -
0.0574 1000 11.7979 6.2385 0.7213 -
0.0860 1500 9.3927 4.9358 0.7593 -
0.1147 2000 7.613 4.3100 0.7712 -
0.1434 2500 6.557 3.9318 0.7783 -
0.1721 3000 6.0763 3.6271 0.7824 -
0.2008 3500 5.6509 3.3563 0.7849 -
0.2294 4000 5.2428 3.1218 0.7884 -
0.2581 4500 5.0149 2.9553 0.7919 -
0.2868 5000 4.7411 2.8019 0.7949 -
0.3155 5500 4.5017 2.6706 0.7983 -
0.3442 6000 4.288 2.5487 0.8033 -
0.3729 6500 4.2023 2.4345 0.8058 -
0.4015 7000 4.0595 2.3365 0.8075 -
0.4302 7500 3.8702 2.2540 0.8106 -
0.4589 8000 3.6251 2.1699 0.8127 -
0.4876 8500 3.7266 2.1117 0.8157 -
0.5163 9000 3.6235 2.0486 0.8161 -
0.5449 9500 3.4797 1.9871 0.8172 -
0.5736 10000 3.3095 1.9380 0.8197 -
0.6023 10500 3.3159 1.8787 0.8215 -
0.6310 11000 3.1906 1.8673 0.8226 -
0.6597 11500 3.1465 1.8114 0.8241 -
0.6883 12000 3.0597 1.7870 0.8278 -
0.7170 12500 3.027 1.7440 0.8246 -
0.7457 13000 2.9682 1.7153 0.8278 -
0.7744 13500 2.9704 1.6848 0.8234 -
0.8031 14000 2.8845 1.6477 0.8264 -
0.8318 14500 2.8022 1.6315 0.8279 -
0.8604 15000 2.7936 1.6118 0.8299 -
0.8891 15500 2.7953 1.5743 0.8323 -
0.9178 16000 2.7204 1.5756 0.8333 -
0.9465 16500 2.7123 1.5407 0.8316 -
0.9752 17000 2.5954 1.5227 0.8324 -
1.0038 17500 2.5479 1.4977 0.8344 -
1.0325 18000 2.4477 1.5045 0.8341 -
1.0612 18500 2.4471 1.4989 0.8332 -
1.0899 19000 2.4057 1.4664 0.8318 -
1.1186 19500 2.3712 1.4356 0.8313 -
1.1472 20000 2.3994 1.4368 0.8359 -
1.1759 20500 2.3075 1.4233 0.8341 -
1.2046 21000 2.3595 1.4223 0.8329 -
1.2333 21500 2.3321 1.4089 0.8353 -
1.2620 22000 2.2596 1.4086 0.8360 -
1.2907 22500 2.2511 1.4045 0.8392 -
1.3193 23000 2.2308 1.3734 0.8356 -
1.3480 23500 2.2461 1.3764 0.8364 -
1.3767 24000 2.2123 1.3812 0.8361 -
1.4054 24500 2.1176 1.3674 0.8414 -
1.4341 25000 2.1233 1.3669 0.8392 -
1.4627 25500 2.1555 1.3509 0.8378 -
1.4914 26000 2.0954 1.3471 0.8387 -
1.5201 26500 2.0441 1.3314 0.8379 -
1.5488 27000 1.9816 1.3388 0.8417 -
1.5775 27500 2.0938 1.2986 0.8410 -
1.6061 28000 2.0493 1.3159 0.8410 -
1.6348 28500 2.0017 1.3051 0.8387 -
1.6635 29000 1.9539 1.3239 0.8404 -
1.6922 29500 1.9492 1.2953 0.8423 -
1.7209 30000 2.0114 1.3136 0.8428 -
1.7496 30500 1.9353 1.2928 0.8435 -
1.7782 31000 1.9298 1.2709 0.8430 -
1.8069 31500 1.97 1.2791 0.8443 -
1.8356 32000 1.8782 1.2811 0.8424 -
1.8643 32500 1.8681 1.2436 0.8457 -
1.8930 33000 1.877 1.2562 0.8449 -
1.9216 33500 1.779 1.2503 0.8438 -
1.9503 34000 1.8633 1.2542 0.8453 -
1.9790 34500 1.8178 1.2455 0.8436 -
2.0077 35000 1.7697 1.2653 0.8446 -
2.0364 35500 1.6046 1.2587 0.8444 -
2.0650 36000 1.6431 1.2484 0.8480 -
2.0937 36500 1.6234 1.2426 0.8461 -
2.1224 37000 1.6287 1.2429 0.8482 -
2.1511 37500 1.6215 1.2468 0.8408 -
2.1798 38000 1.6565 1.2430 0.8429 -
2.2085 38500 1.5941 1.2227 0.8435 -
2.2371 39000 1.5237 1.2610 0.8412 -
2.2658 39500 1.6119 1.2515 0.8458 -
2.2945 40000 1.5083 1.2476 0.8413 -
2.3232 40500 1.5637 1.2311 0.8463 -
2.3519 41000 1.542 1.2318 0.8457 -
2.3805 41500 1.5675 1.2317 0.8444 -
2.4092 42000 1.5088 1.2334 0.8417 -
2.4379 42500 1.5145 1.2223 0.8430 -
2.4666 43000 1.5462 1.2253 0.8451 -
2.4953 43500 1.5305 1.2075 0.8440 -
2.5239 44000 1.4976 1.2181 0.8432 -
2.5526 44500 1.5038 1.2231 0.8463 -
2.5813 45000 1.531 1.2001 0.8486 -
2.6100 45500 1.4704 1.2016 0.8471 -
2.6387 46000 1.4932 1.1985 0.8463 -
2.6674 46500 1.4448 1.1928 0.8507 -
2.6960 47000 1.4798 1.1888 0.8471 -
2.7247 47500 1.476 1.1941 0.8449 -
2.7534 48000 1.4435 1.2013 0.8453 -
2.7821 48500 1.4087 1.2048 0.8443 -
2.8108 49000 1.4194 1.2043 0.8425 -
2.8394 49500 1.3939 1.1936 0.8440 -
2.8681 50000 1.4054 1.1815 0.8447 -
2.8968 50500 1.4397 1.1854 0.8435 -
2.9255 51000 1.4404 1.1609 0.8463 -
2.9542 51500 1.4051 1.1713 0.8465 -
2.9828 52000 1.3619 1.1693 0.8444 -
3.0115 52500 1.3274 1.1721 0.8461 -
3.0402 53000 1.2563 1.1841 0.8453 -
3.0689 53500 1.2658 1.1925 0.8459 -
3.0976 54000 1.2891 1.1791 0.8458 -
3.1263 54500 1.2996 1.1732 0.8424 -
3.1549 55000 1.2108 1.1987 0.8455 -
3.1836 55500 1.2212 1.2029 0.8490 -
3.2123 56000 1.2053 1.1822 0.8476 -
3.2410 56500 1.2452 1.1831 0.8472 -
3.2697 57000 1.2097 1.1640 0.8477 -
3.2983 57500 1.1934 1.1786 0.8509 -
3.3270 58000 1.2028 1.1872 0.8474 -
3.3557 58500 1.223 1.1894 0.8518 -
3.3844 59000 1.1942 1.1870 0.8513 -
3.4131 59500 1.2141 1.1943 0.8486 -
3.4417 60000 1.1649 1.1669 0.8469 -
3.4704 60500 1.18 1.1944 0.8475 -
3.4991 61000 1.1872 1.1915 0.8473 -
3.5278 61500 1.1397 1.1998 0.8462 -
3.5565 62000 1.1865 1.1862 0.8466 -
3.5852 62500 1.2027 1.1875 0.8460 -
3.6138 63000 1.1526 1.1791 0.8486 -
3.6425 63500 1.251 1.1825 0.8500 -
3.6712 64000 1.1684 1.1869 0.8463 -
3.6999 64500 1.1748 1.1939 0.8475 -
3.7286 65000 1.1585 1.1884 0.8470 -
3.7572 65500 1.1963 1.1739 0.8484 -
3.7859 66000 1.1937 1.1913 0.8467 -
3.8146 66500 1.166 1.1769 0.8475 -
3.8433 67000 1.1778 1.2029 0.8477 -
3.8720 67500 1.1213 1.1913 0.8493 -
3.9006 68000 1.1564 1.1943 0.8503 -
3.9293 68500 1.1333 1.1959 0.8457 -
3.9580 69000 1.1264 1.1895 0.8460 -
3.9867 69500 1.1383 1.1843 0.8476 -
4.0154 70000 1.0941 1.2052 0.8499 -
4.0441 70500 1.0238 1.1999 0.8479 -
4.0727 71000 0.9976 1.1865 0.8466 -
4.1014 71500 0.9958 1.2058 0.8466 -
4.1301 72000 1.0001 1.2044 0.8437 -
4.1588 72500 1.0617 1.2127 0.8425 -
4.1875 73000 1.0097 1.2046 0.8444 -
4.2161 73500 1.009 1.1841 0.8481 -
4.2448 74000 0.9992 1.2006 0.8466 -
4.2735 74500 0.9998 1.1929 0.8486 -
4.3022 75000 0.9653 1.1914 0.8480 -
4.3309 75500 0.9979 1.2030 0.8460 -
4.3595 76000 1.0385 1.2145 0.8462 -
4.3882 76500 1.0029 1.2264 0.8457 -
4.4169 77000 1.0097 1.2083 0.8458 -
4.4456 77500 0.9866 1.2104 0.8443 -
4.4743 78000 0.9902 1.1933 0.8476 -
4.5030 78500 1.0187 1.2064 0.8476 -
4.5316 79000 0.9883 1.2011 0.8457 -
4.5603 79500 0.9713 1.2051 0.8465 -
4.5890 80000 0.9886 1.2130 0.8464 -
4.6177 80500 1.0087 1.1961 0.8478 -
4.6464 81000 1.0224 1.1936 0.8485 -
4.6750 81500 0.9941 1.1944 0.8455 -
4.7037 82000 1.0294 1.1818 0.8467 -
4.7324 82500 0.9702 1.2070 0.8453 -
4.7611 83000 0.9772 1.1966 0.8440 -
4.7898 83500 0.9938 1.2063 0.8446 -
4.8184 84000 0.9935 1.1937 0.8472 -
4.8471 84500 1.0016 1.1847 0.8469 -
4.8758 85000 0.9872 1.1874 0.8464 -
4.9045 85500 0.9903 1.1877 0.8454 -
4.9332 86000 0.9963 1.1873 0.8415 -
4.9619 86500 0.9641 1.1748 0.8469 -
4.9905 87000 0.9721 1.1756 0.8479 -
5.0192 87500 0.9047 1.1986 0.8466 -
5.0479 88000 0.8849 1.2033 0.8469 -
5.0766 88500 0.8513 1.2142 0.8458 -
5.1053 89000 0.8656 1.1917 0.8438 -
5.1339 89500 0.8811 1.1839 0.8455 -
5.1626 90000 0.8834 1.1988 0.8416 -
5.1913 90500 0.9094 1.2037 0.8450 -
5.2200 91000 0.881 1.2078 0.8413 -
5.2487 91500 0.8506 1.2018 0.8395 -
5.2773 92000 0.9125 1.1737 0.8417 -
5.3060 92500 0.8409 1.1786 0.8429 -
5.3347 93000 0.89 1.1965 0.8436 -
5.3634 93500 0.8807 1.1873 0.8444 -
5.3921 94000 0.9001 1.1792 0.8477 -
5.4208 94500 0.8712 1.1922 0.8460 -
5.4494 95000 0.868 1.1855 0.8465 -
5.4781 95500 0.8599 1.1953 0.8426 -
5.5068 96000 0.8659 1.1800 0.8467 -
5.5355 96500 0.901 1.1805 0.8443 -
5.5642 97000 0.8289 1.1717 0.8449 -
5.5928 97500 0.8626 1.1812 0.8440 -
5.6215 98000 0.8387 1.1620 0.8472 -
5.6502 98500 0.8557 1.1752 0.8457 -
5.6789 99000 0.8474 1.1773 0.8434 -
5.7076 99500 0.8618 1.1762 0.8454 -
5.7362 100000 0.8882 1.1827 0.8441 -
5.7649 100500 0.822 1.1829 0.8459 -
5.7936 101000 0.8446 1.1897 0.8453 -
5.8223 101500 0.853 1.1965 0.8440 -
5.8510 102000 0.8437 1.1774 0.8440 -
5.8797 102500 0.8327 1.1791 0.8430 -
5.9083 103000 0.8639 1.1882 0.8425 -
5.9370 103500 0.8366 1.1775 0.8444 -
5.9657 104000 0.871 1.1592 0.8453 -
5.9944 104500 0.865 1.1832 0.8445 -
6.0231 105000 0.8127 1.1995 0.8452 -
6.0517 105500 0.7883 1.1890 0.8452 -
6.0804 106000 0.7288 1.1876 0.8456 -
6.1091 106500 0.7613 1.1993 0.8436 -
6.1378 107000 0.7386 1.2039 0.8465 -
6.1665 107500 0.7423 1.2102 0.8446 -
6.1951 108000 0.7775 1.2091 0.8444 -
6.2238 108500 0.7993 1.1935 0.8462 -
6.2525 109000 0.7818 1.1721 0.8480 -
6.2812 109500 0.8045 1.1804 0.8486 -
6.3099 110000 0.7896 1.1871 0.8478 -
6.3386 110500 0.7857 1.1764 0.8484 -
6.3672 111000 0.7231 1.1982 0.8470 -
6.3959 111500 0.7235 1.2115 0.8449 -
6.4246 112000 0.7556 1.1955 0.8462 -
6.4533 112500 0.7522 1.2030 0.8462 -
6.4820 113000 0.7573 1.1882 0.8470 -
6.5106 113500 0.7465 1.1933 0.8455 -
6.5393 114000 0.7676 1.1824 0.8458 -
6.5680 114500 0.7506 1.1958 0.8449 -
6.5967 115000 0.7884 1.2120 0.8446 -
6.6254 115500 0.7689 1.1832 0.8468 -
6.6540 116000 0.7725 1.1856 0.8470 -
6.6827 116500 0.7775 1.2023 0.8440 -
6.7114 117000 0.7218 1.1993 0.8447 -
6.7401 117500 0.7525 1.1952 0.8456 -
6.7688 118000 0.7714 1.1883 0.8461 -
6.7975 118500 0.7619 1.1996 0.8485 -
6.8261 119000 0.7719 1.2016 0.8458 -
6.8548 119500 0.7259 1.2038 0.8448 -
6.8835 120000 0.7315 1.2028 0.8455 -
6.9122 120500 0.7397 1.1850 0.8468 -
6.9409 121000 0.7862 1.1997 0.8446 -
6.9695 121500 0.7506 1.1941 0.8459 -
6.9982 122000 0.7272 1.1990 0.8477 -
7.0269 122500 0.7274 1.2014 0.8446 -
7.0556 123000 0.7318 1.2011 0.8450 -
7.0843 123500 0.6754 1.2200 0.8439 -
7.1129 124000 0.6767 1.2114 0.8444 -
7.1416 124500 0.7088 1.2152 0.8447 -
7.1703 125000 0.6977 1.2049 0.8442 -
7.1990 125500 0.6797 1.2232 0.8420 -
7.2277 126000 0.7212 1.2179 0.8436 -
7.2564 126500 0.6684 1.2094 0.8431 -
7.2850 127000 0.6796 1.2123 0.8419 -
7.3137 127500 0.685 1.2078 0.8424 -
7.3424 128000 0.6764 1.2067 0.8431 -
7.3711 128500 0.7198 1.2105 0.8424 -
7.3998 129000 0.6858 1.2232 0.8437 -
7.4284 129500 0.6916 1.2149 0.8458 -
7.4571 130000 0.7041 1.2160 0.8423 -
7.4858 130500 0.6931 1.2026 0.8425 -
7.5145 131000 0.6972 1.2106 0.8471 -
7.5432 131500 0.6767 1.2000 0.8473 -
7.5718 132000 0.6716 1.2099 0.8448 -
7.6005 132500 0.6833 1.2123 0.8437 -
7.6292 133000 0.6985 1.1939 0.8447 -
7.6579 133500 0.6839 1.2256 0.8435 -
7.6866 134000 0.7139 1.2038 0.8444 -
7.7153 134500 0.6885 1.2148 0.8452 -
7.7439 135000 0.6889 1.1966 0.8448 -
7.7726 135500 0.687 1.2092 0.8437 -
7.8013 136000 0.686 1.2168 0.8419 -
7.8300 136500 0.6783 1.2063 0.8443 -
7.8587 137000 0.6595 1.2156 0.8428 -
7.8873 137500 0.6661 1.2139 0.8431 -
7.9160 138000 0.6771 1.2171 0.8464 -
7.9447 138500 0.6628 1.2014 0.8448 -
7.9734 139000 0.6741 1.2073 0.8459 -
8.0021 139500 0.6697 1.2024 0.8425 -
8.0307 140000 0.605 1.2185 0.8447 -
8.0594 140500 0.636 1.2103 0.8428 -
8.0881 141000 0.6362 1.2035 0.8443 -
8.1168 141500 0.651 1.2027 0.8447 -
8.1455 142000 0.6309 1.2105 0.8447 -
8.1742 142500 0.6158 1.2106 0.8449 -
8.2028 143000 0.6271 1.2248 0.8449 -
8.2315 143500 0.6515 1.2267 0.8442 -
8.2602 144000 0.6574 1.2308 0.8439 -
8.2889 144500 0.6476 1.2315 0.8434 -
8.3176 145000 0.6342 1.2129 0.8441 -
8.3462 145500 0.6062 1.2196 0.8426 -
8.3749 146000 0.6126 1.2069 0.8432 -
8.4036 146500 0.6119 1.2061 0.8417 -
8.4323 147000 0.6237 1.2002 0.8433 -
8.4610 147500 0.6438 1.1901 0.8439 -
8.4896 148000 0.6647 1.1996 0.8449 -
8.5183 148500 0.6446 1.2100 0.8437 -
8.5470 149000 0.6431 1.1996 0.8444 -
8.5757 149500 0.6369 1.1980 0.8462 -
8.6044 150000 0.6087 1.2008 0.8473 -
8.6331 150500 0.6462 1.1958 0.8458 -
8.6617 151000 0.6356 1.2026 0.8444 -
8.6904 151500 0.6266 1.1928 0.8466 -
8.7191 152000 0.6197 1.2158 0.8454 -
8.7478 152500 0.624 1.1941 0.8461 -
8.7765 153000 0.5935 1.1846 0.8462 -
8.8051 153500 0.611 1.1844 0.8449 -
8.8338 154000 0.633 1.1973 0.8448 -
8.8625 154500 0.6102 1.2033 0.8456 -
8.8912 155000 0.6141 1.1977 0.8466 -
8.9199 155500 0.6321 1.1983 0.8468 -
8.9485 156000 0.6417 1.2106 0.8473 -
8.9772 156500 0.633 1.2050 0.8478 -
9.0059 157000 0.6146 1.2139 0.8471 -
9.0346 157500 0.5796 1.2246 0.8460 -
9.0633 158000 0.5789 1.2143 0.8481 -
9.0920 158500 0.5473 1.2287 0.8447 -
9.1206 159000 0.5577 1.2086 0.8464 -
9.1493 159500 0.584 1.1974 0.8473 -
9.1780 160000 0.5855 1.2095 0.8462 -
9.2067 160500 0.6077 1.2157 0.8448 -
9.2354 161000 0.5768 1.2164 0.8445 -
9.2640 161500 0.5727 1.2092 0.8471 -
9.2927 162000 0.5764 1.2107 0.8454 -
9.3214 162500 0.5563 1.2206 0.8459 -
9.3501 163000 0.5647 1.2141 0.8457 -
9.3788 163500 0.5838 1.2184 0.8452 -
9.4074 164000 0.5802 1.2129 0.8460 -
9.4361 164500 0.5731 1.2018 0.8468 -
9.4648 165000 0.5648 1.2125 0.8470 -
9.4935 165500 0.596 1.2174 0.8460 -
9.5222 166000 0.5734 1.2217 0.8444 -
9.5509 166500 0.5799 1.2303 0.8461 -
9.5795 167000 0.5705 1.2136 0.8462 -
9.6082 167500 0.5778 1.2268 0.8440 -
9.6369 168000 0.5804 1.2191 0.8466 -
9.6656 168500 0.551 1.2403 0.8454 -
9.6943 169000 0.5605 1.2146 0.8455 -
9.7229 169500 0.5887 1.2313 0.8427 -
9.7516 170000 0.5599 1.2298 0.8428 -
9.7803 170500 0.6183 1.2214 0.8448 -
9.8090 171000 0.5849 1.2299 0.8429 -
9.8377 171500 0.5769 1.2180 0.8452 -
9.8663 172000 0.588 1.2172 0.8443 -
9.8950 172500 0.5806 1.2187 0.8458 -
9.9237 173000 0.5743 1.2181 0.8452 -
9.9524 173500 0.5799 1.2217 0.8442 -
9.9811 174000 0.5794 1.2267 0.8439 -
10.0098 174500 0.563 1.2144 0.8451 -
10.0384 175000 0.5374 1.2289 0.8456 -
10.0671 175500 0.5342 1.2313 0.8456 -
10.0958 176000 0.5476 1.2259 0.8458 -
10.1245 176500 0.5493 1.2312 0.8441 -
10.1532 177000 0.5594 1.2215 0.8454 -
10.1818 177500 0.529 1.2212 0.8444 -
10.2105 178000 0.5522 1.2428 0.8439 -
10.2392 178500 0.5258 1.2259 0.8441 -
10.2679 179000 0.5485 1.2457 0.8429 -
10.2966 179500 0.5422 1.2239 0.8448 -
10.3252 180000 0.5489 1.2220 0.8441 -
10.3539 180500 0.5648 1.2257 0.8440 -
10.3826 181000 0.548 1.2227 0.8453 -
10.4113 181500 0.5448 1.2222 0.8453 -
10.4400 182000 0.5702 1.2268 0.8441 -
10.4687 182500 0.5389 1.2345 0.8447 -
10.4973 183000 0.5197 1.2455 0.8433 -
10.5260 183500 0.5394 1.2252 0.8447 -
10.5547 184000 0.5457 1.2319 0.8435 -
10.5834 184500 0.5276 1.2284 0.8442 -
10.6121 185000 0.5505 1.2186 0.8441 -
10.6407 185500 0.5633 1.2376 0.8435 -
10.6694 186000 0.5206 1.2274 0.8435 -
10.6981 186500 0.5416 1.2286 0.8427 -
10.7268 187000 0.5348 1.2263 0.8442 -
10.7555 187500 0.5155 1.2410 0.8433 -
10.7841 188000 0.5667 1.2273 0.8419 -
10.8128 188500 0.5106 1.2316 0.8419 -
10.8415 189000 0.5758 1.2288 0.8424 -
10.8702 189500 0.5472 1.2311 0.8411 -
10.8989 190000 0.5218 1.2360 0.8424 -
10.9276 190500 0.5455 1.2326 0.8424 -
10.9562 191000 0.5695 1.2229 0.8420 -
10.9849 191500 0.5676 1.2150 0.8420 -
11.0136 192000 0.5273 1.2110 0.8427 -
11.0423 192500 0.5056 1.2262 0.8419 -
11.0710 193000 0.5169 1.2328 0.8417 -
11.0996 193500 0.4961 1.2332 0.8414 -
11.1283 194000 0.5023 1.2248 0.8423 -
11.1570 194500 0.4722 1.2368 0.8420 -
11.1857 195000 0.5292 1.2343 0.8414 -
11.2144 195500 0.5204 1.2507 0.8408 -
11.2430 196000 0.5088 1.2359 0.8411 -
11.2717 196500 0.5126 1.2394 0.8409 -
11.3004 197000 0.5177 1.2308 0.8418 -
11.3291 197500 0.5136 1.2412 0.8399 -
11.3578 198000 0.5361 1.2284 0.8412 -
11.3865 198500 0.5146 1.2319 0.8411 -
11.4151 199000 0.5425 1.2229 0.8425 -
11.4438 199500 0.5187 1.2198 0.8437 -
11.4725 200000 0.5153 1.2227 0.8421 -
11.5012 200500 0.5193 1.2299 0.8416 -
11.5299 201000 0.5176 1.2376 0.8403 -
11.5585 201500 0.5416 1.2216 0.8410 -
11.5872 202000 0.5325 1.2294 0.8399 -
11.6159 202500 0.5169 1.2298 0.8415 -
11.6446 203000 0.4998 1.2369 0.8411 -
11.6733 203500 0.5112 1.2345 0.8410 -
11.7019 204000 0.5047 1.2262 0.8415 -
11.7306 204500 0.5253 1.2218 0.8420 -
11.7593 205000 0.5236 1.2192 0.8410 -
11.7880 205500 0.5 1.2148 0.8414 -
11.8167 206000 0.4913 1.2238 0.8402 -
11.8454 206500 0.5343 1.2173 0.8408 -
11.8740 207000 0.5094 1.2316 0.8398 -
11.9027 207500 0.4865 1.2267 0.8394 -
11.9314 208000 0.5239 1.2145 0.8407 -
11.9601 208500 0.5157 1.2215 0.8411 -
11.9888 209000 0.4732 1.2233 0.8409 -
12.0174 209500 0.4794 1.2155 0.8417 -
12.0461 210000 0.4855 1.2268 0.8407 -
12.0748 210500 0.4831 1.2297 0.8404 -
12.1035 211000 0.4982 1.2319 0.8406 -
12.1322 211500 0.4749 1.2326 0.8406 -
12.1608 212000 0.4656 1.2318 0.8399 -
12.1895 212500 0.4996 1.2331 0.8407 -
12.2182 213000 0.4796 1.2302 0.8413 -
12.2469 213500 0.4717 1.2317 0.8406 -
12.2756 214000 0.5028 1.2264 0.8407 -
12.3043 214500 0.4666 1.2318 0.8404 -
12.3329 215000 0.4828 1.2168 0.8412 -
12.3616 215500 0.5181 1.2186 0.8414 -
12.3903 216000 0.4686 1.2284 0.8402 -
12.4190 216500 0.5063 1.2253 0.8406 -
12.4477 217000 0.496 1.2342 0.8398 -
12.4763 217500 0.4975 1.2233 0.8416 -
12.5050 218000 0.5064 1.2148 0.8421 -
12.5337 218500 0.5004 1.2198 0.8424 -
12.5624 219000 0.4696 1.2321 0.8407 -
12.5911 219500 0.477 1.2273 0.8413 -
12.6197 220000 0.5117 1.2308 0.8411 -
12.6484 220500 0.4656 1.2417 0.8402 -
12.6771 221000 0.5057 1.2320 0.8407 -
12.7058 221500 0.4981 1.2357 0.8410 -
12.7345 222000 0.4893 1.2335 0.8412 -
12.7632 222500 0.4716 1.2356 0.8404 -
12.7918 223000 0.5005 1.2284 0.8403 -
12.8205 223500 0.4746 1.2201 0.8412 -
12.8492 224000 0.4787 1.2265 0.8409 -
12.8779 224500 0.4631 1.2308 0.8405 -
12.9066 225000 0.4914 1.2304 0.8405 -
12.9352 225500 0.4973 1.2342 0.8401 -
12.9639 226000 0.4872 1.2269 0.8405 -
12.9926 226500 0.5194 1.2193 0.8412 -
13.0213 227000 0.4625 1.2227 0.8413 -
13.0500 227500 0.4563 1.2189 0.8422 -
13.0786 228000 0.4786 1.2285 0.8411 -
13.1073 228500 0.4892 1.2301 0.8409 -
13.1360 229000 0.4574 1.2321 0.8409 -
13.1647 229500 0.4899 1.2407 0.8407 -
13.1934 230000 0.4645 1.2298 0.8407 -
13.2221 230500 0.4826 1.2382 0.8402 -
13.2507 231000 0.4573 1.2377 0.8404 -
13.2794 231500 0.4593 1.2323 0.8404 -
13.3081 232000 0.464 1.2295 0.8400 -
13.3368 232500 0.4869 1.2381 0.8392 -
13.3655 233000 0.454 1.2313 0.8395 -
13.3941 233500 0.4728 1.2321 0.8399 -
13.4228 234000 0.4427 1.2359 0.8399 -
13.4515 234500 0.4883 1.2376 0.8402 -
13.4802 235000 0.4656 1.2302 0.8409 -
13.5089 235500 0.4749 1.2296 0.8416 -
13.5375 236000 0.4681 1.2236 0.8417 -
13.5662 236500 0.4851 1.2327 0.8409 -
13.5949 237000 0.4782 1.2282 0.8416 -
13.6236 237500 0.4694 1.2285 0.8413 -
13.6523 238000 0.481 1.2297 0.8408 -
13.6809 238500 0.4774 1.2306 0.8400 -
13.7096 239000 0.4694 1.2333 0.8404 -
13.7383 239500 0.4562 1.2344 0.8412 -
13.7670 240000 0.4607 1.2269 0.8418 -
13.7957 240500 0.4912 1.2231 0.8413 -
13.8244 241000 0.4753 1.2216 0.8414 -
13.8530 241500 0.475 1.2217 0.8418 -
13.8817 242000 0.4763 1.2276 0.8411 -
13.9104 242500 0.4877 1.2259 0.8417 -
13.9391 243000 0.4727 1.2313 0.8412 -
13.9678 243500 0.4652 1.2286 0.8416 -
13.9964 244000 0.4738 1.2312 0.8413 -
14.0251 244500 0.4353 1.2315 0.8416 -
14.0538 245000 0.4911 1.2298 0.8413 -
14.0825 245500 0.4633 1.2296 0.8411 -
14.1112 246000 0.4806 1.2272 0.8414 -
14.1398 246500 0.4561 1.2291 0.8412 -
14.1685 247000 0.4335 1.2299 0.8410 -
14.1972 247500 0.4649 1.2248 0.8414 -
14.2259 248000 0.479 1.2258 0.8418 -
14.2546 248500 0.4417 1.2235 0.8419 -
14.2833 249000 0.474 1.2257 0.8415 -
14.3119 249500 0.4711 1.2274 0.8417 -
14.3406 250000 0.4578 1.2263 0.8416 -
14.3693 250500 0.4607 1.2278 0.8412 -
14.3980 251000 0.4519 1.2306 0.8411 -
14.4267 251500 0.4876 1.2316 0.8411 -
14.4553 252000 0.455 1.2352 0.8409 -
14.4840 252500 0.4374 1.2352 0.8413 -
14.5127 253000 0.432 1.2323 0.8416 -
14.5414 253500 0.4563 1.2323 0.8415 -
14.5701 254000 0.4508 1.2337 0.8412 -
14.5987 254500 0.441 1.2351 0.8411 -
14.6274 255000 0.4861 1.2340 0.8414 -
14.6561 255500 0.4625 1.2340 0.8413 -
14.6848 256000 0.4847 1.2330 0.8412 -
14.7135 256500 0.47 1.2323 0.8411 -
14.7422 257000 0.4438 1.2322 0.8411 -
14.7708 257500 0.4604 1.2315 0.8413 -
14.7995 258000 0.4509 1.2318 0.8413 -
14.8282 258500 0.4349 1.2309 0.8414 -
14.8569 259000 0.453 1.2309 0.8415 -
14.8856 259500 0.4465 1.2316 0.8415 -
14.9142 260000 0.4532 1.2319 0.8414 -
14.9429 260500 0.4811 1.2321 0.8415 -
14.9716 261000 0.4826 1.2323 0.8415 -
-1 -1 - - - 0.8371

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model tree for sobamchan/t5-base-mrl-768-512-256-128-64: finetuned from google-t5/t5-base.

Dataset used to train sobamchan/t5-base-mrl-768-512-256-128-64: all-nli.