CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model finetuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 1024 tokens
  • Number of Output Labels: 1 label

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub ("cross_encoder_model_id" is a placeholder for this model's Hub ID)
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['Which film whose director was born first, Willy The Private Detective or No.7 Cherry Lane?', 'No.7 Cherry Lane. No.7 Cherry Lane  is a 2019 Hong Kong-Chinese animated film directed by Yonfan, with animation by Zhang Gang. It was selected to compete for the Golden Lion at the 76th Venice International Film Festival. It was also selected for the 2019 Toronto International Film Festival as a Special Presentation. At the Venice Film Festival, the film won the Best Screenplay Award.'],
    ['Do both Jaundya Na Balasaheb and Expecting Love films have the directors from the same country?', 'Joshiy. Joshiy (born 19 July 1952) is an Indian film director from Varkala of Trivandrum in Kerala who works in the Malayalam film industry. He made his debut with "Tiger Salim" (1978) and has directed over 90 films including films with Mammootty and Mohanlal. He has also directed a few Hindi and Tamil films. In the beginning of his career, he received national fame when he directed "Dharm Aur Qanoon" (1984) starring Rajesh Khanna and Dharmendra in the lead roles, with Khanna in double roles.'],
    ['Who is the spouse of the composer of film Saagar (Film)?', 'Ghar Sansar. Ghar Sansar (English: House - Family; Hindi: घर संसार) is a 1986 Indian drama film, produced by Vimal Kumar under the Shivam Chitrya banner and directed by K. Bapaiah. It stars Jeetendra, Sridevi in the lead roles and music composed by Rajesh Roshan. The film is remake of the Telugu movie "Maga Maharaju" (1983), starring Chiranjeevi, Suhasini in the pivotal roles.'],
    ['What is the date of death of the director of film Out Of The Wreck?', "William Desmond Taylor. William Desmond Taylor (born William Cunningham Deane-Tanner, 26 April 1872 – 1 February 1922) was an Anglo-Irish-American director and actor. A popular figure in the growing Hollywood motion picture colony of the 1910s and early 1920s, he directed 59 silent films between 1914 and 1922 and acted in 27 between 1913 and 1915. Taylor's murder on 1 February 1922, along with other Hollywood scandals, such as the Roscoe Arbuckle trial, led to a frenzy of sensationalist and often fabricated newspaper reports. The murder remains an official cold case."],
    ['Are both Charles Liedts and Lea Lublin from the same country?', 'Lea Lublin. Lea Lublin( born 1929, Brest, Poland, died in 1999, Paris, France) was an Argentine- French performance artist. Her involvement with feminist movements and themes included the WACK! Art and the Feminist Revolution in Los Angeles in 2007.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Which film whose director was born first, Willy The Private Detective or No.7 Cherry Lane?',
    [
        'No.7 Cherry Lane. No.7 Cherry Lane  is a 2019 Hong Kong-Chinese animated film directed by Yonfan, with animation by Zhang Gang. It was selected to compete for the Golden Lion at the 76th Venice International Film Festival. It was also selected for the 2019 Toronto International Film Festival as a Special Presentation. At the Venice Film Festival, the film won the Best Screenplay Award.',
        'Joshiy. Joshiy (born 19 July 1952) is an Indian film director from Varkala of Trivandrum in Kerala who works in the Malayalam film industry. He made his debut with "Tiger Salim" (1978) and has directed over 90 films including films with Mammootty and Mohanlal. He has also directed a few Hindi and Tamil films. In the beginning of his career, he received national fame when he directed "Dharm Aur Qanoon" (1984) starring Rajesh Khanna and Dharmendra in the lead roles, with Khanna in double roles.',
        'Ghar Sansar. Ghar Sansar (English: House - Family; Hindi: घर संसार) is a 1986 Indian drama film, produced by Vimal Kumar under the Shivam Chitrya banner and directed by K. Bapaiah. It stars Jeetendra, Sridevi in the lead roles and music composed by Rajesh Roshan. The film is remake of the Telugu movie "Maga Maharaju" (1983), starring Chiranjeevi, Suhasini in the pivotal roles.',
        "William Desmond Taylor. William Desmond Taylor (born William Cunningham Deane-Tanner, 26 April 1872 – 1 February 1922) was an Anglo-Irish-American director and actor. A popular figure in the growing Hollywood motion picture colony of the 1910s and early 1920s, he directed 59 silent films between 1914 and 1922 and acted in 27 between 1913 and 1915. Taylor's murder on 1 February 1922, along with other Hollywood scandals, such as the Roscoe Arbuckle trial, led to a frenzy of sensationalist and often fabricated newspaper reports. The murder remains an official cold case.",
        'Lea Lublin. Lea Lublin( born 1929, Brest, Poland, died in 1999, Paris, France) was an Argentine- French performance artist. Her involvement with feminist movements and themes included the WACK! Art and the Feminist Revolution in Los Angeles in 2007.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
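The list returned by `rank` is ordered from the highest score to the lowest, and each entry points back to a position in the input list. A minimal sketch of that ordering step, in plain Python, is shown here. The helper name `rank_from_scores` and the scores are made up for illustration; they are not model output and not part of the library.

```python
from typing import Dict, List, Sequence, Union


def rank_from_scores(scores: Sequence[float]) -> List[Dict[str, Union[int, float]]]:
    """Order candidate indices by score, highest first (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Made-up scores for five candidate passages (illustrative only).
example_scores = [0.91, 0.02, 0.15, 0.40, 0.07]
ranking = rank_from_scores(example_scores)
print([entry["corpus_id"] for entry in ranking])
```

Each `corpus_id` is the index of the passage in the list that was passed in, so the original text can be recovered with a simple lookup.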

Evaluation

Metrics

Cross Encoder Binary Classification

Metric              validation  train_subset
accuracy            0.9167      0.896
accuracy_threshold  0.8516      0.8184
f1                  0.9145      0.9013
f1_threshold        0.8516      0.375
precision           0.9394      0.8736
recall              0.8908      0.9308
average_precision   0.9575      0.9317
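The thresholds in the table can be used to turn scores into binary decisions, and the reported F1 values can be cross-checked against precision and recall. The sketch below does both in plain Python. It assumes a score at or above the threshold counts as a positive; the helper names, scores, and labels are invented for illustration and are not taken from the evaluation data.

```python
from typing import List, Sequence, Tuple


def binarize(scores: Sequence[float], threshold: float) -> List[int]:
    """Assumed convention: score >= threshold means a positive prediction."""
    return [1 if s >= threshold else 0 for s in scores]


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(preds: Sequence[int], labels: Sequence[int]) -> Tuple[float, float, float]:
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from_pr(precision, recall)


# Made-up scores and labels, thresholded at the validation accuracy_threshold.
toy_scores = [0.95, 0.90, 0.60, 0.30, 0.88, 0.10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_preds = binarize(toy_scores, threshold=0.8516)
print(precision_recall_f1(toy_preds, toy_labels))

# Consistency check against the reported precision and recall.
print(round(f1_from_pr(0.9394, 0.8908), 4))  # validation
print(round(f1_from_pr(0.8736, 0.9308), 4))  # train_subset
```

The two recomputed F1 values agree with the table to within rounding, which suggests the reported precision and recall were measured at the F1 threshold.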

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,540 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 35 characters, mean 74.6 characters, max 137 characters
    • sentence_1 (string): min 57 characters, mean 639.95 characters, max 4479 characters
    • label (float): min 0.0, mean 0.5, max 1.0
  • Samples:
    • sentence_0: Which film whose director was born first, Willy The Private Detective or No.7 Cherry Lane?
      sentence_1: No.7 Cherry Lane. No.7 Cherry Lane is a 2019 Hong Kong-Chinese animated film directed by Yonfan, with animation by Zhang Gang. It was selected to compete for the Golden Lion at the 76th Venice International Film Festival. It was also selected for the 2019 Toronto International Film Festival as a Special Presentation. At the Venice Film Festival, the film won the Best Screenplay Award.
      label: 1.0
    • sentence_0: Do both Jaundya Na Balasaheb and Expecting Love films have the directors from the same country?
      sentence_1: Joshiy. Joshiy (born 19 July 1952) is an Indian film director from Varkala of Trivandrum in Kerala who works in the Malayalam film industry. He made his debut with "Tiger Salim" (1978) and has directed over 90 films including films with Mammootty and Mohanlal. He has also directed a few Hindi and Tamil films. In the beginning of his career, he received national fame when he directed "Dharm Aur Qanoon" (1984) starring Rajesh Khanna and Dharmendra in the lead roles, with Khanna in double roles.
      label: 0.0
    • sentence_0: Who is the spouse of the composer of film Saagar (Film)?
      sentence_1: Ghar Sansar. Ghar Sansar (English: House - Family; Hindi: घर संसार) is a 1986 Indian drama film, produced by Vimal Kumar under the Shivam Chitrya banner and directed by K. Bapaiah. It stars Jeetendra, Sridevi in the lead roles and music composed by Rajesh Roshan. The film is remake of the Telugu movie "Maga Maharaju" (1983), starring Chiranjeevi, Suhasini in the pivotal roles.
      label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
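With an identity activation and no `pos_weight`, the loss presumably reduces to the standard binary cross-entropy on raw logits. The following is a minimal plain-Python sketch of that formula, not the library's implementation. The function names and example values are hypothetical.

```python
import math
from typing import Sequence


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit z with target y in [0, 1].

    Uses the numerically stable form max(z, 0) - z * y + log(1 + exp(-|z|)),
    which equals -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce(logits: Sequence[float], labels: Sequence[float]) -> float:
    """Average the per-pair loss over a batch (assumed mean reduction)."""
    return sum(bce_with_logits(z, y) for z, y in zip(logits, labels)) / len(logits)


# Made-up batch of two pairs, matching the per-device batch size of 2.
print(mean_bce([2.0, -1.5], [1.0, 0.0]))
```

A confident, correct logit (large positive for label 1.0, large negative for label 0.0) drives the per-pair loss toward zero, while a logit of 0 gives log 2 regardless of the label.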
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
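The combination of `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and `warmup_steps: 0` implies a learning rate that decays linearly from its starting value to zero over the planned run. The sketch below illustrates that schedule in plain Python. The total of 7155 steps is an assumption: 3 epochs times the 2385 steps per epoch reported in the training logs. The function name `linear_lr` is hypothetical.

```python
def linear_lr(
    step: int,
    base_lr: float = 5e-5,
    warmup_steps: int = 0,
    total_steps: int = 3 * 2385,  # assumption: num_train_epochs * steps per epoch
) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 2385, 2750, 7155):
    print(step, linear_lr(step))
```

Under these assumptions the rate is still at about two thirds of its starting value after one epoch.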

Training Logs

Epoch   Step  Training Loss  validation_average_precision  train_subset_average_precision
0.1048  250   -              0.9151                        0.8899
0.2096  500   0.6265         0.9387                        0.9006
0.3145  750   -              0.9362                        0.8994
0.4193  1000  0.4662         0.9397                        0.9053
0.5241  1250  -              0.9482                        0.9118
0.6289  1500  0.4724         0.9488                        0.9143
0.7338  1750  -              0.9502                        0.9147
0.8386  2000  0.4707         0.9509                        0.9120
0.9434  2250  -              0.9522                        0.9147
1.0     2385  -              0.9552                        -
1.0482  2500  0.4326         0.9478                        0.9138
1.1530  2750  -              0.9575                        0.9317
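The Epoch column follows directly from the Step column: the row at epoch 1.0 shows that one epoch is 2385 optimizer steps, so each logged epoch value is the step divided by 2385. A small sketch that checks this relationship against the logged rows is shown here. The constant and function names are hypothetical.

```python
STEPS_PER_EPOCH = 2385  # taken from the log row "1.0 2385"


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return step / STEPS_PER_EPOCH


# (step, logged epoch) pairs copied from the training logs.
logged = [
    (250, 0.1048), (500, 0.2096), (750, 0.3145), (1000, 0.4193),
    (1250, 0.5241), (1500, 0.6289), (1750, 0.7338), (2000, 0.8386),
    (2250, 0.9434), (2385, 1.0), (2500, 1.0482), (2750, 1.1530),
]

for step, epoch in logged:
    print(step, epoch, round(epoch_at(step), 4))
```

The last logged row, at step 2750, is therefore a little over 15% of the way into the second epoch.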

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.2.2
  • Transformers: 4.44.2
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}