ceat-fc-rag

This is a sentence-transformers model fine-tuned from nlpaueb/legal-bert-base-uncased on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nlpaueb/legal-bert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
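The Pooling module above uses mean pooling (pooling_mode_mean_tokens): token embeddings are averaged over non-padding positions to produce one sentence vector. A rough NumPy sketch of that step (illustrative only, not the library's internals):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding tokens.

    token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1.
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

tokens = np.random.rand(6, 768)      # 6 tokens, each a 768-dim embedding
mask = np.array([1, 1, 1, 1, 0, 0])  # last two positions are padding
pooled = mean_pool(tokens, mask)
print(pooled.shape)  # (768,)
```

Padding tokens are excluded from the average, so the pooled vector depends only on real input tokens.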

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sirtobsi/ceat-fc-rag-2")
# Run inference
sentences = [
    'e the type, age, and efficiency of the generator. 259. As is obvious, and as Mercer’s expert Mr. Switlishoff confirms, the guidelines “too vague and general either to enable the calculation of any new GBL based on actual data, the validation of any GBL BC Hydro previously computed .... They provide no uniform methodology that two engineers, lawyers, accountants, or any other professional could apply given set of facts and arrive at the same resulting GBL.”* Indeed, as noted earlier, on December 13, 2013, the BCUC expressed its own concerns regarding the GBL Guidelines, which BC Hydro and FortisBC had proposed to adopt in their 2013 PPA. The Commission observed that the guidelines “are fairly general, subject to considerable interpretation, not necessarily transparent and have not been approved by the Commission.”*”° 8. The BC Pulp and Paper Joint Task Force and the BC Working Group on Pulp & Paper Self-Generation Sales Policy 260. The BC Government, which, as noted, has not promulgated any laws, regulations, or binding rule of law addressing a self-generator’s access to embedded cost power while selling electricity not in excess of load, however, did communicate its perspective to part of the industry. 261. In March 2007, the BC Pulp and Paper Joint Task Force was created as a joint initiative between the BC Pulp and Paper Industry and the BC government with the overarching purpose of working on policy and hosting conditions for the long term sustainability of the industry.°* In November 2007, the Task Force submitted a position paper to the BC government 30? Switlishoff Expert Statement, § 58. 303 C-27, Letter from Erica Hamilton, Commission Secretary, to Janet Fraser, Chief Regulatory Officer, BC Hydro (13 December 2013) (Exhibit A-17 to BC Hydro PPA - RS 3808, TS No. 2 & 3 Proceeding). 304 Gandossi Witness Statement, 54. - 109 - are or to a',
    "Mr. Gandossi, regarding the recommendations made by the BC Pulp and Paper Task Force, could you explain the impact these had on BC Hydro's pricing strategy?\nCertainly. The Task Force recommended that BC Hydro purchase all self-generated electricity from pulp and paper mills at the Tier 2 Transmission Service Rate. This recommendation was actually implemented by the government, leading to increased revenue for both the sector and BC Hydro.\nJust to clarify, you are claiming the government adopted the Task Force’s pricing recommendations, correct?\nYes, they saw it as a way to boost the industry while securing additional electricity for the province.\nInteresting. Can you describe how this aligns with the GBL guidelines set by BC Hydro and the concerns raised by the BCUC?\nThe guidelines provided clear methods to calculate GBLs, which were consistent with the Task Force’s recommendations. Consequently, the BCUC did not express any significant concerns about these policies.",
    "Mr. Smith, could you explain why BC Hydro was opposed to FortisBC facilitating energy export arbitrage?\nCertainly. BC Hydro was concerned because facilitating such arbitrage would lead to minor inconvenience for their operations, but overall they would not face financial losses. They believed it wouldn’t harm their ratepayers significantly.\nIsn't it true that BC Hydro actually argued that they would incur losses if FortisBC facilitated such activities?\nWell, they might have suggested potential risks, but they didn't provide strong evidence that these actions would necessarily result in substantial losses.\nCould you clarify what was meant by 'embedded costs' in relation to the energy transactions between FortisBC and Celgar Mill?\nEmbedded costs refer to the theoretical total cost of resources, which was a priority in setting rates for utility services related to newer and more efficient assets only.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9697, 0.9441],
#         [0.9697, 1.0000, 0.9581],
#         [0.9441, 0.9581, 1.0000]])
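Under the cosine similarity function configured for this model, `model.similarity` amounts to L2-normalizing the embeddings and taking pairwise dot products. A plain-NumPy sketch on toy 2-D vectors (illustrative only):

```python
import numpy as np

def cosine_similarity_matrix(emb: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: normalize rows, then dot products."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T

emb = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [0.0, 1.0]])
sims = cosine_similarity_matrix(emb)
# Diagonal entries are 1.0 (each vector vs. itself);
# cos([1,0], [1,1]) = 1/sqrt(2) ~ 0.7071.
print(np.round(sims, 4))
```

The same normalization-plus-dot-product structure explains why the scores in the tensor above are symmetric with a unit diagonal.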

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.0533
cosine_accuracy@3 0.1429
cosine_accuracy@5 0.1924
cosine_accuracy@10 0.2762
cosine_precision@1 0.0533
cosine_precision@3 0.0476
cosine_precision@5 0.0385
cosine_precision@10 0.0276
cosine_recall@1 0.0533
cosine_recall@3 0.1429
cosine_recall@5 0.1924
cosine_recall@10 0.2762
cosine_ndcg@10 0.1532
cosine_mrr@10 0.1156
cosine_map@100 0.1265

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.0571
cosine_accuracy@3 0.1371
cosine_accuracy@5 0.1924
cosine_accuracy@10 0.259
cosine_precision@1 0.0571
cosine_precision@3 0.0457
cosine_precision@5 0.0385
cosine_precision@10 0.0259
cosine_recall@1 0.0571
cosine_recall@3 0.1371
cosine_recall@5 0.1924
cosine_recall@10 0.259
cosine_ndcg@10 0.1482
cosine_mrr@10 0.114
cosine_map@100 0.1251

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.0648
cosine_accuracy@3 0.1562
cosine_accuracy@5 0.2019
cosine_accuracy@10 0.2838
cosine_precision@1 0.0648
cosine_precision@3 0.0521
cosine_precision@5 0.0404
cosine_precision@10 0.0284
cosine_recall@1 0.0648
cosine_recall@3 0.1562
cosine_recall@5 0.2019
cosine_recall@10 0.2838
cosine_ndcg@10 0.1637
cosine_mrr@10 0.1265
cosine_map@100 0.1366

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.0533
cosine_accuracy@3 0.139
cosine_accuracy@5 0.1771
cosine_accuracy@10 0.2743
cosine_precision@1 0.0533
cosine_precision@3 0.0463
cosine_precision@5 0.0354
cosine_precision@10 0.0274
cosine_recall@1 0.0533
cosine_recall@3 0.139
cosine_recall@5 0.1771
cosine_recall@10 0.2743
cosine_ndcg@10 0.1494
cosine_mrr@10 0.1114
cosine_map@100 0.1208

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.0476
cosine_accuracy@3 0.1257
cosine_accuracy@5 0.1733
cosine_accuracy@10 0.2552
cosine_precision@1 0.0476
cosine_precision@3 0.0419
cosine_precision@5 0.0347
cosine_precision@10 0.0255
cosine_recall@1 0.0476
cosine_recall@3 0.1257
cosine_recall@5 0.1733
cosine_recall@10 0.2552
cosine_ndcg@10 0.1383
cosine_mrr@10 0.1024
cosine_map@100 0.1113
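A note on reading these tables: because each anchor has exactly one relevant positive, accuracy@k and recall@k coincide, and precision@k is simply accuracy@k divided by k. A minimal sketch of how the columns relate, computed from the (hypothetical) 1-based rank of the relevant document for each query:

```python
def ir_metrics(ranks: list[int], k: int) -> dict[str, float]:
    """Compute @k retrieval metrics when each query has one relevant doc.

    ranks: 1-based rank of the single relevant document per query.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r <= k)          # queries answered in top k
    mrr = sum(1.0 / r for r in ranks if r <= k) / n  # reciprocal rank, cut at k
    return {
        f"accuracy@{k}": hits / n,
        f"precision@{k}": hits / n / k,  # one relevant doc among k retrieved
        f"recall@{k}": hits / n,         # identical to accuracy@k here
        f"mrr@{k}": mrr,
    }

ranks = [1, 4, 12, 2, 7]  # hypothetical example ranks
print(ir_metrics(ranks, k=10))
```

This is why, in every table above, the accuracy@k and recall@k rows are identical and precision@k is accuracy@k scaled by 1/k.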

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 786 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 786 samples:
    • positive: string; min 85 tokens, mean 435.81 tokens, max 512 tokens
    • anchor: string; min 117 tokens, mean 221.33 tokens, max 378 tokens
  • Samples (each shown as its positive passage followed by its anchor; excerpts truncated as in the original):

    Sample 1
    • positive: Electricity and Alternative Energy Division, Ministry of Energy, from 2003 to 2008. 4. I also served as Chair of the British Columbia Utilities Commission (“BCUC”) from 1998 to 2003. 5. In early 1990, I held the position of Manager, Regulated Projects in the Energy Policy Branch of the Ministry of Energy. In that position, I coordinated the review of Energy Project Certificate applications under the Utilities Commission Act (“UCA”).2 I subsequently was appointed Acting Assistant Deputy Minister of the Energy Resources Division of the Ministry of Energy in June 1990. In this acting position I remained involved in the review of applications for Energy Project Certificates. 6. I attach my curriculum vitae as Appendix A. 7. In this witness statement, I will briefly explain the Energy Project Review Process that existed under the UCA before 1995. I will then discuss the British Columbia Government’s review and disposition of an Application submitted by the Celgar Pulp Company (“Celgar”) in ...
    • anchor: Mr. Smith, could you clarify your role during the Celgar Energy Project Certificate application in 1990?
      Certainly. I was involved as the Acting Assistant Deputy Minister of the Energy Resources Division and coordinated the review of the application under the UCA.
      Is it true that Celgar was located within the FortisBC service territory and had exemptions from certain regulations during operations?
      Yes, Celgar was within the FortisBC territory, but they did not have any exemptions from regulations during their initial operations.
      Did the Western Interconnection play any role in the project’s operation or approval process?
      The Western Interconnection was relevant for broader grid reliability considerations, but it wasn't directly involved in Celgar's project approval or operations discussions.

    Sample 2
    • positive: COD on the 2009 EPA. Tembec and BC Hydro signed a new ESA on December 7, 2009 and the mill reached COD on the 2009 EPA in November 2009. While the mill had met other commercial and technical requirements by the time the EPA was signed in August 2009, the delayed COD was the result of a new BC court decision requiring BC Hydro and/or proponents of projects similar to Skookumchuck’s to demonstrate adequate consultation of all First Nations who may have interests in the areas of operations. BC Hydro required such evidence in order to support its filing of the EPA before the BCUC under Section 71 of the BC Utilities Commission Act. The delay in COD 57. Mr. Switlishoff describes Tembec’s 2009 EPA with BC Hydro as a To support his assertion, he points to the fact that Mr. Switlishoff ignores the reasons for this 22
    • anchor: Could you explain the significance of the Commercial Operation Date on Tembec's 2009 EPA with BC Hydro?
      Certainly. The Commercial Operation Date, or COD, is crucial as it marks the point when a facility begins formal operations under an agreement. For Tembec's 2009 EPA, the COD was delayed to November 2009 due to a new court decision requiring consultation with First Nations before filing under the BC Utilities Commission Act.
      And what impact did this delay have on the agreement process between BC Hydro and Tembec?
      The delay in the COD meant that BC Hydro and Tembec had to ensure that all necessary consultations were complete before proceeding with their agreement submission. This was essential to meet the requirements under Section 71 of the BC Utilities Commission Act.
      How does the negotiation of a GBL by BC Hydro fit into their statutory role, if at all?
      Negotiating a GBL, or Guaranteed Baseload, does not equate to a delegation of governmental authority. It is merely part of the com...

    Sample 3
    • positive: .7 , Howe Sound was required to .8 26. It became clear that it would soon be . Since the price of natural gas was forecast to escalate into the future, . The economic evaluations we carried out took into account . 27. These discussions continued until . C. BCUC Order G-38-01 and Howe Sound’s 2001 Purchase Transaction Enabling Agreement 28. In the fall of 2000, the power demand in California and the southern United States was causing prices for electricity in that region to rise dramatically. It became apparent to Howe Sound that if it could sell incremental power at those market rates, it would be economically feasible to burn natural gas to supply steam to TG2’s condenser in order to generate the incremental power. Howe Sound approached BC Hydro and 7 8 8
    • anchor: Mr. Switlishoff, regarding Howe Sound's decision to sell incremental power, could you explain how rising electricity prices in the fall of 2000 influenced this choice?
      Certainly. In the fall of 2000, electricity prices in California and the southern U.S. were skyrocketing. This made it economically viable for Howe Sound to sell additional power at higher market rates by burning natural gas to supply steam to TG2’s condenser for power generation.
      And how did BC Hydro's role factor into Howe Sound's strategy to manage its electric load and sales?
      BC Hydro set Howe Sound’s GBL, which allowed them to generate electricity to meet their own load while selling the excess. BC Hydro also allowed Howe Sound to buy power to cover their needs and sell surplus, effectively supporting their market strategies.
      Why did Howe Sound believe BC Hydro would continue to re-contract based on past agreements and plans?
      Historically, BC Hydro aimed for longer-term agreements, as seen with Celgar's negotiations...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
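For intuition: MatryoshkaLoss computes the wrapped MultipleNegativesRankingLoss on truncated prefixes of each embedding (768, 512, 256, 128, and 64 dimensions here, all weighted equally) and sums the results, so the leading dimensions stay useful on their own. A self-contained NumPy sketch of that idea (illustrative, not the library implementation):

```python
import numpy as np

def mnrl(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss with in-batch negatives.

    Row i of `positives` is the positive for row i of `anchors`; every
    other row in the batch acts as a negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                   # scaled cosine scores
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # positives sit on the diagonal

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)) -> float:
    """Sum the inner loss over truncated embedding prefixes."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))
```

Because each truncation contributes to the total loss, downstream users can slice the 768-dim output to a smaller prefix (e.g. 256 dims) with a controlled quality trade-off, as the evaluation tables above illustrate.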
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 128
  • gradient_accumulation_steps: 128
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • tf32: False
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 128
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch   Step  dim_768_cosine_ndcg@10  dim_512_cosine_ndcg@10  dim_256_cosine_ndcg@10  dim_128_cosine_ndcg@10  dim_64_cosine_ndcg@10
0.6497  1     0.1007                  0.0979                  0.0990                  0.0880                  0.0693
1.9492  3     0.1491                  0.1457                  0.1529                  0.1456                  0.1297
2.5990  4     0.1532                  0.1482                  0.1637                  0.1494                  0.1383

  • The final row (epoch 2.5990, step 4) denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.9
  • Sentence Transformers: 5.1.0
  • Transformers: 4.41.2
  • PyTorch: 2.1.2
  • Accelerate: 1.7.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model size: 0.1B parameters (Safetensors, F32)