ceat-fc-rag

This is a sentence-transformers model fine-tuned from nlpaueb/legal-bert-base-uncased on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nlpaueb/legal-bert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
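The Pooling module above uses mean pooling (pooling_mode_mean_tokens): token embeddings are averaged over non-padding positions to produce one sentence vector. A rough NumPy sketch of that step (illustrative only, not the library's internals):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding tokens.

    token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1.
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

tokens = np.random.rand(6, 768)      # 6 tokens, each a 768-dim embedding
mask = np.array([1, 1, 1, 1, 0, 0])  # last two positions are padding
pooled = mean_pool(tokens, mask)
print(pooled.shape)  # (768,)
```

Padding tokens are excluded from the average, so the pooled vector depends only on real input tokens.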

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sirtobsi/ceat-fc-rag-2")
# Run inference
sentences = [
    'e the type, age, and efficiency of the generator. 259. As is obvious, and as Mercer’s expert Mr. Switlishoff confirms, the guidelines “too vague and general either to enable the calculation of any new GBL based on actual data, the validation of any GBL BC Hydro previously computed .... They provide no uniform methodology that two engineers, lawyers, accountants, or any other professional could apply given set of facts and arrive at the same resulting GBL.”* Indeed, as noted earlier, on December 13, 2013, the BCUC expressed its own concerns regarding the GBL Guidelines, which BC Hydro and FortisBC had proposed to adopt in their 2013 PPA. The Commission observed that the guidelines “are fairly general, subject to considerable interpretation, not necessarily transparent and have not been approved by the Commission.”*”° 8. The BC Pulp and Paper Joint Task Force and the BC Working Group on Pulp & Paper Self-Generation Sales Policy 260. The BC Government, which, as noted, has not promulgated any laws, regulations, or binding rule of law addressing a self-generator’s access to embedded cost power while selling electricity not in excess of load, however, did communicate its perspective to part of the industry. 261. In March 2007, the BC Pulp and Paper Joint Task Force was created as a joint initiative between the BC Pulp and Paper Industry and the BC government with the overarching purpose of working on policy and hosting conditions for the long term sustainability of the industry.°* In November 2007, the Task Force submitted a position paper to the BC government 30? Switlishoff Expert Statement, § 58. 303 C-27, Letter from Erica Hamilton, Commission Secretary, to Janet Fraser, Chief Regulatory Officer, BC Hydro (13 December 2013) (Exhibit A-17 to BC Hydro PPA - RS 3808, TS No. 2 & 3 Proceeding). 304 Gandossi Witness Statement, 54. - 109 - are or to a',
    "Mr. Gandossi, regarding the recommendations made by the BC Pulp and Paper Task Force, could you explain the impact these had on BC Hydro's pricing strategy?\nCertainly. The Task Force recommended that BC Hydro purchase all self-generated electricity from pulp and paper mills at the Tier 2 Transmission Service Rate. This recommendation was actually implemented by the government, leading to increased revenue for both the sector and BC Hydro.\nJust to clarify, you are claiming the government adopted the Task Force’s pricing recommendations, correct?\nYes, they saw it as a way to boost the industry while securing additional electricity for the province.\nInteresting. Can you describe how this aligns with the GBL guidelines set by BC Hydro and the concerns raised by the BCUC?\nThe guidelines provided clear methods to calculate GBLs, which were consistent with the Task Force’s recommendations. Consequently, the BCUC did not express any significant concerns about these policies.",
    "Mr. Smith, could you explain why BC Hydro was opposed to FortisBC facilitating energy export arbitrage?\nCertainly. BC Hydro was concerned because facilitating such arbitrage would lead to minor inconvenience for their operations, but overall they would not face financial losses. They believed it wouldn’t harm their ratepayers significantly.\nIsn't it true that BC Hydro actually argued that they would incur losses if FortisBC facilitated such activities?\nWell, they might have suggested potential risks, but they didn't provide strong evidence that these actions would necessarily result in substantial losses.\nCould you clarify what was meant by 'embedded costs' in relation to the energy transactions between FortisBC and Celgar Mill?\nEmbedded costs refer to the theoretical total cost of resources, which was a priority in setting rates for utility services related to newer and more efficient assets only.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9697, 0.9441],
#         [0.9697, 1.0000, 0.9581],
#         [0.9441, 0.9581, 1.0000]])
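Under the cosine similarity function configured for this model, `model.similarity` amounts to L2-normalizing the embeddings and taking pairwise dot products. A plain-NumPy sketch on toy 2-D vectors (illustrative only):

```python
import numpy as np

def cosine_similarity_matrix(emb: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: normalize rows, then dot products."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T

emb = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [0.0, 1.0]])
sims = cosine_similarity_matrix(emb)
# Diagonal entries are 1.0 (each vector vs. itself);
# cos([1,0], [1,1]) = 1/sqrt(2) ~ 0.7071.
print(np.round(sims, 4))
```

The same normalization-plus-dot-product structure explains why the scores in the tensor above are symmetric with a unit diagonal.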

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.0533
cosine_accuracy@3 0.1429
cosine_accuracy@5 0.1924
cosine_accuracy@10 0.2762
cosine_precision@1 0.0533
cosine_precision@3 0.0476
cosine_precision@5 0.0385
cosine_precision@10 0.0276
cosine_recall@1 0.0533
cosine_recall@3 0.1429
cosine_recall@5 0.1924
cosine_recall@10 0.2762
cosine_ndcg@10 0.1532
cosine_mrr@10 0.1156
cosine_map@100 0.1265

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.0571
cosine_accuracy@3 0.1371
cosine_accuracy@5 0.1924
cosine_accuracy@10 0.259
cosine_precision@1 0.0571
cosine_precision@3 0.0457
cosine_precision@5 0.0385
cosine_precision@10 0.0259
cosine_recall@1 0.0571
cosine_recall@3 0.1371
cosine_recall@5 0.1924
cosine_recall@10 0.259
cosine_ndcg@10 0.1482
cosine_mrr@10 0.114
cosine_map@100 0.1251

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.0648
cosine_accuracy@3 0.1562
cosine_accuracy@5 0.2019
cosine_accuracy@10 0.2838
cosine_precision@1 0.0648
cosine_precision@3 0.0521
cosine_precision@5 0.0404
cosine_precision@10 0.0284
cosine_recall@1 0.0648
cosine_recall@3 0.1562
cosine_recall@5 0.2019
cosine_recall@10 0.2838
cosine_ndcg@10 0.1637
cosine_mrr@10 0.1265
cosine_map@100 0.1366

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.0533
cosine_accuracy@3 0.139
cosine_accuracy@5 0.1771
cosine_accuracy@10 0.2743
cosine_precision@1 0.0533
cosine_precision@3 0.0463
cosine_precision@5 0.0354
cosine_precision@10 0.0274
cosine_recall@1 0.0533
cosine_recall@3 0.139
cosine_recall@5 0.1771
cosine_recall@10 0.2743
cosine_ndcg@10 0.1494
cosine_mrr@10 0.1114
cosine_map@100 0.1208

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.0476
cosine_accuracy@3 0.1257
cosine_accuracy@5 0.1733
cosine_accuracy@10 0.2552
cosine_precision@1 0.0476
cosine_precision@3 0.0419
cosine_precision@5 0.0347
cosine_precision@10 0.0255
cosine_recall@1 0.0476
cosine_recall@3 0.1257
cosine_recall@5 0.1733
cosine_recall@10 0.2552
cosine_ndcg@10 0.1383
cosine_mrr@10 0.1024
cosine_map@100 0.1113
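A note on reading these tables: because each anchor has exactly one relevant positive, accuracy@k and recall@k coincide, and precision@k is simply accuracy@k divided by k. A minimal sketch of how the columns relate, computed from the (hypothetical) 1-based rank of the relevant document for each query:

```python
def ir_metrics(ranks: list[int], k: int) -> dict[str, float]:
    """Compute @k retrieval metrics when each query has one relevant doc.

    ranks: 1-based rank of the single relevant document per query.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r <= k)          # queries answered in top k
    mrr = sum(1.0 / r for r in ranks if r <= k) / n  # reciprocal rank, cut at k
    return {
        f"accuracy@{k}": hits / n,
        f"precision@{k}": hits / n / k,  # one relevant doc among k retrieved
        f"recall@{k}": hits / n,         # identical to accuracy@k here
        f"mrr@{k}": mrr,
    }

ranks = [1, 4, 12, 2, 7]  # hypothetical example ranks
print(ir_metrics(ranks, k=10))
```

This is why, in every table above, the accuracy@k and recall@k rows are identical and precision@k is accuracy@k scaled by 1/k.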

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 786 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 786 samples:
    • positive: string; min 85 tokens, mean 435.81 tokens, max 512 tokens
    • anchor: string; min 117 tokens, mean 221.33 tokens, max 378 tokens
  • Samples (each shown as its positive passage followed by its anchor; excerpts truncated as in the original):

    Sample 1
    • positive: Electricity and Alternative Energy Division, Ministry of Energy, from 2003 to 2008. 4. I also served as Chair of the British Columbia Utilities Commission (“BCUC”) from 1998 to 2003. 5. In early 1990, I held the position of Manager, Regulated Projects in the Energy Policy Branch of the Ministry of Energy. In that position, I coordinated the review of Energy Project Certificate applications under the Utilities Commission Act (“UCA”).2 I subsequently was appointed Acting Assistant Deputy Minister of the Energy Resources Division of the Ministry of Energy in June 1990. In this acting position I remained involved in the review of applications for Energy Project Certificates. 6. I attach my curriculum vitae as Appendix A. 7. In this witness statement, I will briefly explain the Energy Project Review Process that existed under the UCA before 1995. I will then discuss the British Columbia Government’s review and disposition of an Application submitted by the Celgar Pulp Company (“Celgar”) in ...
    • anchor: Mr. Smith, could you clarify your role during the Celgar Energy Project Certificate application in 1990?
      Certainly. I was involved as the Acting Assistant Deputy Minister of the Energy Resources Division and coordinated the review of the application under the UCA.
      Is it true that Celgar was located within the FortisBC service territory and had exemptions from certain regulations during operations?
      Yes, Celgar was within the FortisBC territory, but they did not have any exemptions from regulations during their initial operations.
      Did the Western Interconnection play any role in the project’s operation or approval process?
      The Western Interconnection was relevant for broader grid reliability considerations, but it wasn't directly involved in Celgar's project approval or operations discussions.

    Sample 2
    • positive: COD on the 2009 EPA. Tembec and BC Hydro signed a new ESA on December 7, 2009 and the mill reached COD on the 2009 EPA in November 2009. While the mill had met other commercial and technical requirements by the time the EPA was signed in August 2009, the delayed COD was the result of a new BC court decision requiring BC Hydro and/or proponents of projects similar to Skookumchuck’s to demonstrate adequate consultation of all First Nations who may have interests in the areas of operations. BC Hydro required such evidence in order to support its filing of the EPA before the BCUC under Section 71 of the BC Utilities Commission Act. The delay in COD 57. Mr. Switlishoff describes Tembec’s 2009 EPA with BC Hydro as a To support his assertion, he points to the fact that Mr. Switlishoff ignores the reasons for this 22
    • anchor: Could you explain the significance of the Commercial Operation Date on Tembec's 2009 EPA with BC Hydro?
      Certainly. The Commercial Operation Date, or COD, is crucial as it marks the point when a facility begins formal operations under an agreement. For Tembec's 2009 EPA, the COD was delayed to November 2009 due to a new court decision requiring consultation with First Nations before filing under the BC Utilities Commission Act.
      And what impact did this delay have on the agreement process between BC Hydro and Tembec?
      The delay in the COD meant that BC Hydro and Tembec had to ensure that all necessary consultations were complete before proceeding with their agreement submission. This was essential to meet the requirements under Section 71 of the BC Utilities Commission Act.
      How does the negotiation of a GBL by BC Hydro fit into their statutory role, if at all?
      Negotiating a GBL, or Guaranteed Baseload, does not equate to a delegation of governmental authority. It is merely part of the com...

    Sample 3
    • positive: .7 , Howe Sound was required to .8 26. It became clear that it would soon be . Since the price of natural gas was forecast to escalate into the future, . The economic evaluations we carried out took into account . 27. These discussions continued until . C. BCUC Order G-38-01 and Howe Sound’s 2001 Purchase Transaction Enabling Agreement 28. In the fall of 2000, the power demand in California and the southern United States was causing prices for electricity in that region to rise dramatically. It became apparent to Howe Sound that if it could sell incremental power at those market rates, it would be economically feasible to burn natural gas to supply steam to TG2’s condenser in order to generate the incremental power. Howe Sound approached BC Hydro and 7 8 8
    • anchor: Mr. Switlishoff, regarding Howe Sound's decision to sell incremental power, could you explain how rising electricity prices in the fall of 2000 influenced this choice?
      Certainly. In the fall of 2000, electricity prices in California and the southern U.S. were skyrocketing. This made it economically viable for Howe Sound to sell additional power at higher market rates by burning natural gas to supply steam to TG2’s condenser for power generation.
      And how did BC Hydro's role factor into Howe Sound's strategy to manage its electric load and sales?
      BC Hydro set Howe Sound’s GBL, which allowed them to generate electricity to meet their own load while selling the excess. BC Hydro also allowed Howe Sound to buy power to cover their needs and sell surplus, effectively supporting their market strategies.
      Why did Howe Sound believe BC Hydro would continue to re-contract based on past agreements and plans?
      Historically, BC Hydro aimed for longer-term agreements, as seen with Celgar's negotiations...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
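For intuition: MatryoshkaLoss computes the wrapped MultipleNegativesRankingLoss on truncated prefixes of each embedding (768, 512, 256, 128, and 64 dimensions here, all weighted equally) and sums the results, so the leading dimensions stay useful on their own. A self-contained NumPy sketch of that idea (illustrative, not the library implementation):

```python
import numpy as np

def mnrl(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss with in-batch negatives.

    Row i of `positives` is the positive for row i of `anchors`; every
    other row in the batch acts as a negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                   # scaled cosine scores
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # positives sit on the diagonal

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)) -> float:
    """Sum the inner loss over truncated embedding prefixes."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))
```

Because each truncation contributes to the total loss, downstream users can slice the 768-dim output to a smaller prefix (e.g. 256 dims) with a controlled quality trade-off, as the evaluation tables above illustrate.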
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 128
  • gradient_accumulation_steps: 128
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • tf32: False
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 128
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch   Step  dim_768_cosine_ndcg@10  dim_512_cosine_ndcg@10  dim_256_cosine_ndcg@10  dim_128_cosine_ndcg@10  dim_64_cosine_ndcg@10
0.6497  1     0.1007                  0.0979                  0.0990                  0.0880                  0.0693
1.9492  3     0.1491                  0.1457                  0.1529                  0.1456                  0.1297
2.5990  4     0.1532                  0.1482                  0.1637                  0.1494                  0.1383

  • The final row (epoch 2.5990, step 4) denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.9
  • Sentence Transformers: 5.1.0
  • Transformers: 4.41.2
  • PyTorch: 2.1.2
  • Accelerate: 1.7.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model size: 0.1B parameters (Safetensors, F32)