---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- allenai/scibert_scivocab_uncased
pipeline_tag: text-classification
---
# Model Card for scibert-certainty-classif
This is a text classification model.
It was fine-tuned to predict certainty ratings of scientific findings using a combination of a classification loss and a ranking loss.
We fine-tuned allenai/scibert_scivocab_uncased on the dataset made available by [Wurl et al. (2024): Understanding Fine-Grained Distortions in Reports of Scientific Findings](https://aclanthology.org/2024.findings-acl.369/).
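The exact combination of the two losses is not documented here (the Training Procedure section below is still TBD), but the general pattern of joining a cross-entropy classification term with a pairwise margin ranking term can be sketched as follows; the scoring function, pairing scheme, and 0.5 weight are all illustrative assumptions, not the procedure actually used:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: combine cross-entropy over certainty classes with a
# pairwise margin ranking loss over scalar certainty scores. All specifics
# (score definition, pairing, loss weight) are assumptions for illustration.
ce_loss = nn.CrossEntropyLoss()
rank_loss = nn.MarginRankingLoss(margin=0.5)

logits = torch.randn(4, 3, requires_grad=True)  # batch of 4, 3 certainty classes
labels = torch.tensor([0, 2, 1, 0])             # gold certainty classes

# Classification term: standard cross-entropy.
loss_cls = ce_loss(logits, labels)

# Ranking term: a scalar certainty score (here, the expected class index
# under the softmax) should be higher for the example with the higher gold label.
scores = (torch.softmax(logits, dim=-1) * torch.arange(3.0)).sum(dim=-1)
target = torch.sign(labels[:2] - labels[2:]).float()  # compare pairs (0,2), (1,3)
loss_rank = rank_loss(scores[:2], scores[2:], target)

loss = loss_cls + 0.5 * loss_rank  # 0.5 is an assumed weighting
loss.backward()
```

Both terms are differentiable with respect to the logits, so the combined objective can be optimized with a standard fine-tuning loop.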
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** Researchers at UCI with the goal of obtaining a reliable certainty scoring function.
- **Model type:** BERT
- **Language(s) (NLP):** English
- **Finetuned from model:** allenai/scibert_scivocab_uncased
## Uses
The model is meant to be used for estimating certainty scores. Because it was trained on sentence-level academic findings, we expect its reliability to be limited to this domain.
The original dataset had only moderate inter-annotator agreement (a Spearman correlation coefficient of 0.44), which suggests that predicting certainty scores is difficult even for humans.
We recommend that users validate that the model behaves as intended on a small portion of their data of interest before scaling up evaluations.
We also note that the per-class F1 scores ranged from 0.48 to 0.70, which again reflects the difficulty of learning clear class boundaries.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Cbelem/scibert-certainty-classif")
model = AutoModelForSequenceClassification.from_pretrained("Cbelem/scibert-certainty-classif")
model.eval()

texts = [
    "Compared with controls, taxi drivers had greater grey matter volume in the posterior hippocampi (Maguire et al.",
    "The study described in this paper focuses on gaze, but similar approaches can be used to understand the effects of other interactions that contribute to patient outcomes such as emotion.",
    '"The initial findings could have been explained by a correlation, that people with big hippocampi become taxi drivers," he says.',
    "We are less sure about a possible explanation for lower acceptance for mobile phone behaviors among professionals in the West.",
]

# Pad and truncate so sentences of different lengths can be batched together.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
```
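The model returns raw logits. A minimal post-processing sketch for turning them into class probabilities and predicted certainty classes (the example logits below are placeholders; substitute `model(**inputs).logits` from the snippet above):

```python
import torch

# Example logits for a batch of two sentences over three certainty classes;
# replace with model(**inputs).logits in practice.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [-0.3, 1.2, 0.1]])

# Softmax turns logits into per-class probabilities; argmax picks the class.
probs = torch.softmax(logits, dim=-1)
predicted_class = probs.argmax(dim=-1)

print(predicted_class.tolist())  # → [0, 1]
```

The mapping from class indices (0, 1, 2) to certainty levels follows the label scheme of the training data.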
## Training Details
### Training Data
TBD
### Training Procedure
TBD
#### Preprocessing [optional]
TBD
#### Training Hyperparameters
- **Training regime:** fp32
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
[More Information Needed]
#### Factors
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
[More Information Needed]
#### Metrics
TBD
### Results
```json
{
  "train/learning_rate": 6.869747470432602e-7,
  "train/loss": 0.562,
  "train/global_step": 3000,
  "eval/qwk": 0.5507,
  "eval/loss": 0.9391,
  "eval/accuracy": 0.6078,
  "eval/balanced_accuracy": 0.3980,
  "eval/f1_macro": 0.6006,
  "eval/f1_class_0": 0.6211,
  "eval/f1_class_1": 0.4932,
  "eval/f1_class_2": 0.6875,
  "eval/precision_macro": 0.6033,
  "eval/precision_class_0": 0.6410,
  "eval/precision_class_1": 0.5,
  "eval/precision_class_2": 0.6689,
  "eval/recall_macro": 0.5987,
  "eval/recall_class_0": 0.6024,
  "eval/recall_class_1": 0.4865,
  "eval/recall_class_2": 0.7071,
  "train_steps_per_second": 6.532
}
```
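The reported metrics can be computed with standard scikit-learn functions. The sketch below uses toy labels (not the actual evaluation data) to show how each metric is obtained; `eval/qwk` is quadratically weighted Cohen's kappa, which penalizes predictions more the further they fall from the gold ordinal class:

```python
from sklearn.metrics import balanced_accuracy_score, cohen_kappa_score, f1_score

# Toy gold and predicted certainty classes (0, 1, 2), for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 1, 1, 1, 2, 2, 0, 0]

# Quadratically weighted kappa: distance-sensitive agreement for ordinal labels.
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
# Balanced accuracy: mean per-class recall, robust to class imbalance.
bal_acc = balanced_accuracy_score(y_true, y_pred)
# Macro F1: unweighted mean of per-class F1 scores.
f1_macro = f1_score(y_true, y_pred, average="macro")

print(qwk, bal_acc, f1_macro)
```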
#### Summary
## Technical Specifications [optional]
### Model Architecture and Objective
TBD
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
Transformers, PyTorch, and Weights & Biases (wandb) for running the hyperparameter sweep.
## Citation
TBD
## Model Card Authors
Catarina Belem (Cbelem)
## Model Card Contact
For more information, contact cbelem@uci.edu.