variant-tapt_ulmfit_whole_word-LR_2e-05
This model is a fine-tuned version of microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext on the Mardiyyah/TAPT-PDBE-V1 dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2395
- Accuracy: 0.7328
- Perplexity: 3.453
Initial results before fine-tuning:
- Loss: 1.6784
- Perplexity: 5.3568
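The perplexity figures above appear to be the exponential of the corresponding loss, which is the usual convention when evaluating language-modeling objectives. A minimal sketch, assuming the losses are mean cross-entropy values in nats (the helper name `perplexity` is illustrative and not taken from the training code):

```python
import math


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)


# Loss values copied from this card.
fine_tuned = perplexity(1.2395)
initial = perplexity(1.6784)

print(f"fine-tuned: {fine_tuned:.3f}")
print(f"initial:    {initial:.3f}")
```

Small differences in the last digit against the reported perplexities are expected, because the losses shown on the card are rounded to four decimals.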
Model description
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Task-adaptive pretraining (TAPT) using discriminative fine-tuning techniques.
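Discriminative fine-tuning, as described for ULMFiT, assigns smaller learning rates to lower layers than to upper layers. The card does not state the decay factor or layer grouping used for this model, so the sketch below is only an illustration of the idea: the factor 2.6 is the value suggested in the ULMFiT paper, the 12-layer count matches a BERT-base encoder, and the function name `layerwise_learning_rates` is hypothetical.

```python
def layerwise_learning_rates(top_lr: float, num_layers: int, decay: float) -> list[float]:
    """Return one learning rate per layer, lowest layer first.

    The top layer gets `top_lr`; each layer below it gets the rate of the
    layer above divided by `decay`.
    """
    return [top_lr / (decay ** (num_layers - 1 - i)) for i in range(num_layers)]


# Assumptions: 12 encoder layers (BERT-base) and decay=2.6 (ULMFiT suggestion).
# Neither value is confirmed by this card; top_lr is the card's learning_rate.
rates = layerwise_learning_rates(top_lr=2e-05, num_layers=12, decay=2.6)
for layer, lr in enumerate(rates):
    print(f"layer {layer:2d}: lr = {lr:.3e}")
```

In a real training script, rates like these would typically be passed to the optimizer as per-layer parameter groups.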
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 3407
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-06; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 100
- mixed_precision_training: Native AMP
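The effective batch size and the linear schedule with warmup implied by these settings can be sketched in plain Python. This is an illustration rather than the actual training script: a single device is assumed, the total of 1800 optimizer steps is inferred from the results table (18 steps per epoch times 100 epochs), and `linear_schedule_lr` is a hypothetical helper that follows the usual shape of linear warmup followed by linear decay.

```python
import math

LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.06
TOTAL_STEPS = 18 * 100  # assumption: 18 optimizer steps per epoch, 100 epochs

# Effective batch size per optimizer step (assumes one device).
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Number of steps spent ramping the learning rate up from zero.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, warmup_steps, 900, TOTAL_STEPS):
    print(step, f"{linear_schedule_lr(step):.2e}")
```

The computed effective batch size matches the `total_train_batch_size` of 32 listed above.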
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 1.2718 | 1.0 | 18 | 1.2351 | 0.7357 |
| 1.2522 | 2.0 | 36 | 1.2496 | 0.7369 |
| 1.2452 | 3.0 | 54 | 1.2969 | 0.7259 |
| 1.2418 | 4.0 | 72 | 1.2671 | 0.7302 |
| 1.251 | 5.0 | 90 | 1.2658 | 0.7328 |
| 1.2493 | 6.0 | 108 | 1.2657 | 0.7333 |
| 1.238 | 7.0 | 126 | 1.2490 | 0.7355 |
| 1.218 | 8.0 | 144 | 1.2109 | 0.7406 |
| 1.2402 | 9.0 | 162 | 1.2051 | 0.7394 |
| 1.2119 | 10.0 | 180 | 1.2675 | 0.7330 |
| 1.2152 | 11.0 | 198 | 1.2132 | 0.7381 |
| 1.221 | 12.0 | 216 | 1.2514 | 0.7309 |
| 1.2267 | 13.0 | 234 | 1.2189 | 0.7352 |
| 1.2041 | 14.0 | 252 | 1.2578 | 0.7334 |
| 1.1939 | 15.0 | 270 | 1.2238 | 0.7412 |
| 1.2182 | 16.0 | 288 | 1.2515 | 0.7309 |
| 1.2186 | 17.0 | 306 | 1.2062 | 0.7377 |
| 1.1998 | 18.0 | 324 | 1.2478 | 0.7386 |
| 1.2069 | 19.0 | 342 | 1.1967 | 0.7385 |
| 1.2019 | 20.0 | 360 | 1.2500 | 0.7393 |
| 1.1849 | 21.0 | 378 | 1.2224 | 0.7435 |
| 1.184 | 22.0 | 396 | 1.2146 | 0.7430 |
| 1.1721 | 23.0 | 414 | 1.2447 | 0.7336 |
| 1.1637 | 24.0 | 432 | 1.2405 | 0.7346 |
| 1.1664 | 25.0 | 450 | 1.2284 | 0.7343 |
| 1.1636 | 26.0 | 468 | 1.1928 | 0.7435 |
| 1.1551 | 27.0 | 486 | 1.2481 | 0.7339 |
| 1.1609 | 28.0 | 504 | 1.2274 | 0.7411 |
| 1.1553 | 29.0 | 522 | 1.2487 | 0.7332 |
| 1.1743 | 30.0 | 540 | 1.2550 | 0.7295 |
| 1.1497 | 31.0 | 558 | 1.2372 | 0.7381 |
| 1.1388 | 32.0 | 576 | 1.2119 | 0.7376 |
| 1.1484 | 33.0 | 594 | 1.2033 | 0.7412 |
| 1.1618 | 34.0 | 612 | 1.2222 | 0.7400 |
| 1.1614 | 35.0 | 630 | 1.2606 | 0.7337 |
| 1.1384 | 36.0 | 648 | 1.2302 | 0.7323 |
| 1.137 | 37.0 | 666 | 1.2220 | 0.7406 |
| 1.174 | 38.0 | 684 | 1.2091 | 0.7375 |
| 1.1224 | 39.0 | 702 | 1.2127 | 0.7427 |
| 1.166 | 40.0 | 720 | 1.2271 | 0.7362 |
| 1.1305 | 41.0 | 738 | 1.2606 | 0.7326 |
| 1.1507 | 42.0 | 756 | 1.2392 | 0.7406 |
| 1.1319 | 43.0 | 774 | 1.2669 | 0.7321 |
| 1.1298 | 44.0 | 792 | 1.2263 | 0.7381 |
| 1.108 | 45.0 | 810 | 1.2481 | 0.7304 |
| 1.1526 | 46.0 | 828 | 1.2125 | 0.7403 |
| 1.1376 | 47.0 | 846 | 1.2486 | 0.7322 |
| 1.1269 | 48.0 | 864 | 1.2093 | 0.7400 |
| 1.1358 | 49.0 | 882 | 1.2360 | 0.7340 |
| 1.1267 | 50.0 | 900 | 1.2204 | 0.7364 |
| 1.1209 | 51.0 | 918 | 1.1856 | 0.7412 |
| 1.1189 | 52.0 | 936 | 1.2348 | 0.7392 |
| 1.1003 | 53.0 | 954 | 1.2336 | 0.7392 |
| 1.1135 | 54.0 | 972 | 1.2546 | 0.7305 |
| 1.1371 | 55.0 | 990 | 1.2513 | 0.7370 |
| 1.1296 | 56.0 | 1008 | 1.2336 | 0.7362 |
| 1.1069 | 57.0 | 1026 | 1.2388 | 0.7335 |
| 1.1209 | 58.0 | 1044 | 1.2260 | 0.7400 |
| 1.1063 | 59.0 | 1062 | 1.2323 | 0.7326 |
| 1.0932 | 60.0 | 1080 | 1.2461 | 0.7359 |
| 1.1181 | 61.0 | 1098 | 1.2514 | 0.7362 |
| 1.1064 | 62.0 | 1116 | 1.2686 | 0.7327 |
| 1.114 | 63.0 | 1134 | 1.2336 | 0.7368 |
| 1.0949 | 64.0 | 1152 | 1.2549 | 0.7335 |
| 1.1126 | 65.0 | 1170 | 1.2574 | 0.7310 |
| 1.1272 | 66.0 | 1188 | 1.2064 | 0.7383 |
| 1.1063 | 67.0 | 1206 | 1.2451 | 0.7353 |
| 1.117 | 68.0 | 1224 | 1.2730 | 0.7311 |
| 1.1044 | 69.0 | 1242 | 1.2430 | 0.7367 |
| 1.0865 | 70.0 | 1260 | 1.1804 | 0.7399 |
| 1.1063 | 71.0 | 1278 | 1.2278 | 0.7370 |
| 1.1077 | 72.0 | 1296 | 1.1938 | 0.7443 |
| 1.0968 | 73.0 | 1314 | 1.2550 | 0.7342 |
| 1.1064 | 74.0 | 1332 | 1.2218 | 0.7397 |
| 1.1149 | 75.0 | 1350 | 1.2417 | 0.7349 |
| 1.1029 | 76.0 | 1368 | 1.2269 | 0.7407 |
| 1.0861 | 77.0 | 1386 | 1.2293 | 0.7389 |
| 1.1095 | 78.0 | 1404 | 1.2304 | 0.7389 |
| 1.0984 | 79.0 | 1422 | 1.2085 | 0.7405 |
| 1.0961 | 80.0 | 1440 | 1.2367 | 0.7350 |
| 1.0942 | 81.0 | 1458 | 1.2208 | 0.7366 |
| 1.1 | 82.0 | 1476 | 1.2347 | 0.7342 |
| 1.0982 | 83.0 | 1494 | 1.2202 | 0.7345 |
| 1.0888 | 84.0 | 1512 | 1.2286 | 0.7421 |
| 1.0862 | 85.0 | 1530 | 1.2026 | 0.7425 |
| 1.0719 | 86.0 | 1548 | 1.2488 | 0.7322 |
| 1.0829 | 87.0 | 1566 | 1.2318 | 0.7378 |
| 1.1046 | 88.0 | 1584 | 1.2533 | 0.7365 |
| 1.0913 | 89.0 | 1602 | 1.2501 | 0.7368 |
| 1.0901 | 90.0 | 1620 | 1.2471 | 0.7324 |
| 1.09 | 91.0 | 1638 | 1.2298 | 0.7403 |
| 1.1019 | 92.0 | 1656 | 1.2418 | 0.7307 |
| 1.0809 | 93.0 | 1674 | 1.2007 | 0.7426 |
| 1.0874 | 94.0 | 1692 | 1.2618 | 0.7287 |
| 1.0926 | 94.4571 | 1700 | 1.2288 | 0.7394 |
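Because validation loss and accuracy fluctuate from epoch to epoch, it can help to pick out the best-scoring rows programmatically. A minimal sketch over a handful of rows copied from the table above (only an excerpt is embedded, and the `Row` type and variable names are illustrative):

```python
from typing import NamedTuple


class Row(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Excerpt of the results table: (epoch, step, validation loss, accuracy).
rows = [
    Row(1.0, 18, 1.2351, 0.7357),
    Row(26.0, 468, 1.1928, 0.7435),
    Row(51.0, 918, 1.1856, 0.7412),
    Row(70.0, 1260, 1.1804, 0.7399),
    Row(72.0, 1296, 1.1938, 0.7443),
    Row(94.4571, 1700, 1.2288, 0.7394),
]

lowest_loss = min(rows, key=lambda r: r.val_loss)
highest_acc = max(rows, key=lambda r: r.accuracy)

print("lowest validation loss:", lowest_loss)
print("highest accuracy:", highest_acc)
```

Within this excerpt, the lowest validation loss falls at step 1260 and the highest accuracy at step 1296. To analyze the whole run, the same selection can be applied to all rows of the table.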
Framework versions
- Transformers 4.48.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.2
- Tokenizers 0.21.0