variant-tapt_ulmfit_whole_word-LR_2e-05

This model is a fine-tuned version of microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext on the Mardiyyah/TAPT-PDBE-V1 dataset.

It achieves the following results on the evaluation set:

  • Loss: 1.2395
  • Accuracy: 0.7328
  • Perplexity: 3.453

Baseline metrics on the evaluation set before fine-tuning:

  • Loss: 1.6784
  • Perplexity: 5.3568
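
The perplexity figures follow directly from the losses: for a masked language model, perplexity is the exponential of the mean cross-entropy loss. A quick sanity check:

```python
import math

# Perplexity is exp(mean cross-entropy loss).
for stage, loss in [("before TAPT", 1.6784), ("after TAPT", 1.2395)]:
    print(f"{stage}: loss={loss}, perplexity={math.exp(loss):.4f}")
# Matches the reported values up to rounding of the losses
# (≈ 5.357 before TAPT, ≈ 3.454 after TAPT).
```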

Model description

More information needed

Intended uses & limitations

More information needed
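
While a fuller description is pending, the checkpoint is a masked language model and can be exercised with the standard fill-mask pipeline. A minimal usage sketch (the example sentence is illustrative only):

```python
from transformers import pipeline

# Query the fine-tuned masked LM for the most likely fillers of [MASK].
fill = pipeline(
    "fill-mask",
    model="Mardiyyah/variant-tapt_ulmfit_whole_word-LR_2e-05",
)
for prediction in fill("The BRAF V600E [MASK] is a common driver in melanoma."):
    print(prediction["token_str"], round(prediction["score"], 4))
```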

Training and evaluation data

More information needed

Training procedure

Task-adaptive pretraining (TAPT) with whole-word masking, using ULMFiT-style discriminative fine-tuning (layer-wise learning rates).
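
The exact discriminative schedule is not documented in this card; the sketch below shows one common realization of the idea, under the assumption of geometric layer-wise learning-rate decay (parameters closer to the input train more slowly). The decay factor of 0.95 is hypothetical.

```python
import torch
from transformers import AutoModelForMaskedLM

# ULMFiT-style discriminative fine-tuning sketch (assumed recipe):
# each depth level gets a learning rate scaled down by a decay factor.
model = AutoModelForMaskedLM.from_pretrained(
    "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext"
)
base_lr, decay = 2e-05, 0.95  # decay factor is hypothetical, not documented

def layer_depth(name: str, num_layers: int = 12) -> int:
    """Map a parameter name to a depth: 0 = embeddings, 13 = pooler/MLM head."""
    if "embeddings" in name:
        return 0
    for i in range(num_layers):
        if f"encoder.layer.{i}." in name:
            return i + 1
    return num_layers + 1

groups: dict[float, list[torch.nn.Parameter]] = {}
for name, param in model.named_parameters():
    if param.requires_grad:
        lr = base_lr * decay ** (13 - layer_depth(name))
        groups.setdefault(lr, []).append(param)

optimizer = torch.optim.AdamW(
    [{"params": params, "lr": lr} for lr, params in groups.items()],
    betas=(0.9, 0.999),
    eps=1e-06,
)
```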

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing this setup follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 3407
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-06 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100
  • mixed_precision_training: Native AMP
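
For reference, the listing below reconstructs this configuration with the Hugging Face Trainer. It is a hedged sketch, not the original training script: `tokenized_train` and `tokenized_eval` are hypothetical, already-tokenized splits of Mardiyyah/TAPT-PDBE-V1, and the 15% masking probability is the library default rather than a documented choice.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForWholeWordMask,
    Trainer,
    TrainingArguments,
)

checkpoint = "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Whole-word masking collator; 15% masking rate is the library default (assumed).
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="variant-tapt_ulmfit_whole_word-LR_2e-05",
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size of 32
    num_train_epochs=100,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    adam_epsilon=1e-06,
    seed=3407,
    fp16=True,  # Native AMP
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # hypothetical tokenized split
    eval_dataset=tokenized_eval,    # hypothetical tokenized split
    data_collator=collator,
)
trainer.train()
```

To apply the discriminative learning rates sketched under Training procedure, the custom optimizer can be handed to the Trainer via its `optimizers=(optimizer, None)` argument; the Trainer then builds the linear warmup schedule on top of it.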

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.2718        | 1.0     | 18   | 1.2351          | 0.7357   |
| 1.2522        | 2.0     | 36   | 1.2496          | 0.7369   |
| 1.2452        | 3.0     | 54   | 1.2969          | 0.7259   |
| 1.2418        | 4.0     | 72   | 1.2671          | 0.7302   |
| 1.251         | 5.0     | 90   | 1.2658          | 0.7328   |
| 1.2493        | 6.0     | 108  | 1.2657          | 0.7333   |
| 1.238         | 7.0     | 126  | 1.2490          | 0.7355   |
| 1.218         | 8.0     | 144  | 1.2109          | 0.7406   |
| 1.2402        | 9.0     | 162  | 1.2051          | 0.7394   |
| 1.2119        | 10.0    | 180  | 1.2675          | 0.7330   |
| 1.2152        | 11.0    | 198  | 1.2132          | 0.7381   |
| 1.221         | 12.0    | 216  | 1.2514          | 0.7309   |
| 1.2267        | 13.0    | 234  | 1.2189          | 0.7352   |
| 1.2041        | 14.0    | 252  | 1.2578          | 0.7334   |
| 1.1939        | 15.0    | 270  | 1.2238          | 0.7412   |
| 1.2182        | 16.0    | 288  | 1.2515          | 0.7309   |
| 1.2186        | 17.0    | 306  | 1.2062          | 0.7377   |
| 1.1998        | 18.0    | 324  | 1.2478          | 0.7386   |
| 1.2069        | 19.0    | 342  | 1.1967          | 0.7385   |
| 1.2019        | 20.0    | 360  | 1.2500          | 0.7393   |
| 1.1849        | 21.0    | 378  | 1.2224          | 0.7435   |
| 1.184         | 22.0    | 396  | 1.2146          | 0.7430   |
| 1.1721        | 23.0    | 414  | 1.2447          | 0.7336   |
| 1.1637        | 24.0    | 432  | 1.2405          | 0.7346   |
| 1.1664        | 25.0    | 450  | 1.2284          | 0.7343   |
| 1.1636        | 26.0    | 468  | 1.1928          | 0.7435   |
| 1.1551        | 27.0    | 486  | 1.2481          | 0.7339   |
| 1.1609        | 28.0    | 504  | 1.2274          | 0.7411   |
| 1.1553        | 29.0    | 522  | 1.2487          | 0.7332   |
| 1.1743        | 30.0    | 540  | 1.2550          | 0.7295   |
| 1.1497        | 31.0    | 558  | 1.2372          | 0.7381   |
| 1.1388        | 32.0    | 576  | 1.2119          | 0.7376   |
| 1.1484        | 33.0    | 594  | 1.2033          | 0.7412   |
| 1.1618        | 34.0    | 612  | 1.2222          | 0.7400   |
| 1.1614        | 35.0    | 630  | 1.2606          | 0.7337   |
| 1.1384        | 36.0    | 648  | 1.2302          | 0.7323   |
| 1.137         | 37.0    | 666  | 1.2220          | 0.7406   |
| 1.174         | 38.0    | 684  | 1.2091          | 0.7375   |
| 1.1224        | 39.0    | 702  | 1.2127          | 0.7427   |
| 1.166         | 40.0    | 720  | 1.2271          | 0.7362   |
| 1.1305        | 41.0    | 738  | 1.2606          | 0.7326   |
| 1.1507        | 42.0    | 756  | 1.2392          | 0.7406   |
| 1.1319        | 43.0    | 774  | 1.2669          | 0.7321   |
| 1.1298        | 44.0    | 792  | 1.2263          | 0.7381   |
| 1.108         | 45.0    | 810  | 1.2481          | 0.7304   |
| 1.1526        | 46.0    | 828  | 1.2125          | 0.7403   |
| 1.1376        | 47.0    | 846  | 1.2486          | 0.7322   |
| 1.1269        | 48.0    | 864  | 1.2093          | 0.7400   |
| 1.1358        | 49.0    | 882  | 1.2360          | 0.7340   |
| 1.1267        | 50.0    | 900  | 1.2204          | 0.7364   |
| 1.1209        | 51.0    | 918  | 1.1856          | 0.7412   |
| 1.1189        | 52.0    | 936  | 1.2348          | 0.7392   |
| 1.1003        | 53.0    | 954  | 1.2336          | 0.7392   |
| 1.1135        | 54.0    | 972  | 1.2546          | 0.7305   |
| 1.1371        | 55.0    | 990  | 1.2513          | 0.7370   |
| 1.1296        | 56.0    | 1008 | 1.2336          | 0.7362   |
| 1.1069        | 57.0    | 1026 | 1.2388          | 0.7335   |
| 1.1209        | 58.0    | 1044 | 1.2260          | 0.7400   |
| 1.1063        | 59.0    | 1062 | 1.2323          | 0.7326   |
| 1.0932        | 60.0    | 1080 | 1.2461          | 0.7359   |
| 1.1181        | 61.0    | 1098 | 1.2514          | 0.7362   |
| 1.1064        | 62.0    | 1116 | 1.2686          | 0.7327   |
| 1.114         | 63.0    | 1134 | 1.2336          | 0.7368   |
| 1.0949        | 64.0    | 1152 | 1.2549          | 0.7335   |
| 1.1126        | 65.0    | 1170 | 1.2574          | 0.7310   |
| 1.1272        | 66.0    | 1188 | 1.2064          | 0.7383   |
| 1.1063        | 67.0    | 1206 | 1.2451          | 0.7353   |
| 1.117         | 68.0    | 1224 | 1.2730          | 0.7311   |
| 1.1044        | 69.0    | 1242 | 1.2430          | 0.7367   |
| 1.0865        | 70.0    | 1260 | 1.1804          | 0.7399   |
| 1.1063        | 71.0    | 1278 | 1.2278          | 0.7370   |
| 1.1077        | 72.0    | 1296 | 1.1938          | 0.7443   |
| 1.0968        | 73.0    | 1314 | 1.2550          | 0.7342   |
| 1.1064        | 74.0    | 1332 | 1.2218          | 0.7397   |
| 1.1149        | 75.0    | 1350 | 1.2417          | 0.7349   |
| 1.1029        | 76.0    | 1368 | 1.2269          | 0.7407   |
| 1.0861        | 77.0    | 1386 | 1.2293          | 0.7389   |
| 1.1095        | 78.0    | 1404 | 1.2304          | 0.7389   |
| 1.0984        | 79.0    | 1422 | 1.2085          | 0.7405   |
| 1.0961        | 80.0    | 1440 | 1.2367          | 0.7350   |
| 1.0942        | 81.0    | 1458 | 1.2208          | 0.7366   |
| 1.1           | 82.0    | 1476 | 1.2347          | 0.7342   |
| 1.0982        | 83.0    | 1494 | 1.2202          | 0.7345   |
| 1.0888        | 84.0    | 1512 | 1.2286          | 0.7421   |
| 1.0862        | 85.0    | 1530 | 1.2026          | 0.7425   |
| 1.0719        | 86.0    | 1548 | 1.2488          | 0.7322   |
| 1.0829        | 87.0    | 1566 | 1.2318          | 0.7378   |
| 1.1046        | 88.0    | 1584 | 1.2533          | 0.7365   |
| 1.0913        | 89.0    | 1602 | 1.2501          | 0.7368   |
| 1.0901        | 90.0    | 1620 | 1.2471          | 0.7324   |
| 1.09          | 91.0    | 1638 | 1.2298          | 0.7403   |
| 1.1019        | 92.0    | 1656 | 1.2418          | 0.7307   |
| 1.0809        | 93.0    | 1674 | 1.2007          | 0.7426   |
| 1.0874        | 94.0    | 1692 | 1.2618          | 0.7287   |
| 1.0926        | 94.4571 | 1700 | 1.2288          | 0.7394   |

Framework versions

  • Transformers 4.48.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.21.0