variant-tapt_ulmfit_whole_word-LR_2e-05

This model is a fine-tuned version of microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext on the Mardiyyah/TAPT-PDBE-V1 dataset.

It achieves the following results on the evaluation set:

  • Loss: 1.2395
  • Accuracy: 0.7328
  • Perplexity: 3.453

Baseline metrics on the evaluation set before fine-tuning:

  • Loss: 1.6784
  • Perplexity: 5.3568
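
The perplexity figures follow directly from the losses: for a masked language model, perplexity is the exponential of the mean cross-entropy loss. A quick sanity check:

```python
import math

# Perplexity is exp(mean cross-entropy loss).
for stage, loss in [("before TAPT", 1.6784), ("after TAPT", 1.2395)]:
    print(f"{stage}: loss={loss}, perplexity={math.exp(loss):.4f}")
# Matches the reported values up to rounding of the losses
# (≈ 5.357 before TAPT, ≈ 3.454 after TAPT).
```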

Model description

More information needed

Intended uses & limitations

More information needed
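
While a fuller description is pending, the checkpoint is a masked language model and can be exercised with the standard fill-mask pipeline. A minimal usage sketch (the example sentence is illustrative only):

```python
from transformers import pipeline

# Query the fine-tuned masked LM for the most likely fillers of [MASK].
fill = pipeline(
    "fill-mask",
    model="Mardiyyah/variant-tapt_ulmfit_whole_word-LR_2e-05",
)
for prediction in fill("The BRAF V600E [MASK] is a common driver in melanoma."):
    print(prediction["token_str"], round(prediction["score"], 4))
```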

Training and evaluation data

More information needed

Training procedure

Task-adaptive pretraining (TAPT) with whole-word masking, using ULMFiT-style discriminative fine-tuning (layer-wise learning rates).
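
The exact discriminative schedule is not documented in this card; the sketch below shows one common realization of the idea, under the assumption of geometric layer-wise learning-rate decay (parameters closer to the input train more slowly). The decay factor of 0.95 is hypothetical.

```python
import torch
from transformers import AutoModelForMaskedLM

# ULMFiT-style discriminative fine-tuning sketch (assumed recipe):
# each depth level gets a learning rate scaled down by a decay factor.
model = AutoModelForMaskedLM.from_pretrained(
    "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext"
)
base_lr, decay = 2e-05, 0.95  # decay factor is hypothetical, not documented

def layer_depth(name: str, num_layers: int = 12) -> int:
    """Map a parameter name to a depth: 0 = embeddings, 13 = pooler/MLM head."""
    if "embeddings" in name:
        return 0
    for i in range(num_layers):
        if f"encoder.layer.{i}." in name:
            return i + 1
    return num_layers + 1

groups: dict[float, list[torch.nn.Parameter]] = {}
for name, param in model.named_parameters():
    if param.requires_grad:
        lr = base_lr * decay ** (13 - layer_depth(name))
        groups.setdefault(lr, []).append(param)

optimizer = torch.optim.AdamW(
    [{"params": params, "lr": lr} for lr, params in groups.items()],
    betas=(0.9, 0.999),
    eps=1e-06,
)
```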

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing this setup follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 3407
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-06 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100
  • mixed_precision_training: Native AMP
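
For reference, the listing below reconstructs this configuration with the Hugging Face Trainer. It is a hedged sketch, not the original training script: `tokenized_train` and `tokenized_eval` are hypothetical, already-tokenized splits of Mardiyyah/TAPT-PDBE-V1, and the 15% masking probability is the library default rather than a documented choice.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForWholeWordMask,
    Trainer,
    TrainingArguments,
)

checkpoint = "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Whole-word masking collator; 15% masking rate is the library default (assumed).
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="variant-tapt_ulmfit_whole_word-LR_2e-05",
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size of 32
    num_train_epochs=100,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    adam_epsilon=1e-06,
    seed=3407,
    fp16=True,  # Native AMP
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # hypothetical tokenized split
    eval_dataset=tokenized_eval,    # hypothetical tokenized split
    data_collator=collator,
)
trainer.train()
```

To apply the discriminative learning rates sketched under Training procedure, the custom optimizer can be handed to the Trainer via its `optimizers=(optimizer, None)` argument; the Trainer then builds the linear warmup schedule on top of it.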

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.2718        | 1.0     | 18   | 1.2351          | 0.7357   |
| 1.2522        | 2.0     | 36   | 1.2496          | 0.7369   |
| 1.2452        | 3.0     | 54   | 1.2969          | 0.7259   |
| 1.2418        | 4.0     | 72   | 1.2671          | 0.7302   |
| 1.251         | 5.0     | 90   | 1.2658          | 0.7328   |
| 1.2493        | 6.0     | 108  | 1.2657          | 0.7333   |
| 1.238         | 7.0     | 126  | 1.2490          | 0.7355   |
| 1.218         | 8.0     | 144  | 1.2109          | 0.7406   |
| 1.2402        | 9.0     | 162  | 1.2051          | 0.7394   |
| 1.2119        | 10.0    | 180  | 1.2675          | 0.7330   |
| 1.2152        | 11.0    | 198  | 1.2132          | 0.7381   |
| 1.221         | 12.0    | 216  | 1.2514          | 0.7309   |
| 1.2267        | 13.0    | 234  | 1.2189          | 0.7352   |
| 1.2041        | 14.0    | 252  | 1.2578          | 0.7334   |
| 1.1939        | 15.0    | 270  | 1.2238          | 0.7412   |
| 1.2182        | 16.0    | 288  | 1.2515          | 0.7309   |
| 1.2186        | 17.0    | 306  | 1.2062          | 0.7377   |
| 1.1998        | 18.0    | 324  | 1.2478          | 0.7386   |
| 1.2069        | 19.0    | 342  | 1.1967          | 0.7385   |
| 1.2019        | 20.0    | 360  | 1.2500          | 0.7393   |
| 1.1849        | 21.0    | 378  | 1.2224          | 0.7435   |
| 1.184         | 22.0    | 396  | 1.2146          | 0.7430   |
| 1.1721        | 23.0    | 414  | 1.2447          | 0.7336   |
| 1.1637        | 24.0    | 432  | 1.2405          | 0.7346   |
| 1.1664        | 25.0    | 450  | 1.2284          | 0.7343   |
| 1.1636        | 26.0    | 468  | 1.1928          | 0.7435   |
| 1.1551        | 27.0    | 486  | 1.2481          | 0.7339   |
| 1.1609        | 28.0    | 504  | 1.2274          | 0.7411   |
| 1.1553        | 29.0    | 522  | 1.2487          | 0.7332   |
| 1.1743        | 30.0    | 540  | 1.2550          | 0.7295   |
| 1.1497        | 31.0    | 558  | 1.2372          | 0.7381   |
| 1.1388        | 32.0    | 576  | 1.2119          | 0.7376   |
| 1.1484        | 33.0    | 594  | 1.2033          | 0.7412   |
| 1.1618        | 34.0    | 612  | 1.2222          | 0.7400   |
| 1.1614        | 35.0    | 630  | 1.2606          | 0.7337   |
| 1.1384        | 36.0    | 648  | 1.2302          | 0.7323   |
| 1.137         | 37.0    | 666  | 1.2220          | 0.7406   |
| 1.174         | 38.0    | 684  | 1.2091          | 0.7375   |
| 1.1224        | 39.0    | 702  | 1.2127          | 0.7427   |
| 1.166         | 40.0    | 720  | 1.2271          | 0.7362   |
| 1.1305        | 41.0    | 738  | 1.2606          | 0.7326   |
| 1.1507        | 42.0    | 756  | 1.2392          | 0.7406   |
| 1.1319        | 43.0    | 774  | 1.2669          | 0.7321   |
| 1.1298        | 44.0    | 792  | 1.2263          | 0.7381   |
| 1.108         | 45.0    | 810  | 1.2481          | 0.7304   |
| 1.1526        | 46.0    | 828  | 1.2125          | 0.7403   |
| 1.1376        | 47.0    | 846  | 1.2486          | 0.7322   |
| 1.1269        | 48.0    | 864  | 1.2093          | 0.7400   |
| 1.1358        | 49.0    | 882  | 1.2360          | 0.7340   |
| 1.1267        | 50.0    | 900  | 1.2204          | 0.7364   |
| 1.1209        | 51.0    | 918  | 1.1856          | 0.7412   |
| 1.1189        | 52.0    | 936  | 1.2348          | 0.7392   |
| 1.1003        | 53.0    | 954  | 1.2336          | 0.7392   |
| 1.1135        | 54.0    | 972  | 1.2546          | 0.7305   |
| 1.1371        | 55.0    | 990  | 1.2513          | 0.7370   |
| 1.1296        | 56.0    | 1008 | 1.2336          | 0.7362   |
| 1.1069        | 57.0    | 1026 | 1.2388          | 0.7335   |
| 1.1209        | 58.0    | 1044 | 1.2260          | 0.7400   |
| 1.1063        | 59.0    | 1062 | 1.2323          | 0.7326   |
| 1.0932        | 60.0    | 1080 | 1.2461          | 0.7359   |
| 1.1181        | 61.0    | 1098 | 1.2514          | 0.7362   |
| 1.1064        | 62.0    | 1116 | 1.2686          | 0.7327   |
| 1.114         | 63.0    | 1134 | 1.2336          | 0.7368   |
| 1.0949        | 64.0    | 1152 | 1.2549          | 0.7335   |
| 1.1126        | 65.0    | 1170 | 1.2574          | 0.7310   |
| 1.1272        | 66.0    | 1188 | 1.2064          | 0.7383   |
| 1.1063        | 67.0    | 1206 | 1.2451          | 0.7353   |
| 1.117         | 68.0    | 1224 | 1.2730          | 0.7311   |
| 1.1044        | 69.0    | 1242 | 1.2430          | 0.7367   |
| 1.0865        | 70.0    | 1260 | 1.1804          | 0.7399   |
| 1.1063        | 71.0    | 1278 | 1.2278          | 0.7370   |
| 1.1077        | 72.0    | 1296 | 1.1938          | 0.7443   |
| 1.0968        | 73.0    | 1314 | 1.2550          | 0.7342   |
| 1.1064        | 74.0    | 1332 | 1.2218          | 0.7397   |
| 1.1149        | 75.0    | 1350 | 1.2417          | 0.7349   |
| 1.1029        | 76.0    | 1368 | 1.2269          | 0.7407   |
| 1.0861        | 77.0    | 1386 | 1.2293          | 0.7389   |
| 1.1095        | 78.0    | 1404 | 1.2304          | 0.7389   |
| 1.0984        | 79.0    | 1422 | 1.2085          | 0.7405   |
| 1.0961        | 80.0    | 1440 | 1.2367          | 0.7350   |
| 1.0942        | 81.0    | 1458 | 1.2208          | 0.7366   |
| 1.1           | 82.0    | 1476 | 1.2347          | 0.7342   |
| 1.0982        | 83.0    | 1494 | 1.2202          | 0.7345   |
| 1.0888        | 84.0    | 1512 | 1.2286          | 0.7421   |
| 1.0862        | 85.0    | 1530 | 1.2026          | 0.7425   |
| 1.0719        | 86.0    | 1548 | 1.2488          | 0.7322   |
| 1.0829        | 87.0    | 1566 | 1.2318          | 0.7378   |
| 1.1046        | 88.0    | 1584 | 1.2533          | 0.7365   |
| 1.0913        | 89.0    | 1602 | 1.2501          | 0.7368   |
| 1.0901        | 90.0    | 1620 | 1.2471          | 0.7324   |
| 1.09          | 91.0    | 1638 | 1.2298          | 0.7403   |
| 1.1019        | 92.0    | 1656 | 1.2418          | 0.7307   |
| 1.0809        | 93.0    | 1674 | 1.2007          | 0.7426   |
| 1.0874        | 94.0    | 1692 | 1.2618          | 0.7287   |
| 1.0926        | 94.4571 | 1700 | 1.2288          | 0.7394   |

Framework versions

  • Transformers 4.48.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.21.0