Cursed Toxic Pretraining Corpora
mavinsao/reddit-mental-illness-82
• Updated • 52.6k • 39
• 4
Viewer
• Updated • 2.17k • 41
• 3
RentonWEB3/reddit_dataset_193
• Updated • 110k • 9
• 1
Updated • 521
• 9
hugginglearners/reddit-depression-cleaned
• Updated • 7.73k • 138
• 1
chloeliu/reddit_nosleep_posts
• Updated • 610 • 6
• 1
Viewer
• Updated • 4.41k • 90
• 40
gmongaras/reddit_negative
• Updated • 4.86k • 7
Viewer
• Updated • 213k • 57
• 2
yoonholee/reddit_TwoSentencePlotTwist_1575
• Updated • 1.58k • 11
ve-nk-at/reddit_comment_violation_data_set
• Updated • 2.03k • 12
Viewer
• Updated • 11.9k • 38
• 3
DuckyBlender/racist-dataset
• Updated • 1.31k • 17
• 6
taylorgordon/antisemitism_weak_labeling
• Updated • 3.15k • 12
JoshMcGiff/HomophobiaDetectionTwitterX
• Updated • 1.28k • 25
• 3
SetFit/hate_speech_offensive
• Updated • 24.8k • 268
• 2
badmatr11x/hate-offensive-speech
• Updated • 56.7k • 124
• 5
Intuit-GenSRF/hate-speech-offensive
• Updated • 24.8k • 13
ctoraman/gender-hate-speech
• Updated • 20k • 65
• 3
fuzzy-g/4chan_pol_whole_ds
• Updated • 4.09M • 4
• 1
Skorcht/inceldatabaseTHISWASASUGGESTION
• Updated • 2.49k • 9
• 1
Viewer
• Updated • 144k • 5
• 5
HuggingFaceTB/SmolVLM-Instruct
Image-Text-to-Text
• 2B • Updated • 27.9k
• 585
MultiverseComputingCAI/LittleLamb
Text Generation
• Updated • 382
• 3