Xiang Zhang
fancyzhx
AI & ML interests
None yet
Organizations
None yet
Video Datasets
Text Datasets
- TxT360: Trillion Extracted Text • Running • 133
  Explore and download the TxT360 LLM pre‑training dataset
- CASIA-LM/ChineseWebText2.0
  2k • 3.3k • 28
- HPLT/HPLT2.0_cleaned
  9.03B • 25.4k • 36
- TrevorDohm/Pile_Tokenized
  134M • 29
Audio Datasets
Robotic Datasets
Image Datasets