Human-Centered AI Lab
HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0040
11.1 GB
56,043 files
Updated about 1 month ago
Name | Size | Uploaded | Xet hash
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000125.jsonl.zst | 561 kB | about 1 month ago | 7c3a927a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000126.jsonl.zst | 435 kB | about 1 month ago | 64af5f08
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000127.jsonl.zst | 512 kB | about 1 month ago | 0618eb34
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000128.jsonl.zst | 489 kB | about 1 month ago | 0d973ea8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000129.jsonl.zst | 514 kB | about 1 month ago | 4233230e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000130.jsonl.zst | 612 kB | about 1 month ago | 46a05b01
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000131.jsonl.zst | 381 kB | about 1 month ago | 1af133e3
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000132.jsonl.zst | 529 kB | about 1 month ago | eeea6645
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000133.jsonl.zst | 472 kB | about 1 month ago | e1f51035
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000134.jsonl.zst | 729 kB | about 1 month ago | 05578114
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000135.jsonl.zst | 454 kB | about 1 month ago | cc1eafb0
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000136.jsonl.zst | 550 kB | about 1 month ago | b35b5b9f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000137.jsonl.zst | 368 kB | about 1 month ago | 14db70ba
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000138.jsonl.zst | 481 kB | about 1 month ago | 03280ac3
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000139.jsonl.zst | 365 kB | about 1 month ago | ca72b541
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000140.jsonl.zst | 515 kB | about 1 month ago | 999e92ad
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000141.jsonl.zst | 446 kB | about 1 month ago | d64cf95d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000142.jsonl.zst | 427 kB | about 1 month ago | 933e5a17
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000143.jsonl.zst | 499 kB | about 1 month ago | 06744c9f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000144.jsonl.zst | 465 kB | about 1 month ago | 243ea17d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000145.jsonl.zst | 394 kB | about 1 month ago | 71733dbc
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000146.jsonl.zst | 485 kB | about 1 month ago | fbda99c8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000147.jsonl.zst | 468 kB | about 1 month ago | 8d772d31
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000148.jsonl.zst | 429 kB | about 1 month ago | 0617e1b7
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000149.jsonl.zst | 485 kB | about 1 month ago | 45ca8627
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000150.jsonl.zst | 431 kB | about 1 month ago | 8d9ff97b
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000151.jsonl.zst | 433 kB | about 1 month ago | d8007dd6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000152.jsonl.zst | 459 kB | about 1 month ago | 6d8dc55a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000153.jsonl.zst | 416 kB | about 1 month ago | a6581294
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000154.jsonl.zst | 579 kB | about 1 month ago | ec4d82cd
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000155.jsonl.zst | 462 kB | about 1 month ago | b1869ba7
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000156.jsonl.zst | 569 kB | about 1 month ago | 7bc3c837
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000157.jsonl.zst | 293 kB | about 1 month ago | 4a336ef4
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000158.jsonl.zst | 227 kB | about 1 month ago | ef944703
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000159.jsonl.zst | 532 kB | about 1 month ago | af3d90c6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000160.jsonl.zst | 494 kB | about 1 month ago | 21f39536
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000161.jsonl.zst | 447 kB | about 1 month ago | 768cacf1
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000162.jsonl.zst | 502 kB | about 1 month ago | 11379e96
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000163.jsonl.zst | 537 kB | about 1 month ago | b597a469
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000164.jsonl.zst | 488 kB | about 1 month ago | e8f4c445
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000165.jsonl.zst | 424 kB | about 1 month ago | 7c6fd1fe
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000166.jsonl.zst | 624 kB | about 1 month ago | 1e7e4bf4
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000167.jsonl.zst | 511 kB | about 1 month ago | f9eb9f45
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000168.jsonl.zst | 356 kB | about 1 month ago | 614dcf1e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000169.jsonl.zst | 505 kB | about 1 month ago | e86418f3
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000170.jsonl.zst | 465 kB | about 1 month ago | 5029ec86
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000171.jsonl.zst | 278 kB | about 1 month ago | 1a356c12
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000172.jsonl.zst | 441 kB | about 1 month ago | 90dbe39f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000173.jsonl.zst | 348 kB | about 1 month ago | 3f560b70
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000174.jsonl.zst | 385 kB | about 1 month ago | cbf11240
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000175.jsonl.zst | 597 kB | about 1 month ago | 3160858d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000176.jsonl.zst | 437 kB | about 1 month ago | 6df80305
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000177.jsonl.zst | 237 kB | about 1 month ago | 61727f3a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000178.jsonl.zst | 335 kB | about 1 month ago | 9bcabbba
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000179.jsonl.zst | 639 kB | about 1 month ago | d84699fb
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000180.jsonl.zst | 465 kB | about 1 month ago | d4ec5fc3
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000181.jsonl.zst | 518 kB | about 1 month ago | 82e88f55
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000182.jsonl.zst | 418 kB | about 1 month ago | 0e423a9e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000183.jsonl.zst | 448 kB | about 1 month ago | 070886f5
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000184.jsonl.zst | 484 kB | about 1 month ago | 38687b8e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000185.jsonl.zst | 403 kB | about 1 month ago | 9bfd24ed
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000186.jsonl.zst | 493 kB | about 1 month ago | 24a5125c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000187.jsonl.zst | 452 kB | about 1 month ago | f3d4405c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000188.jsonl.zst | 463 kB | about 1 month ago | 349838ce
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000189.jsonl.zst | 406 kB | about 1 month ago | 040eac33
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000190.jsonl.zst | 528 kB | about 1 month ago | 0eb1e94c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000191.jsonl.zst | 423 kB | about 1 month ago | 895cca09
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000192.jsonl.zst | 545 kB | about 1 month ago | e4a8cf76
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000193.jsonl.zst | 465 kB | about 1 month ago | 7a3bdf60
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000194.jsonl.zst | 420 kB | about 1 month ago | 3b0b323d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000195.jsonl.zst | 450 kB | about 1 month ago | 8cdfc4f6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000196.jsonl.zst | 640 kB | about 1 month ago | 9232bd9e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000197.jsonl.zst | 416 kB | about 1 month ago | d1366152
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000198.jsonl.zst | 479 kB | about 1 month ago | 1d047827
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000199.jsonl.zst | 651 kB | about 1 month ago | 17aa0f3f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000200.jsonl.zst | 476 kB | about 1 month ago | cfb0d597
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000201.jsonl.zst | 529 kB | about 1 month ago | a02bf51c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000202.jsonl.zst | 352 kB | about 1 month ago | 30b4df53
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000203.jsonl.zst | 478 kB | about 1 month ago | b4a2b961
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000204.jsonl.zst | 577 kB | about 1 month ago | e03102c1
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000205.jsonl.zst | 433 kB | about 1 month ago | 565f3852
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000206.jsonl.zst | 444 kB | about 1 month ago | b58c2556
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000207.jsonl.zst | 467 kB | about 1 month ago | 38f8456f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000208.jsonl.zst | 536 kB | about 1 month ago | 11b399cf
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000209.jsonl.zst | 541 kB | about 1 month ago | 7a391bb5
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000210.jsonl.zst | 456 kB | about 1 month ago | f86a93be
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000211.jsonl.zst | 416 kB | about 1 month ago | c7735dd0
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000212.jsonl.zst | 533 kB | about 1 month ago | 3d367d37
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000213.jsonl.zst | 577 kB | about 1 month ago | 44f51292
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000214.jsonl.zst | 491 kB | about 1 month ago | 03ae2927
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000215.jsonl.zst | 487 kB | about 1 month ago | 68f4eee2
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000216.jsonl.zst | 441 kB | about 1 month ago | 7334367a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000217.jsonl.zst | 538 kB | about 1 month ago | 208e22e5
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000218.jsonl.zst | 598 kB | about 1 month ago | 0305f959
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000219.jsonl.zst | 502 kB | about 1 month ago | e971f4c0
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000220.jsonl.zst | 385 kB | about 1 month ago | 11422cfe
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000221.jsonl.zst | 548 kB | about 1 month ago | e271c5de
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000222.jsonl.zst | 546 kB | about 1 month ago | 72eb74af
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000223.jsonl.zst | 371 kB | about 1 month ago | 6b5241fe
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000224.jsonl.zst | 557 kB | about 1 month ago | 13de6511
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU