Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs (Human-Centered AI Lab)
Folder: worker_0068
11.1 GB
56,043 files
Updated about 1 month ago
Name | Size | Uploaded | Xet hash
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000676.jsonl.zst | 470 kB | about 1 month ago | f3ffac56
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000680.jsonl.zst | 462 kB | about 1 month ago | a1f057f8
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000682.jsonl.zst | 506 kB | about 1 month ago | 8162b1e9
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000696.jsonl.zst | 635 kB | about 1 month ago | fa60c16c
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000697.jsonl.zst | 364 kB | about 1 month ago | 30fc619d
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000705.jsonl.zst | 443 kB | about 1 month ago | bf548b15
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0018__shard_00000710.jsonl.zst | 262 kB | about 1 month ago | 563a4b41
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000002.jsonl.zst | 523 kB | about 1 month ago | 661815a6
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000003.jsonl.zst | 587 kB | about 1 month ago | ca606677
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000007.jsonl.zst | 467 kB | about 1 month ago | 6e780a19
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000008.jsonl.zst | 460 kB | about 1 month ago | c7df84e1
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000009.jsonl.zst | 439 kB | about 1 month ago | fdb388bc
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000010.jsonl.zst | 421 kB | about 1 month ago | 3a7e34c7
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000011.jsonl.zst | 476 kB | about 1 month ago | 3f3e404c
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000015.jsonl.zst | 512 kB | about 1 month ago | 71e21a24
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000016.jsonl.zst | 403 kB | about 1 month ago | f968897d
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000017.jsonl.zst | 473 kB | about 1 month ago | 1fc780fe
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000018.jsonl.zst | 662 kB | about 1 month ago | ff058f40
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000019.jsonl.zst | 450 kB | about 1 month ago | 49951433
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000021.jsonl.zst | 458 kB | about 1 month ago | 8e968b47
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000023.jsonl.zst | 511 kB | about 1 month ago | 8ce51249
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000027.jsonl.zst | 501 kB | about 1 month ago | bc9a11e6
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000028.jsonl.zst | 600 kB | about 1 month ago | c92ba5aa
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000029.jsonl.zst | 495 kB | about 1 month ago | 2db8d267
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000030.jsonl.zst | 549 kB | about 1 month ago | a4a0d7c2
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000031.jsonl.zst | 507 kB | about 1 month ago | d5d45b73
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000032.jsonl.zst | 640 kB | about 1 month ago | 10f2f4b2
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000037.jsonl.zst | 505 kB | about 1 month ago | b16cc76e
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000038.jsonl.zst | 500 kB | about 1 month ago | 844b59ed
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000039.jsonl.zst | 483 kB | about 1 month ago | 6aa19e10
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000041.jsonl.zst | 424 kB | about 1 month ago | 6556ba4f
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000042.jsonl.zst | 441 kB | about 1 month ago | 19d6bdcf
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000043.jsonl.zst | 437 kB | about 1 month ago | 9eb7aba3
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000044.jsonl.zst | 450 kB | about 1 month ago | f227512a
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000045.jsonl.zst | 427 kB | about 1 month ago | e51113be
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000047.jsonl.zst | 408 kB | about 1 month ago | 74724e66
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000048.jsonl.zst | 498 kB | about 1 month ago | 1f5f9666
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000049.jsonl.zst | 402 kB | about 1 month ago | ebe51a3d
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000050.jsonl.zst | 518 kB | about 1 month ago | 905d15e3
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000051.jsonl.zst | 546 kB | about 1 month ago | d87f89e6
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000053.jsonl.zst | 469 kB | about 1 month ago | 31a4ad78
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000054.jsonl.zst | 395 kB | about 1 month ago | 6206752d
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000055.jsonl.zst | 407 kB | about 1 month ago | 0a8f8453
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000056.jsonl.zst | 601 kB | about 1 month ago | 0529d7af
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000058.jsonl.zst | 451 kB | about 1 month ago | 6525f083
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000060.jsonl.zst | 576 kB | about 1 month ago | 154fb6a6
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000062.jsonl.zst | 433 kB | about 1 month ago | aa99c8f1
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000064.jsonl.zst | 581 kB | about 1 month ago | 543fd1f7
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000065.jsonl.zst | 530 kB | about 1 month ago | 296396df
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000067.jsonl.zst | 460 kB | about 1 month ago | 2e0bc185
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000068.jsonl.zst | 486 kB | about 1 month ago | 11420b09
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000069.jsonl.zst | 414 kB | about 1 month ago | d8b7add4
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000070.jsonl.zst | 432 kB | about 1 month ago | f793bfd8
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000072.jsonl.zst | 420 kB | about 1 month ago | ae6cedc5
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000075.jsonl.zst | 438 kB | about 1 month ago | 5bf9ff34
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000078.jsonl.zst | 466 kB | about 1 month ago | 17891780
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000079.jsonl.zst | 515 kB | about 1 month ago | 9bb7fd42
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000080.jsonl.zst | 565 kB | about 1 month ago | df88fb3c
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000085.jsonl.zst | 432 kB | about 1 month ago | 3b0ecbdf
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000086.jsonl.zst | 229 kB | about 1 month ago | 565acaa3
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000088.jsonl.zst | 449 kB | about 1 month ago | cd27bfab
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000089.jsonl.zst | 485 kB | about 1 month ago | 76530910
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000091.jsonl.zst | 556 kB | about 1 month ago | cddf1530
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000093.jsonl.zst | 391 kB | about 1 month ago | 910f08ae
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000095.jsonl.zst | 601 kB | about 1 month ago | 41d40306
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000097.jsonl.zst | 572 kB | about 1 month ago | 347e0043
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000098.jsonl.zst | 489 kB | about 1 month ago | 62ede7dc
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000099.jsonl.zst | 426 kB | about 1 month ago | 74bfc7ff
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000100.jsonl.zst | 443 kB | about 1 month ago | 24a9959c
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000102.jsonl.zst | 498 kB | about 1 month ago | 1139c57f
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000105.jsonl.zst | 426 kB | about 1 month ago | edd7b025
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000106.jsonl.zst | 508 kB | about 1 month ago | c6375740
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000107.jsonl.zst | 421 kB | about 1 month ago | fd95a234
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000109.jsonl.zst | 407 kB | about 1 month ago | e7bd43c5
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000113.jsonl.zst | 455 kB | about 1 month ago | 1a657255
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000114.jsonl.zst | 464 kB | about 1 month ago | 144cbe67
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000117.jsonl.zst | 481 kB | about 1 month ago | 1e7c2c8b
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000120.jsonl.zst | 490 kB | about 1 month ago | 6a3e41fe
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000122.jsonl.zst | 578 kB | about 1 month ago | 4af20a56
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000123.jsonl.zst | 474 kB | about 1 month ago | d6d67d6b
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000125.jsonl.zst | 479 kB | about 1 month ago | 96b1c2a5
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000126.jsonl.zst | 442 kB | about 1 month ago | 5b035b8c
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000128.jsonl.zst | 506 kB | about 1 month ago | 0ea6d435
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000129.jsonl.zst | 560 kB | about 1 month ago | 62c2dc6b
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000130.jsonl.zst | 465 kB | about 1 month ago | b37a5f50
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000131.jsonl.zst | 543 kB | about 1 month ago | 139eff50
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000132.jsonl.zst | 434 kB | about 1 month ago | 1714c5bf
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000133.jsonl.zst | 547 kB | about 1 month ago | 6c6fe451
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000135.jsonl.zst | 427 kB | about 1 month ago | 2d35f1c9
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000136.jsonl.zst | 488 kB | about 1 month ago | 76e22464
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000137.jsonl.zst | 630 kB | about 1 month ago | ace5916b
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000138.jsonl.zst | 533 kB | about 1 month ago | 50552cc0
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000140.jsonl.zst | 581 kB | about 1 month ago | 4ab9c689
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000142.jsonl.zst | 519 kB | about 1 month ago | 8d06a102
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000143.jsonl.zst | 409 kB | about 1 month ago | 2b00c421
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000148.jsonl.zst | 582 kB | about 1 month ago | e6a73a42
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000150.jsonl.zst | 463 kB | about 1 month ago | 22aaf30e
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000151.jsonl.zst | 521 kB | about 1 month ago | d579b56a
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000152.jsonl.zst | 496 kB | about 1 month ago | cfec17b0
soc127__phase1_pool_shared__common_crawl__part_005__data__common_crawl-social_life-0019__shard_00000154.jsonl.zst | 527 kB | about 1 month ago | cf7d0104
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU