sumitdotml/moe-emergence
Text Generation
Transformers
Safetensors
codeparrot/codeparrot-clean
allenai/ai2_arc
allenai/c4
English
mixture-of-experts
gpt2
research
expert-specialization
License: mit
main
17.3 GB
2 contributors
History: 18 commits
Latest commit: sumit, "updated model card with ablation results and all 4 runs" (4049aa7), about 1 month ago
Name             Size      Last commit message                                               Updated
dense-baseline/            add dense and moe checkpoints                                     about 1 month ago
moe-main/                  add dense and moe checkpoints                                     about 1 month ago
no-lb-ablation/            Upload no-lb-ablation/ckpt-step-500.pt with huggingface_hub       about 1 month ago
top2-main-10k/             Upload top2-main-10k/ckpt-step-9999.pt with huggingface_hub       about 1 month ago
.gitattributes   1.52 kB   add dense and moe checkpoints                                     about 1 month ago
README.md        7.09 kB   updated model card with ablation results and all 4 runs           about 1 month ago