shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 187
shuoxing/llama3-8b-full-pretrain-wash-c4-2-1m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 203
shuoxing/llama3-8b-full-pretrain-wash-c4-1-8m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 218
shuoxing/llama3-8b-full-pretrain-wash-c4-1-5m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 246
shuoxing/llama3-8b-full-pretrain-wash-c4-1-2m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 250
shuoxing/llama3-8b-full-pretrain-wash-c4-0-9m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 271
shuoxing/llama3-8b-full-pretrain-wash-c4-0-6m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 280
shuoxing/llama3-8b-full-pretrain-wash-c4-0-3m-sft-bs64 • Text Generation • 8B • Updated 4 days ago • 283
shuoxing/qwen2-5-7b-full-sft-control-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 26
shuoxing/qwen2-5-7b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 26 • 2
shuoxing/qwen2-5-7b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 25 • 1
shuoxing/qwen2-5-7b-full-sft-mix-low-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 25 • 3
shuoxing/qwen3-4b-full-sft-control-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25 • 2
shuoxing/qwen3-4b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25 • 3
shuoxing/qwen3-4b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25