DeepSeek-V3 architecture with 4 layers, 8 experts per MoE layer, and a Multi-Token Prediction (MTP) module. The FP8 weights are taken from the original model without further tuning.
Intended for use in CI testing.
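
As a rough illustration of how a CI job might confirm it picked up this reduced checkpoint rather than the full model, the sketch below compares a loaded config dictionary against the sizes stated above. The key names (`num_hidden_layers`, `n_routed_experts`, `num_nextn_predict_layers`, `quantization_config.quant_method`) are assumed to match the upstream DeepSeek-V3 `config.json`, and a single MTP layer is also an assumption. Neither is confirmed by this card, so check them against the checkpoint's actual config.

```python
# Expected values taken from the description above.
# Key names are ASSUMED to follow the upstream DeepSeek-V3 config.json.
EXPECTED = {
    "num_hidden_layers": 4,         # "4 layers"
    "n_routed_experts": 8,          # "8 experts per MoE"
    "num_nextn_predict_layers": 1,  # assumption: exactly one MTP module
}
EXPECTED_QUANT_METHOD = "fp8"       # "FP8 weights"


def mismatches(config):
    """Return the config keys whose values differ from the tiny-model values."""
    bad = [key for key, value in EXPECTED.items() if config.get(key) != value]
    quant = config.get("quantization_config") or {}
    if quant.get("quant_method") != EXPECTED_QUANT_METHOD:
        bad.append("quantization_config.quant_method")
    return bad


if __name__ == "__main__":
    # Hypothetical config contents, written inline so the example is self-contained.
    sample = {
        "num_hidden_layers": 4,
        "n_routed_experts": 8,
        "num_nextn_predict_layers": 1,
        "quantization_config": {"quant_method": "fp8"},
    }
    print(mismatches(sample))
```

In a real test, `sample` would instead come from `json.load` on the checkpoint's `config.json`, and a non-empty result would fail the job.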