JoyAI-LLM Flash-Base


1. Model Introduction

JoyAI-LLM Flash-Base is a state-of-the-art mixture-of-experts (MoE) language model with 3 billion activated parameters and 48 billion total parameters. Trained with the Muon optimizer, JoyAI-LLM Flash-Base achieves exceptional performance across frontier knowledge, reasoning, and coding tasks, and is meticulously optimized for agentic capabilities. The JoyAI-LLM Flash series aims to accelerate high-throughput, latency-sensitive applications where cost per query must remain minimal.

Key Features

  • Training-Inference Collaboration: applies the Muon optimizer together with dense MTP (multi-token prediction) and novel optimization techniques that resolve instabilities when scaling up, delivering 1.3× to 1.7× the throughput of the non-MTP version.
  • Agentic Intelligence: Specifically designed for tool use, reasoning, and autonomous problem-solving.
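
The snippet below is a minimal loading and generation sketch. It assumes the released checkpoint works with the standard Hugging Face `transformers` `AutoModelForCausalLM` / `AutoTokenizer` interface with `trust_remote_code=True`; the repository id comes from this card, while the prompt and generation settings are purely illustrative.

```python
# Minimal loading/generation sketch (assumption: the checkpoint is loadable
# through transformers' AutoModelForCausalLM with trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jdopensource/JoyAI-LLM-Flash-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick up the checkpoint's native precision
    device_map="auto",    # shard the 48B-parameter MoE across available GPUs (requires accelerate)
    trust_remote_code=True,
)

prompt = "Mixture-of-experts models reduce inference cost because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```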

2. Model Summary

| Architecture | Mixture-of-Experts (MoE) |
|:---|:---|
| Total Parameters | 48B |
| Activated Parameters | 3B |
| Number of Layers (Dense layer included) | 40 |
| Number of Dense Layers | 1 |
| Attention Hidden Dimension | 2048 |
| MoE Hidden Dimension (per Expert) | 768 |
| Number of Attention Heads | 32 |
| Number of Experts | 256 |
| Selected Experts per Token | 8 |
| Number of Shared Experts | 1 |
| Vocabulary Size | 129K |
| Context Length | 128K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |
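
As a back-of-the-envelope sanity check, the sketch below estimates the total and per-token activated parameter counts from the table above. It assumes each SwiGLU expert consists of three d_model × d_ff matrices, approximates attention as roughly 4·d_model² parameters per layer (the actual model uses MLA, which differs), and ignores the dense layer's FFN and normalization weights, so the results are ballpark figures only.

```python
# Rough parameter estimate from the Model Summary table.
# Assumptions: SwiGLU experts use three d_model x d_ff matrices; attention is
# approximated as ~4 * d_model^2 per layer (the real model uses MLA); the
# dense layer's FFN and norm weights are ignored. Ballpark figures only.
d_model = 2048            # attention hidden dimension
d_ff = 768                # MoE hidden dimension per expert
n_layers = 40             # total layers (1 dense + 39 MoE)
n_moe_layers = n_layers - 1
n_experts = 256           # routed experts per MoE layer
n_active = 8 + 1          # selected experts per token + 1 shared expert
vocab = 129_000           # vocabulary size

expert_params = 3 * d_model * d_ff               # ~4.7M per expert
attn_params = 4 * d_model * d_model * n_layers   # ~0.67B (rough)
embed_params = 2 * vocab * d_model               # input + output embeddings

total = (n_experts + 1) * expert_params * n_moe_layers + attn_params + embed_params
active = n_active * expert_params * n_moe_layers + attn_params + embed_params

print(f"estimated total parameters:    {total / 1e9:.1f}B")   # ~48.5B (table: 48B)
print(f"estimated activated per token: {active / 1e9:.1f}B")  # ~2.9B  (table: 3B)
```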

3. Evaluation Results

| Benchmark | JoyAI-LLM Flash-Base | Qwen3-30B-A3B-Base |
|:---|:---:|:---:|
| MMLU | 84.70 | 82.12 |
| MMLU-Pro | 73.14 | 61.76 |
| CMMLU | 83.09 | 83.60 |
| HumanEval | 85.37 | 87.80 |
| LiveCodeBench | 39.91 | 37.34 |
| GSM8K | 88.78 | 90.37 |
| MATH | 78.16 | 59.60 |
| MATH-500 | 77.00 | 58.00 |

4. License

Both the code repository and the model weights are released under the Modified MIT License.
