nielsr HF Staff committed on
Commit 65976a5 · verified · 1 Parent(s): 64c8f21

Add pipeline tag, library metadata, and improve model card


Hi! I'm Niels, part of the community science team at Hugging Face.

This PR improves the model card for DMax-Math-16B:
- Adds `pipeline_tag: text-generation` to ensure the model is correctly categorized on the Hub.
- Adds `library_name: transformers`, since the model loads through the library (as shown in the sample code); this enables automatic code snippets on the Hub.
- Links the model to its associated paper: [DMax: Aggressive Parallel Decoding for dLLMs](https://huggingface.co/papers/2604.08302).
- Maintains existing license, base model, and dataset metadata.
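
To illustrate what the metadata change amounts to, here is a minimal sketch that reads the YAML front matter of a model card and checks that the keys this PR adds are present. The helper `read_front_matter`, the `REQUIRED_KEYS` tuple, and the inline `README_TEXT` are hypothetical names used only for this example; the parser handles flat keys and simple lists only and is not a full YAML implementation.

```python
# Hypothetical helper: a deliberately small front-matter reader, not a YAML parser.
def read_front_matter(readme_text):
    """Return the flat keys and simple lists found between the leading '---' lines."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(metadata.get(current_key), list):
            # List item that belongs to the most recently seen key.
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value starts a list.
            metadata[current_key] = value if value else []
    return metadata


# Front matter as it appears on the new side of the README.md diff.
README_TEXT = """---
base_model:
- inclusionAI/LLaDA2.0-mini
datasets:
- Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# Model card body
"""

# Assumed set of keys to check; chosen for this example only.
REQUIRED_KEYS = ("pipeline_tag", "library_name", "license")

metadata = read_front_matter(README_TEXT)
missing = [key for key in REQUIRED_KEYS if key not in metadata]
print("missing keys:", missing)
```

Running the same check against the old front matter would report `pipeline_tag` and `library_name` as missing, which is the gap this PR closes.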

Files changed (1)
  1. README.md +24 -10
README.md CHANGED
@@ -1,9 +1,11 @@
1
  ---
2
- license: apache-2.0
3
- datasets:
4
- - Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories
5
  base_model:
6
  - inclusionAI/LLaDA2.0-mini
 
 
 
 
 
7
  ---
8
 
9
  <div align="center">
@@ -12,7 +14,7 @@ base_model:
12
  <a href="https://github.com/czg1225/DMax/blob/main/LICENSE">
13
  <img alt="Apache" src="https://img.shields.io/badge/License-Apache-4E94CE.svg">
14
  </a>
15
- <a href="https://arxiv.org/pdf/2604.08302">
16
  <img src="https://img.shields.io/badge/Paper-Arxiv-darkred.svg" alt="Paper">
17
  </a>
18
  <a href="https://github.com/czg1225/DMax">
@@ -21,10 +23,9 @@ base_model:
21
  </div>
22
  </div>
23
 
24
- > **DMax: Aggressive Parallel Decoding for dLLMs**
25
- > [Zigeng Chen](https://czg1225.github.io/chenzigeng99/), [Gongfan Fang](https://fangggf.github.io/), [Xinyin Ma](https://horseee.github.io/), [Ruonan Yu](https://scholar.google.com/citations?user=UHP95egAAAAJ&hl=en), [Xinchao Wang](https://sites.google.com/site/sitexinchaowang/)
26
- > [xML Lab](https://sites.google.com/view/xml-nus), National University of Singapore
27
 
 
28
 
29
  ## 💪 Highlights
30
 
@@ -65,7 +66,9 @@ model = model.to(torch.bfloat16)
65
  model.eval()
66
  tokenizer = AutoTokenizer.from_pretrained("Zigeng/DMax-Math-16B", trust_remote_code=True)
67
 
68
- prompt = "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?" + "\nLet's think step by step\n"
 
 
69
 
70
  input_ids = tokenizer.apply_chat_template(
71
  [{"role": "user", "content": prompt}],
@@ -94,5 +97,16 @@ print("nfe:",nfe,"token length",len(generated_tokens[0]))
94
 
95
  ![trade-off](assets/exp.png)
96
 
97
-
98
-
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
 
 
 
2
  base_model:
3
  - inclusionAI/LLaDA2.0-mini
4
+ datasets:
5
+ - Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories
6
+ license: apache-2.0
7
+ library_name: transformers
8
+ pipeline_tag: text-generation
9
  ---
10
 
11
  <div align="center">
 
14
  <a href="https://github.com/czg1225/DMax/blob/main/LICENSE">
15
  <img alt="Apache" src="https://img.shields.io/badge/License-Apache-4E94CE.svg">
16
  </a>
17
+ <a href="https://arxiv.org/abs/2604.08302">
18
  <img src="https://img.shields.io/badge/Paper-Arxiv-darkred.svg" alt="Paper">
19
  </a>
20
  <a href="https://github.com/czg1225/DMax">
 
23
  </div>
24
  </div>
25
 
26
+ This repository contains the weights for **DMax-Math-16B**, presented in the paper [DMax: Aggressive Parallel Decoding for dLLMs](https://huggingface.co/papers/2604.08302).
 
 
27
 
28
+ DMax is a new paradigm for efficient diffusion language models (dLLMs) that mitigates error accumulation in parallel decoding, enabling aggressive decoding parallelism while preserving generation quality.
29
 
30
  ## 💪 Highlights
31
 
 
66
  model.eval()
67
  tokenizer = AutoTokenizer.from_pretrained("Zigeng/DMax-Math-16B", trust_remote_code=True)
68
 
69
+ prompt = "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?" + "\nLet's think step by step\n"
72
 
73
  input_ids = tokenizer.apply_chat_template(
74
  [{"role": "user", "content": prompt}],
 
97
 
98
  ![trade-off](assets/exp.png)
99
 
100
+ ## 📚 Citation
101
+
102
+ ```bibtex
103
+ @misc{chen2026dmaxaggressiveparalleldecoding,
104
+ title={DMax: Aggressive Parallel Decoding for dLLMs},
105
+ author={Zigeng Chen and Gongfan Fang and Xinyin Ma and Ruonan Yu and Xinchao Wang},
106
+ year={2026},
107
+ eprint={2604.08302},
108
+ archivePrefix={arXiv},
109
+ primaryClass={cs.LG},
110
+ url={https://arxiv.org/abs/2604.08302},
111
+ }
112
+ ```