Add comprehensive model card for dUltra

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
---
library_name: transformers
pipeline_tag: text-generation
---

# dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning

dUltra is an on-policy reinforcement learning framework based on Group Relative Policy Optimization (GRPO) that learns unmasking strategies for efficient parallel decoding in Masked Diffusion Language Models (MDLMs). By training an unmasking planner head, dUltra enables diffusion language models to achieve state-of-the-art accuracy-efficiency trade-offs.

- **Paper:** [dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning](https://huggingface.co/papers/2512.21446)
- **GitHub Repository:** [https://github.com/chinsengi/dUltra-os](https://github.com/chinsengi/dUltra-os)

## Model Description

Masked diffusion language models offer the potential for parallel token generation. dUltra introduces an unmasking planner head that predicts per-token unmasking likelihoods under independent Bernoulli distributions. The framework jointly optimizes the base diffusion LLM and the unmasking-order planner using reward signals that combine a verifiable reward, a distillation reward, and the number of unmasking steps. dUltra achieves superior accuracy-efficiency trade-offs across mathematical reasoning and code generation tasks.
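The per-token Bernoulli unmasking described above can be pictured with a small toy sketch. This is illustrative only, not dUltra's actual code: `unmask_prob` stands in for a hypothetical planner-head output, and we sample an independent Bernoulli decision for each currently masked position to pick which tokens to decode in parallel this step.

```python
import torch

torch.manual_seed(0)

seq_len = 8
# True = token is still masked at this decoding step.
masked = torch.tensor([1, 1, 0, 1, 1, 1, 0, 1], dtype=torch.bool)

# Hypothetical planner-head output: per-token probability of unmasking now.
unmask_prob = torch.rand(seq_len)

# Independent Bernoulli decision per token, restricted to masked positions.
decisions = torch.bernoulli(unmask_prob).bool() & masked

# Positions the base diffusion LM would decode in parallel at this step.
reveal = decisions.nonzero(as_tuple=True)[0].tolist()
print(reveal)
```

In the full method, the number of such steps until every token is revealed enters the reward, which is what pushes the planner toward aggressive yet accurate parallel unmasking.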

## Usage

To use the dUltra model, load it with the `transformers` library. Note that `trust_remote_code=True` is required to load the custom model architecture.

```python
import torch
from model.llada.lladou import LLaDOUModelLM  # from the dUltra GitHub repository
from transformers import AutoTokenizer

# trust_remote_code=True is needed for the custom LLaDOU architecture.
model = LLaDOUModelLM.from_pretrained(
    "sengi/dUltra-math",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("sengi/dUltra-math")
```

## Citation

```bibtex
@misc{chen2025dultraultrafastdiffusionlanguage,
      title={dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning},
      author={Shirui Chen and Jiantao Jiao and Lillian J. Ratliff and Banghua Zhu},
      year={2025},
      eprint={2512.21446},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2512.21446},
}
```