Block Diffusion for Flash Speculative Decoding
Z Lab
university
AI & ML interests
Efficient AI
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Models (22)
z-lab/Qwen3-Coder-30B-A3B-DFlash • Text Generation • 1k • 26
z-lab/LLaMA3.1-8B-Instruct-DFlash-UltraChat • Text Generation • 1B • 95 • 2
z-lab/Qwen3-4B-DFlash-b16 • Text Generation • 0.5B • 6.4k • 22
z-lab/Qwen3-8B-DFlash-b16 • Text Generation • 6.11k • 19
z-lab/Qwen3-4B-Thinking-2507-PARO • 1B
z-lab/DeepSeek-R1-Distill-Llama-8B-PARO • 1B • 4
z-lab/Meta-Llama-3-70B-PARO • 20B • 3
z-lab/Qwen3-14B-PARO • 2B
z-lab/Qwen3-14B-Base-PARO • 2B
z-lab/Qwen3-8B-PARO • 1B • 393