SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 8 days ago • 33
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 8 days ago • 49
Geometry-Aware Rotary Position Embedding for Consistent Video World Model Paper • 2602.07854 • Published 13 days ago • 8
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published 19 days ago • 33
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published Oct 28, 2025 • 44