Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published 21 days ago • 42
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 47
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 57
Next Embedding Prediction Makes World Models Stronger Paper • 2603.02765 • Published 16 days ago • 20
NEO-unify: Building Native Multimodal Unified Models End to End Article • 14 days ago • 98
NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper • 2603.08397 • Published 10 days ago • 21
Weak-SIGReg: Covariance Regularization for Stable Deep Learning Paper • 2603.05924 • Published 13 days ago • 1
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 62
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published Dec 16, 2025 • 42
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 133
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 69
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 32