DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models Paper • 2603.23499 • Published 1 day ago • 38
TrajLoom: Dense Future Trajectory Generation from Video Paper • 2603.22606 • Published 2 days ago • 5
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models Paper • 2603.22003 • Published 3 days ago • 10
Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos Paper • 2603.22529 • Published 2 days ago • 4
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model Paper • 2603.22281 • Published 3 days ago • 12
Repurposing Geometric Foundation Models for Multi-view Diffusion Paper • 2603.22275 • Published 3 days ago • 40
PEARL: Personalized Streaming Video Understanding Model Paper • 2603.20422 • Published 6 days ago • 36
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 3 days ago • 113
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck Paper • 2603.08462 • Published 17 days ago • 21
Article Introducing SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding 7 days ago • 43
Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published 12 days ago • 76
mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT Paper • 2603.21606 • Published 3 days ago • 35
PivotRL: High Accuracy Agentic Post-Training at Low Compute Cost Paper • 2603.21383 • Published 3 days ago • 14
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding Paper • 2603.22285 • Published 3 days ago • 45
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD Paper • 2603.20155 • Published 6 days ago • 7
Article LoRA Fine-Tuning BitNet b1.58 LLMs on Heterogeneous Edge GPUs via QVAC Fabric 9 days ago • 14