BidirLM: Turning Generative LLMs into the Best Open-Source Omnimodal Encoders Article • Published 4 days ago • 23
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published 6 days ago • 36
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 5 days ago • 107
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 6 days ago • 224
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published 7 days ago • 43
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 6 days ago • 99
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 22 days ago • 330
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 10 days ago • 34
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published 16 days ago • 56
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration Paper • 2603.24800 • Published 17 days ago • 67
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 16 days ago • 117
Introducing Cohere-transcribe: state-of-the-art speech recognition Article • Published 16 days ago • 36