OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation Paper • 2601.15369 • Published 22 days ago • 20
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model Paper • 2601.15892 • Published 21 days ago • 53
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published 21 days ago • 51
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems Paper • 2601.11004 • Published 28 days ago • 30
Behavior Knowledge Merge in Reinforced Agentic Models Paper • 2601.13572 • Published 24 days ago • 24
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published 20 days ago • 32
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published 21 days ago • 188
Less is More: Optimizing Function Calling for LLM Execution on Edge Devices Paper • 2411.15399 • Published Nov 23, 2024 • 1
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 14 days ago • 68
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published 16 days ago • 110
Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation Paper • 2601.21406 • Published 15 days ago • 5
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published 15 days ago • 42
DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation Paper • 2601.22904 • Published 13 days ago • 15
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought Paper • 2601.23184 • Published 13 days ago • 35
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space Paper • 2602.02092 • Published 10 days ago • 18
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss Paper • 2602.02493 • Published 10 days ago • 41
TTCS: Test-Time Curriculum Synthesis for Self-Evolving Paper • 2601.22628 • Published 14 days ago • 34
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System Paper • 2602.02488 • Published 10 days ago • 32
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 10 days ago • 125
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization Paper • 2601.21358 • Published 15 days ago • 7
Balancing Understanding and Generation in Discrete Diffusion Models Paper • 2602.01362 • Published 11 days ago • 14
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published 9 days ago • 55
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published 11 days ago • 92
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers Paper • 2602.03510 • Published 9 days ago • 27
RISE-Video: Can Video Generators Decode Implicit World Rules? Paper • 2602.05986 • Published 7 days ago • 26
DFlash: Block Diffusion for Flash Speculative Decoding Paper • 2602.06036 • Published 7 days ago • 40
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published 3 days ago • 37
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning Paper • 2602.08236 • Published 4 days ago • 7
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research Paper • 2602.06540 • Published 6 days ago • 20
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models Paper • 2602.04649 • Published 8 days ago • 10
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 8 days ago • 294
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders Paper • 2602.05027 • Published 8 days ago • 59
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published 7 days ago • 22