A Survey of Context Engineering for Large Language Models
Paper
• 2507.13334
• Published • 263
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper
• 2507.15846
• Published • 135
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents
Paper
• 2507.22827
• Published • 101
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper
• 2508.18265
• Published • 217
Group Sequence Policy Optimization
Paper
• 2507.18071
• Published • 320
Why Language Models Hallucinate
Paper
• 2509.04664
• Published • 199
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
Paper
• 2509.07969
• Published • 59
Visual Representation Alignment for Multimodal Large Language Models
Paper
• 2509.07979
• Published • 84
Detect Anything via Next Point Prediction
Paper
• 2510.12798
• Published • 50
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published • 513
Diffusion Language Models are Super Data Learners
Paper
• 2511.03276
• Published • 132