tuanzi
e-tuanzi
AI & ML interests
None yet
Recent Activity
updated a collection 8 days ago: 260313
updated a collection 8 days ago: 260313
updated a collection 8 days ago: 260313
Organizations
None yet
3d
light
- Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
  Paper • 2512.24618 • Published • 152
- LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
  Paper • 2512.23576 • Published • 66
- Yume-1.5: A Text-Controlled Interactive World Generation Model
  Paper • 2512.22096 • Published • 61
- Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
  Paper • 2512.23709 • Published • 51
game
multimodal
agent
- Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
  Paper • 2512.24873 • Published • 108
- AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents
  Paper • 2512.23343 • Published • 29
- BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts
  Paper • 2512.24885 • Published • 5
- An Information Theoretic Perspective on Agentic System Design
  Paper • 2512.21720 • Published • 8
video
- Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation
  Paper • 2512.24271 • Published • 64
- FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation
  Paper • 2512.24724 • Published • 8
- Pretraining Frame Preservation in Autoregressive Video Memory Compression
  Paper • 2512.23851 • Published • 25
- PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
  Paper • 2512.24551 • Published • 21
260313
- GLM-5: from Vibe Coding to Agentic Engineering
  Paper • 2602.15763 • Published • 115
- DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
  Paper • 2602.16742 • Published • 12
- From Perception to Action: An Interactive Benchmark for Vision Reasoning
  Paper • 2602.21015 • Published • 23
- Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
  Paper • 2603.09906 • Published • 71