FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published 3 days ago • 12
M2RNN Collection • Note that the 7B models are MoE with 1.1B active parameters and the 400M models are dense • 14 items • Updated 29 days ago • 7
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model Paper • 2602.19128 • Published Feb 22 • 7
VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents Paper • 2601.16973 • Published Jan 23 • 40
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24, 2025 • 63
UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity Paper • 2511.13714 • Published Nov 17, 2025 • 12