Submitted by flavoredquark 71 ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning · 10 authors 5
Submitted by hba123 69 Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level · 18 authors 6
Submitted by songdj 49 Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination · 5 authors 18 2
Submitted by Akeeper 28 Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models · 6 authors 18 1
Submitted by WenhaoWang 27 TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation · 2 authors 39 2
Submitted by naotous 10 From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond · 7 authors 1