Kyu Song

kyunocap

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering

upvoted a paper 4 days ago

Attention Residuals

upvoted a paper 4 days ago

Mixture-of-Depths Attention

Organizations

None yet

upvoted 4 papers 4 days ago

GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering

Paper • 2603.15616 • Published 5 days ago • 5

Attention Residuals

Paper • 2603.15031 • Published 5 days ago • 131

Mixture-of-Depths Attention

Paper • 2603.15619 • Published 5 days ago • 73

Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published 5 days ago • 138

upvoted a paper 5 days ago

OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

Paper • 2603.11647 • Published 9 days ago • 31

upvoted a paper 8 days ago

ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation

Paper • 2603.11421 • Published 10 days ago • 34

upvoted 3 papers 9 days ago

ID-LoRA: Identity-Driven Audio-Video Personalization with In-Context LoRA

Paper • 2603.10256 • Published 11 days ago • 19

COMIC: Agentic Sketch Comedy Generation

Paper • 2603.11048 • Published 10 days ago • 4

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published 11 days ago • 134

liked a model 22 days ago

Qwen/Qwen3.5-397B-A17B

Image-Text-to-Text • 403B • Updated 7 days ago • 1.8M • 1.37k

upvoted a paper 22 days ago

The Trinity of Consistency as a Defining Principle for General World Models

Paper • 2602.23152 • Published 23 days ago • 198

upvoted a paper 25 days ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published 26 days ago • 516

upvoted 4 papers 29 days ago

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

Paper • 2602.06949 • Published Feb 6 • 36

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published Feb 13 • 44

Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published about 1 month ago • 58

DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

Paper • 2602.16968 • Published about 1 month ago • 12

upvoted 3 papers about 1 month ago

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

Paper • 2602.15547 • Published Feb 17 • 26

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Paper • 2602.12670 • Published Feb 13 • 56

liked a model about 1 month ago

yaolily/TimeChat-Captioner-GRPO-7B

Video-Text-to-Text • 9B • Updated Feb 11 • 260 • 2