Jiaming Han

csuhan

https://csuhan.com

AI & ML interests

Computer Vision

Organizations

None yet

Recent Activity

upvoted a paper 4 days ago

Gen-Searcher: Reinforcing Agentic Search for Image Generation

Paper • 2603.28767 • Published 4 days ago • 53

upvoted a paper 15 days ago

Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens

Paper • 2603.19232 • Published 15 days ago • 33

upvoted a collection about 1 month ago

BitDance

Collection

BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive model. • 10 items • Updated Mar 2 • 11

upvoted 2 papers about 1 month ago

UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^{128} for Unified Multimodal Large Language Model

Paper • 2602.14178 • Published Feb 15 • 14

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published Feb 15 • 53

upvoted a paper 3 months ago

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Paper • 2512.17909 • Published Dec 19, 2025 • 37

upvoted 4 papers 4 months ago

Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

Paper • 2512.13281 • Published Dec 15, 2025 • 65

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published Dec 2, 2025 • 34

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245

The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation

Paper • 2511.20256 • Published Nov 25, 2025 • 28

upvoted a paper 6 months ago

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Paper • 2510.08555 • Published Oct 9, 2025 • 65

upvoted a collection 6 months ago

Qwen3-Omni

Collection

6 items • Updated Dec 31, 2025 • 193

upvoted 2 papers 7 months ago

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Paper • 2509.09680 • Published Sep 11, 2025 • 44

Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

Paper • 2509.01984 • Published Sep 2, 2025 • 7

upvoted an article 8 months ago

"Diffusers Image Fill" guide

Article • Sep 13, 2024

upvoted 2 papers 8 months ago

Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

Paper • 2508.05635 • Published Aug 7, 2025 • 73

ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents

Paper • 2507.22827 • Published Jul 30, 2025 • 101

upvoted a collection 9 months ago

OmniCorpus

Collection

A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text • 6 items • Updated Sep 28, 2025 • 4

upvoted a paper 9 months ago

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Paper • 2506.18898 • Published Jun 23, 2025 • 34

upvoted a paper 10 months ago

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Paper • 2506.09350 • Published Jun 11, 2025 • 48
