vision
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D
Worlds from Words or Pixels
Paper
• 2507.21809
• Published • 141
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and
Structural Cohesion
Paper
• 2507.06165
• Published • 60
Paper
• 2508.10104
• Published • 300
Qwen-Image Technical Report
Paper
• 2508.02324
• Published • 273
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance
for Text-to-Image Generation
Paper
• 2508.18032
• Published • 41
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion
Transformers
Paper
• 2410.10629
• Published • 12
Masked Autoencoders Are Effective Tokenizers for Diffusion Models
Paper
• 2502.03444
• Published
Seedream 3.0 Technical Report
Paper
• 2504.11346
• Published • 70
DanceGRPO: Unleashing GRPO on Visual Generation
Paper
• 2505.07818
• Published • 32
UMO: Scaling Multi-Identity Consistency for Image Customization via
Matching Reward
Paper
• 2509.06818
• Published • 29
Instruct-Imagen: Image Generation with Multi-modal Instruction
Paper
• 2401.01952
• Published • 31
Kontinuous Kontext: Continuous Strength Control for Instruction-based
Image Editing
Paper
• 2510.08532
• Published • 6
Diffusion Transformers with Representation Autoencoders
Paper
• 2510.11690
• Published • 170