Open to Collab

wangshuai

wangsssssss

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

Paper • 2603.23500 • Published 6 days ago • 35

upvoted a paper 11 days ago

Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens

Paper • 2603.19232 • Published 11 days ago • 33

upvoted a paper 27 days ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 27 days ago • 102

upvoted a paper 29 days ago

Mode Seeking meets Mean Seeking for Fast Long Video Generation

Paper • 2602.24289 • Published Feb 27 • 41

upvoted 5 papers about 1 month ago

veScale-FSDP: Flexible and High-Performance FSDP at Scale

Paper • 2602.22437 • Published Feb 25 • 7

Image Generation with a Sphere Encoder

Paper • 2602.15030 • Published Feb 16 • 16

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Paper • 2602.19163 • Published Feb 22 • 14

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published Feb 13 • 44

Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published Feb 19 • 58

upvoted 5 papers about 2 months ago

Autoregressive Image Generation with Masked Bit Modeling

Paper • 2602.09024 • Published Feb 9 • 7

Adaptive 1D Video Diffusion Autoencoder

Paper • 2602.04220 • Published Feb 4 • 5

PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

Paper • 2602.02493 • Published Feb 2 • 46

One-step Latent-free Image Generation with Pixel Mean Flows

Paper • 2601.22158 • Published Jan 29 • 18

Revisiting Diffusion Model Predictions Through Dimensionality

Paper • 2601.21419 • Published Jan 29 • 4

upvoted 3 papers 2 months ago

Towards Pixel-Level VLM Perception via Simple Points Prediction

Paper • 2601.19228 • Published Jan 27 • 18

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 55

upvoted 3 papers 3 months ago

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 200

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published Dec 23, 2025 • 94

Bidirectional Normalizing Flow: From Data to Noise and Back

Paper • 2512.10953 • Published Dec 11, 2025 • 7