LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model Paper • 2603.01068 • Published 6 days ago • 19
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 68
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published 3 days ago • 79
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling Paper • 2602.12279 • Published 22 days ago • 20
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published 26 days ago • 50
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models Paper • 2602.13191 • Published 21 days ago • 30
lmms-lab-encoder/wd_temporal_grounding_frames_max_64_max_448x448_pixels_with_fps Updated 20 days ago • 153
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published 22 days ago • 57
ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder Paper • 2510.18795 • Published Oct 21, 2025 • 11