LMMs-Lab-Speedrun

community

https://www.lmms-lab.com/

AI & ML interests

None defined yet.

Recent Activity

pufanyi authored a paper 25 days ago

Demystifing Video Reasoning

Yuwei-Niu updated a Space about 1 month ago

LMMs-Lab-Speedrun/README

Yuwei-Niu published a Space about 1 month ago

LMMs-Lab-Speedrun/README

View all activity

authored a paper 25 days ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published 27 days ago • 368

updated a Space about 1 month ago

README

published a Space about 1 month ago

README

updated a dataset about 2 months ago

LMMs-Lab-Speedrun/Data_NanoVLM

Viewer • Updated Feb 27 • 1.62M • 32

updated a dataset about 2 months ago

LMMs-Lab-Speedrun/Data_NanoVLM

Viewer • Updated Feb 27 • 1.62M • 32

published a dataset about 2 months ago

LMMs-Lab-Speedrun/Data_NanoVLM

Viewer • Updated Feb 27 • 1.62M • 32

updated a model about 2 months ago

LMMs-Lab-Speedrun/NanoVLM_Init

1B • Updated Feb 23

published a model about 2 months ago

LMMs-Lab-Speedrun/NanoVLM_Init

1B • Updated Feb 23

authored a paper about 2 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

authored 2 papers 2 months ago

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Paper • 2510.18795 • Published Oct 21, 2025 • 11

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 36

authored a paper 3 months ago

iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published Jan 23 • 33

authored a paper 4 months ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 189

authored 2 papers 5 months ago

UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

Paper • 2510.13515 • Published Oct 15, 2025 • 12

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 96

authored a paper 5 months ago

Scaling Spatial Intelligence with Multimodal Foundation Models

Paper • 2511.13719 • Published Nov 17, 2025 • 48

authored 4 papers 6 months ago

ForCenNet: Foreground-Centric Network for Document Image Rectification

Paper • 2507.19804 • Published Jul 26, 2025 • 12

Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval

Paper • 2509.09118 • Published Sep 11, 2025 • 8

UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

Paper • 2510.13515 • Published Oct 15, 2025 • 12

ORID: Organ-Regional Information Driven Framework for Radiology Report Generation

Paper • 2411.13025 • Published Nov 20, 2024 • 2