Liutong Zhou

3642578a

AI & ML interests

Multi-Modal LLM

Recent Activity

liked a Space about 1 month ago

nanotron/ultrascale-playbook

liked a Space about 1 month ago

HuggingFaceTB/smol-training-playbook

liked a model 8 months ago

Qwen/Qwen3-235B-A22B-Thinking-2507

liked 2 Spaces about 1 month ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

The secrets to building world-class LLMs

liked 2 models 8 months ago

Qwen/Qwen3-235B-A22B-Thinking-2507

Text Generation • Updated Aug 17, 2025 • 62.5k • 401

Qwen/Qwen3-235B-A22B-Instruct-2507

Text Generation • Updated Sep 17, 2025 • 177k • 768

upvoted a collection 8 months ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.72k

liked a model 8 months ago

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Jan 30 • 95.4k • 2.33k

liked a model 9 months ago

google/gemma-3n-E4B-it

Image-Text-to-Text • Updated Jul 14, 2025 • 64.7k • 886

liked 4 models over 1 year ago

deepseek-ai/DeepSeek-V2.5

Text Generation • 236B • Updated Dec 11, 2024 • 7.22k • 732

deepseek-ai/DeepSeek-V2-Chat-0628

Text Generation • 236B • Updated Jul 18, 2024 • 3.51k • 177

Qwen/Qwen2.5-72B-Instruct

Text Generation • 73B • Updated Jan 12, 2025 • 772k • 920

apple/DCLM-7B-8k

7B • Updated Aug 6, 2024 • 7 • 44

upvoted a collection over 1 year ago

DCLM

DCLM Models + Datasets • 6 items • Updated Aug 25, 2025 • 27

liked 2 models almost 2 years ago

microsoft/Phi-3-mini-4k-instruct

Text Generation • Updated Dec 10, 2025 • 903k • 1.4k

nvidia/Nemotron-4-340B-Base

Updated Jun 28, 2024 • 891 • 147

upvoted a collection almost 2 years ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 25 items • Updated 20 days ago • 577

liked a model almost 2 years ago

google/flan-t5-xxl

11B • Updated Jul 27, 2023 • 18.4k • 1.28k

upvoted a paper almost 2 years ago

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Paper • 2301.13688 • Published Jan 31, 2023 • 10

upvoted a collection almost 2 years ago

Flan-T5 release

The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgraded versions trained using Universal sampling • 7 items • Updated 10 days ago • 33

liked 2 models about 2 years ago

internlm/internlm2-chat-20b-sft

Text Generation • 20B • Updated Aug 20, 2024 • 56 • 12

01-ai/Yi-34B-Chat

Text Generation • Updated Nov 11, 2024 • 31.6k • 357