Yutao Zeng

Taoer

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published 7 days ago • 296

authored a paper 4 months ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 38

upvoted a paper 4 months ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 38

authored a paper 7 months ago

UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning

Paper • 2508.18756 • Published Aug 26, 2025 • 36

updated a collection 7 months ago

Full Paper List

11 items • Updated Aug 27, 2025 • 1

upvoted a paper 7 months ago

UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning

Paper • 2508.18756 • Published Aug 26, 2025 • 36

updated a collection 9 months ago

Full Paper List

11 items • Updated Aug 27, 2025 • 1

authored a paper 10 months ago

Stepsize anything: A unified learning rate schedule for budgeted-iteration training

Paper • 2505.24452 • Published May 30, 2025 • 5

upvoted a paper 10 months ago

Stepsize anything: A unified learning rate schedule for budgeted-iteration training

Paper • 2505.24452 • Published May 30, 2025 • 5

commented on a paper 10 months ago

Stepsize anything: A unified learning rate schedule for budgeted-iteration training

Paper • 2505.24452 • Published May 30, 2025 • 5

authored 2 papers 10 months ago

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

Paper • 2504.13914 • Published Apr 10, 2025 • 5

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76

updated a collection 10 months ago

Full Paper List

11 items • Updated Aug 27, 2025 • 1

upvoted a paper 10 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76

authored a paper 11 months ago

Efficient Pretraining Length Scaling

Paper • 2504.14992 • Published Apr 21, 2025 • 20

upvoted a paper 11 months ago

Efficient Pretraining Length Scaling

Paper • 2504.14992 • Published Apr 21, 2025 • 20

updated a collection 11 months ago

Full Paper List

11 items • Updated Aug 27, 2025 • 1

updated 2 models 12 months ago

Open-Foundation-Models/PolyNorm_1B

Text Generation • Updated Apr 8, 2025 • 7

Open-Foundation-Models/PolyReLU_1B

Text Generation • Updated Apr 8, 2025 • 7

upvoted a paper about 1 year ago

Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

Paper • 2503.16057 • Published Mar 20, 2025 • 14