Reasoning models trained on synthetic data using reinforcement learning.
Yichao 'Peak' Ji
peakji
AI & ML interests
Agents, Small Language Models, Retrieval-Augmented Generation, Information Extraction
Recent Activity
liked a dataset 3 days ago
nvidia/Nemotron-Agentic-v1
liked a model 11 days ago
Qwen/Qwen3.5-35B-A3B-Base
liked a model 16 days ago
Qwen/Qwen3.5-397B-A17B