Geyang

geyang627

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

Safe and Scalable Web Agent Learning via Recreated Websites

Paper • 2603.10505 • Published 13 days ago • 25

upvoted 2 articles 22 days ago

Article

Deriving the PPO Loss from First Principles

Dec 25, 2025

Article

A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond

Jan 19

New activity in QCRI/MultiNativQA 4 months ago

All is_reliable is True

#2 opened 4 months ago by geyang627

upvoted a collection 7 months ago

Qwen3

Collection

84 items • Updated Dec 31, 2025 • 1.73k

updated 3 models 9 months ago

updated a collection 9 months ago

CARE

Collection

14 items • Updated Jun 30, 2025 • 2

updated 5 models 9 months ago

geyang627/care-arabic-mistral-7b

7B • Updated Jun 30, 2025 • 1 • 1

geyang627/care-japanese-qwen2.5-7b

8B • Updated Jun 28, 2025 • 11

geyang627/care-japanese-mistral-7b

7B • Updated Jun 28, 2025 • 2 • 1

geyang627/care-japanese-llama3.1-8b

8B • Updated Jun 28, 2025 • 15 • 1

geyang627/care-japanese-gemma2-9b

9B • Updated Jun 28, 2025 • 5 • 1

published 3 models 9 months ago

geyang627/care-japanese-qwen2.5-7b

8B • Updated Jun 28, 2025 • 11

geyang627/care-japanese-mistral-7b

7B • Updated Jun 28, 2025 • 2 • 1

geyang627/care-japanese-llama3.1-8b

8B • Updated Jun 28, 2025 • 15 • 1