Taeho Hwang

doubleyyh · ThisIsHwang

AI & ML interests

None yet

Recent Activity

reacted to sergiopaniego's post with 🚀 2 days ago

TRL v0.27.0 is out!! 🥳 It includes GDPO, the latest variant of GRPO for multi-reward RL ✨ GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence — developed by @sliuau @SimonX et al. Explore the paper: https://huggingface.co/papers/2601.05242 Explore the full set of changes here: https://github.com/huggingface/trl/releases/tag/v0.27.0

liked a Space 13 days ago

SamsungResearch/TRUEBench

upvoted a paper 3 months ago

Adaptive Multi-Agent Response Refinement in Conversational Systems

doubleyyh's datasets

None public yet