weiyao_ruc
weiweiruc
AI & ML interests
None yet
Recent Activity
upvoted a paper about 11 hours ago
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
upvoted a paper 14 days ago
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
upvoted a paper 4 months ago
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Organizations
None yet