Wanwei He

Grocery

https://scholar.google.com/citations?hl=zh-CN&user=NNfxnUYAAAAJ

HwwAncient

AI & ML interests

LLM

Recent Activity

upvoted a paper 2 days ago

Learning Ordinal Probabilistic Reward from Preferences

liked a model 12 days ago

Qwen/Qwen3.5-35B-A3B

commented on a paper 6 months ago

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

liked a model 12 days ago

Qwen/Qwen3.5-35B-A3B

Image-Text-to-Text • 36B • Updated 11 days ago • 1.27M • 1.07k

liked a dataset 7 months ago

llm-blender/Unified-Feedback

Viewer • Updated Mar 31, 2024 • 1.79M • 624 • 18

liked a dataset 9 months ago

allenai/reward-bench-2

Viewer • Updated Jun 4, 2025 • 1.87k • 3.36k • 30

liked a dataset about 1 year ago

xiushenghuang/open_r1_dataset

Viewer • Updated Feb 26, 2025 • 2.59M • 227 • 5

liked a model about 1 year ago

perplexity-ai/r1-1776

Text Generation • Updated Feb 26, 2025 • 961 • 2.34k

liked a Space over 1 year ago

Reward Bench Leaderboard

Explore RewardBench model rankings and scores

liked 4 datasets over 2 years ago

bigcode/the-stack-dedup

Viewer • Updated Aug 17, 2023 • 237M • 9.17k • 384

BAAI/COIG-PC

Viewer • Updated Jun 14, 2024 • 540M • 90 • 271

OpenAssistant/oasst1

Viewer • Updated May 2, 2023 • 88.8k • 10.2k • 1.48k

RyokoAI/ShareGPT52K

Preview • Updated Apr 2, 2023 • 703 • 355