PSFT+RL models
SII-Wenhong
wh-zhu
Recent Activity
updated a dataset 10 days ago: wh-zhu/dapo
published a dataset 10 days ago: wh-zhu/dapo
liked a dataset 3 months ago: zai-org/SWE-Dev-train
Organizations
models 57
wh-zhu/Qwen2.5-7B-Instruct-SFT-lr-5e6
8B • Updated
wh-zhu/Qwen2.5-7B-Instruct-16-1300
8B • Updated
wh-zhu/Qwen2.5-7B-Instruct-ref-1300
8B • Updated
wh-zhu/Qwen2.5-7B-Instruct-update4-600
8B • Updated
wh-zhu/Qwen2.5-7B-Instruct-VL-SFT-RL120
8B • Updated
• 3
wh-zhu/Qwen2.5-7B-Instruct-VL-SFT-RL165
8B • Updated
• 6
wh-zhu/Qwen2.5-7B-Instruct-VL-PSFT-RL165
8B • Updated
• 3
wh-zhu/Qwen2.5-7B-Instruct-VL-ORI-RL140
8B • Updated
• 1
wh-zhu/Qwen2.5-7B-Instruct-edit-ruilin400
8B • Updated
wh-zhu/Qwen2.5-7B-Instruct-VL-RL100
8B • Updated
datasets 6
wh-zhu/dapo
• Updated
• 17.4k • 10
wh-zhu/train_openr1_8k
• Updated
• 45.8k • 6
wh-zhu/aime-24
• Updated
• 960 • 12
wh-zhu/train_openr1_4k
• Updated
• 25.4k • 14 • 2
wh-zhu/short_cot_calibration
• Updated
• 52.8k • 6
wh-zhu/long_cot_calibration
• Updated
• 35k • 7