Hugging Face
SII-Wenhong
wh-zhu
1 follower · 2 following
zwhong714
AI & ML interests
None yet
Recent Activity
Upvoted a paper 8 days ago: Hybrid Policy Distillation for LLMs
Submitted a paper 8 days ago: Hybrid Policy Distillation for LLMs
New activity 9 days ago: wh-zhu/Qwen2.5-7B-PSFT-RL-DAPO-90: Add model card and metadata
Organizations
wh-zhu's models (58)
Sort: Recently updated
wh-zhu/Qwen2.5-7B-Instruct-SFT-700 • 8B • Updated Jul 26, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-PSFT-1300 • 8B • Updated Jul 26, 2025 • 3
wh-zhu/Llama-3.1-8B-Instruc-SFT-KL-DeepMath-3epoch • 8B • Updated Jul 23, 2025 • 1
wh-zhu/Qwen3-4B-Base-PSFT • 4B • Updated Jul 22, 2025 • 1
wh-zhu/Llama-3.1-8B-Instruct-SFT-DeepMath-2epoch • 8B • Updated Jul 20, 2025 • 1
wh-zhu/Llama-3.1-8B-Instruct-PSFT-DeepMath • 8B • Updated Jul 20, 2025
wh-zhu/Qwen2.5-7B-Instruct-PSFT-DeepMath-4epoch • 8B • Updated Jul 18, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-SFT-DeepMath-1epoch • 8B • Updated Jul 17, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-PSFT-DeepMath • 8B • Updated Jul 17, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-SFT-step150 • 8B • Updated Jul 17, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-SFT • 8B • Updated Jul 16, 2025 • 2
wh-zhu/LLama3.1-8B-Instruct-PSFT • 8B • Updated Jul 16, 2025 • 1
wh-zhu/Qwen2.5-7B-Instruct-PSFT • 8B • Updated Jul 16, 2025 • 1
wh-zhu/DeepSeek-R1-TrRa-1.5B-lambda_2 • 2B • Updated Jun 17, 2025 • 1
wh-zhu/DeepSeek-R1-TrRa-1.5B-lambda_5 • 2B • Updated Jun 17, 2025 • 2
wh-zhu/DeepSeek-R1-TrRa-1.5B-lambda_10 • 2B • Updated Jun 17, 2025 • 5
wh-zhu/DeepSeek-R1-TrRa-iter2-1.5B-lambda_2 • 2B • Updated Jun 17, 2025 • 1
wh-zhu/DeepSeek-R1-TrRa-iter1-1.5B-lambda_2 • 2B • Updated Jun 17, 2025 • 1
wh-zhu/DeepSeek-R1-TrRa-1.5B_lambda_1.5 • 2B • Updated Jun 17, 2025 • 2
wh-zhu/DeepSeek-R1-TrRa-1.5B_lambda_0.5 • 2B • Updated Jun 17, 2025 • 2
wh-zhu/DeepScaleR-7B-WSPO • 8B • Updated Jun 10, 2025 • 3
wh-zhu/qwen2_7B-ultrachatfeedback-wspo • 8B • Updated Jun 10, 2025 • 1
wh-zhu/qwen2_1.5B-ultrachatfeedback-dpo • 2B • Updated Jun 10, 2025 • 65
wh-zhu/qwen2_7B-ultrachat200k • 8B • Updated Jun 10, 2025 • 188
wh-zhu/qwen2_1.5B-ultrachat200k • 2B • Updated Jun 10, 2025 • 44
wh-zhu/OpenMath-nemotron-7B-WSPO • 8B • Updated May 25, 2025 • 1
wh-zhu/DeepSeek-R1-Distill-Qwen-7B-InRa • 8B • Updated May 7, 2025 • 2
wh-zhu/DeepSeek-R1-Distill-Qwen-1.5B-InRa • Text Generation • 2B • Updated May 7, 2025 • 3