arxiv:2501.03895
Yang Feng
fengyang0317
Models 10
fengyang0317/sft_output
Updated
fengyang0317/SmolLM2-FT-DPO
Text Generation • 0.1B • Updated
fengyang0317/SmolLM2-FT-MyDataset
Text Generation • 0.1B • Updated
fengyang0317/ppo-CartPole-v1
Reinforcement Learning • Updated
fengyang0317/unit4
Updated
fengyang0317/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning • Updated
fengyang0317/Taxi-v3
Reinforcement Learning • Updated
fengyang0317/q-FrozenLake-v1-4x4-noSlippery
Reinforcement Learning • Updated
fengyang0317/ppo-Huggy
Reinforcement Learning • Updated • 27
fengyang0317/whisper-small-dv
Automatic Speech Recognition • 0.2B • Updated
Datasets 10
fengyang0317/commonsense
Viewer • Updated • 10.6k • 8
fengyang0317/prosqa
Viewer • Updated • 18.7k • 19
fengyang0317/prontoqa
Viewer • Updated • 10k • 9
fengyang0317/gsm8k
Viewer • Updated • 387k • 6
fengyang0317/listops-32
Viewer • Updated • 100k • 11
fengyang0317/listops-64
Viewer • Updated • 100k • 97
fengyang0317/listops-128
Viewer • Updated • 100k • 48
fengyang0317/listops-d20
Viewer • Updated • 100k • 13
fengyang0317/listops-1000
Viewer • Updated • 100k • 27
fengyang0317/imagenet-1k
Viewer • Updated • 22 • 24