geodesic-research/nemotron_nano_sft_warm_start_1150 Text Generation • 32B • Updated 5 days ago • 377 • 1
geodesic-research/nemotron_nano_sft_warm_start_16k Text Generation • 32B • Updated 5 days ago • 302 • 1
geodesic-research/nemotron_nano_sft_warm_start_100k Text Generation • 32B • Updated 5 days ago • 272 • 1
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-Base-BF16 Text Generation • 124B • Updated Mar 14 • 14.7k • 26
(Some) Emergent Misalignment from Reward Hacking in RL (Collection): Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 15 days ago • 3