Add HuggingFace Hub checkpoint persistence - upload and download checkpoints between jobs 3b46388 renpas22 committed on Dec 18, 2025
Add checkpoint resumption - automatically resume from latest checkpoint 5419afd renpas22 committed on Dec 18, 2025
Fix SPECIAL_TOKENS usage - import at module level and use string literals 464ac9b renpas22 committed on Dec 18, 2025
Fix ReasoningChain dataclass - add image field and defaults, fix collate function 3024a91 renpas22 committed on Dec 18, 2025
Implement full SFT, PRM, and RL training with dataset loading 84a183c renpas22 committed on Dec 18, 2025
Make train_prm and train_rl placeholders - dataset loading needs HF integration 7bff7cb renpas22 committed on Dec 18, 2025
Add **kwargs to train_prm and train_rl to accept config parameters 917e40e renpas22 committed on Dec 18, 2025
Fix train_prm and train_rl signatures to accept max_steps and learning_rate 41bcc92 renpas22 committed on Dec 18, 2025
Fix len() calls to use actual_tokenizer instead of processor 4605c1b renpas22 committed on Dec 12, 2025