Commit History

Add HuggingFace Hub checkpoint persistence - upload and download checkpoints between jobs
3b46388

renpas22 committed on

Restore SFT to 10000 steps - will resume from checkpoint at 7000
e9e0301

renpas22 committed on

Add checkpoint resumption - automatically resume from latest checkpoint
5419afd

renpas22 committed on

Reduce training steps to fit within job timeout (SFT: 5k, PRM: 2k, RL: 3k)
5cfd8d6

renpas22 committed on

Fix ReasoningStep attribute - use description not content
f941008

renpas22 committed on

Fix SPECIAL_TOKENS usage - import at module level and use string literals
464ac9b

renpas22 committed on

Fix ReasoningChain dataclass - add image field and defaults, fix collate function
3024a91

renpas22 committed on

Fix None image handling in collate function
d29f3e7

renpas22 committed on

Convert learning_rate to float explicitly
9e7779a

renpas22 committed on

Remove dead code with direct config access
ccd696b

renpas22 committed on

Add getattr defaults for all config parameters
cd76323

renpas22 committed on

Fix FineVision dataset loading with subset parameter
e47ae2c

renpas22 committed on

Implement full SFT, PRM, and RL training with dataset loading
84a183c

renpas22 committed on

Make train_prm and train_rl placeholders - dataset loading needs HF integration
7bff7cb

renpas22 committed on

Force fresh repository download with cache clearing
714d05d

renpas22 committed on

Add **kwargs to train_prm and train_rl to accept config parameters
917e40e

renpas22 committed on

Force download latest revision from HF repo
f8fc68a

renpas22 committed on

Fix train_prm and train_rl signatures to accept max_steps and learning_rate
41bcc92

renpas22 committed on

Add placeholder train_sft method
85ab8c2

renpas22 committed on

Fix quantized model device handling in inference_scaling
a745b26

renpas22 committed on

Add type conversion for RLConfig parameters
f15c2d7

renpas22 committed on

Add type conversion and debug logging for config values
c74a578

renpas22 committed on

Quote mixed_precision value
37e8f2f

renpas22 committed on

Add inference config parameters
2d1ba1a

renpas22 committed on

Skip .to(device) for quantized models with device_map
fa9e543

renpas22 committed on

Add missing RL/PPO config parameters
5af9eca

renpas22 committed on

Fix gradient checkpointing for VLM models
0326431

renpas22 committed on

Fix tokenize() calls to use actual tokenizer
8268436

renpas22 committed on

Fix working directory for HF Jobs environment
bb7ed44

renpas22 committed on

Fix len() calls to use actual_tokenizer instead of processor
4605c1b

renpas22 committed on

Fix tokenizer access for Processor objects
b8bd3e8

renpas22 committed on

Add VLM support to trainer with auto-detection
e20135f

renpas22 committed on

Fix model name to use valid Qwen2-VL model
61dbc34

renpas22 committed on

Enable 8-bit quantization and reduce batch size for memory
d4b8544

renpas22 committed on

Add required top-level config keys for trainer
9770aa1

renpas22 committed on

Fix config path resolution for trainer initialization
ec4cb07

renpas22 committed on

Fix function call arguments and trainer initialization
879210b

renpas22 committed on

Fix config path resolution for HF Jobs
82de57b

renpas22 committed on

Add utils directory
da76488

renpas22 committed on

Auto-download repository in HF Jobs environment
487225b

renpas22 committed on

Fix imports to work with sys.path
83e0535

renpas22 committed on

Fix Python path for imports
27b06f0

renpas22 committed on

Add training scripts and configs
2b8876a

renpas22 committed on

initial commit
cd336ff
verified

Mulebot committed on