Commit History
Add MoE uneven shard test with mixed expert and non-expert params [skip-build] bdada12
Add uneven shard correctness test [skip-build] 1a97671
Add optimization docs and update implementation guide [skip-build] 14040eb
Update tests for MoE and parallel optimizations [skip-build] 81f49fe
Muon optimizer: expert batching, parallel caching, A2A overlap [skip-build] 0f37d63
Optimize pipeline: batched update, zero-copy scatter, prelaunch gather [skip-build] 2816b64
Cache AdamW placement grouping and tensor lists [skip-build] 8ca2492
Add torch.compile, CUDA graph, and compiled momentum [skip-build] e74d98f
Apply suggestions from code review cdaaf4f
Committed by TaehyunKim and Copilot
Add mhc_attn, mhc_ffn, lambda_proj to skip_keys ba293d0
Remove verbose param_groups summary logging 24f0957
Support multi-component expert_keys (e.g. "experts.w1") 5a99e12
Extract is_expert_param() helper to consolidate expert key matching e615b1c
Include original (pre-normalize) FQN in is_muon logging 135fc66
Add info-level logging for param group classification (Muon vs AdamW) 1118752
Use component-level matching for expert_keys to avoid shared_experts collision f008017
Normalize parameter FQNs to handle torch.compile / checkpoint wrappers 95a620f
Merge pull request #17 from MotifTechnologies/optimal-ns-coefficients b220459 unverified
Apply pre-commit formatting (yapf) [skip-build] bf30b9b
Add max_iter cap and non-finite checks to _optimal_quintic [skip-build] 206b280
Apply pre-commit formatting (yapf, isort) [skip-build] aff01db
Add comment explaining _coeffs_list and Polar Express vs former NS [skip-build] abaa449
Replace hardcoded NS coefficients with analytically optimal ones [skip-build] 573242f
Refactor pipeline to async generator pattern (#16) 33929c0 unverified
Support mHC (#15) ae32572 unverified
Update arxiv URL fa059da
Support param group with various placements (#13) e2b41e5 unverified
Merge pull request #14 from MotifTechnologies/fix_bug_in_fsdp 5458c82 unverified
Committed by TaehyunKim
Add built binary [skip-build] 6ec5093
Committed by github-actions[bot]
fix bug in fsdp 811726c
feat(workflow): add Slack notifications for build start, success, and failure [skip-build] (#12) 0b8d958 unverified
Merge pull request #11 from MotifTechnologies/ca1207-patch-1 53deea3 unverified
Committed by TaehyunKim
Add built binary [skip-build] de5bead
Committed by github-actions[bot]
Update torch-ext/optimizer/muon.py b0230e7 unverified
Committed by TaehyunKim
Update torch-ext/optimizer/muon.py ff2fcfb unverified
Committed by TaehyunKim
Update muon.py c16b438 unverified
Committed by TaehyunKim
Merge pull request #10 from MotifTechnologies/fix_a2a_gs_assert 4f71bc9 unverified
Committed by TaehyunKim
Add built binary [skip-build] aee4dc0
Committed by github-actions[bot]
fix assert in a2a gather scatter 3dafb3e
fix: use [skip-build] keyword to prevent build [skip-build] 94799ac
Merge pull request #9 from MotifTechnologies/all2all_gather_scatter 1a3da4d unverified
Committed by TaehyunKim
Add built binary [ci skip] e93bd1e
Committed by github-actions[bot]
delete state in split_func 15336dc
Add built binary [ci skip] 678578a
Committed by github-actions[bot]
change owner_params to owned_params 6943c45
modify pre step (overlap step) can get from arsgs 589b763
add doc strings + init self rank on init_assign_params 267e8a0
Add built binary [ci skip] 99af74f
Committed by github-actions[bot]