nanochat/tests
Amrit Bulusu 641e8a6dd3 muP implementation: coord check, transfer check, and code quality fixes
- Fix output logit hook in coord check to apply muP scaling (base/width)
- Replace config mutation side effect with assertion in setup_optimizer
- Set mup_base_width at GPTConfig construction in base_train.py
- Remove dead code (_transfer_check_output_mult)
- Tune base LRs to center optimal multiplier near 1.0 (0.12, 6.0, 0.12)
- Use log scale on all loss plots for better low-loss detail
- Add automated muP tests (coord check + transfer check)
- Update muP_changes.md verification commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 16:28:50 -04:00
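
A rough sketch of two items from the commit message above: the muP output logit scaling (base/width) and the use of an assertion in place of a config mutation inside setup_optimizer. This is not the repository's code. `ToyConfig`, its field names other than `mup_base_width`, and the helper functions are hypothetical stand-ins, and the sketch assumes that "base/width" means mup_base_width divided by the model width.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical stand-in for GPTConfig; n_embd as the width field is an assumption.
    n_embd: int
    mup_base_width: int = 0  # 0 is used here to mean "not set"


def output_logit_scale(config: ToyConfig) -> float:
    """Assumed muP output multiplier: base width divided by actual width."""
    return config.mup_base_width / config.n_embd


def scale_logits(logits: list[float], config: ToyConfig) -> list[float]:
    """Apply the multiplier to raw logits, as a coord check hook would need to."""
    s = output_logit_scale(config)
    return [x * s for x in logits]


def setup_optimizer(config: ToyConfig) -> float:
    # Fail loudly if the base width was not set at construction time,
    # rather than quietly writing a default into the config here.
    assert config.mup_base_width > 0, "set mup_base_width when constructing the config"
    return output_logit_scale(config)


cfg = ToyConfig(n_embd=512, mup_base_width=128)
print(output_logit_scale(cfg))         # multiplier for a 4x wider model
print(scale_logits([4.0, -8.0], cfg))  # logits shrink as width grows
```

Setting the base width where the config is constructed keeps setup_optimizer free of side effects, so a missing value surfaces as an immediate failure instead of a silently changed config.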
test_attention_fallback.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
test_engine.py Fix MockModel's device definition (#535) 2026-02-17 16:03:46 -08:00
test_mup.py muP implementation: coord check, transfer check, and code quality fixes 2026-03-14 16:28:50 -04:00