- Fix output logit hook in coord check to apply muP scaling (base/width)
- Replace config mutation side effect with assertion in setup_optimizer
- Set mup_base_width at GPTConfig construction in base_train.py
- Remove dead code (_transfer_check_output_mult)
- Tune base LRs to center optimal multiplier near 1.0 (0.12, 6.0, 0.12)
- Use log scale on all loss plots for better low-loss detail
- Add automated muP tests (coord check + transfer check)
- Update muP_changes.md verification commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>