Mirror of https://github.com/karpathy/nanochat.git, synced 2026-04-02 13:45:21 +00:00
- Fix output logit hook in coord check to apply muP scaling (base/width)
- Replace config mutation side effect with assertion in setup_optimizer
- Set mup_base_width at GPTConfig construction in base_train.py
- Remove dead code (_transfer_check_output_mult)
- Tune base LRs to center optimal multiplier near 1.0 (0.12, 6.0, 0.12)
- Use log scale on all loss plots for better low-loss detail
- Add automated muP tests (coord check + transfer check)
- Update muP_changes.md verification commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| File | | |
|---|---|---|
| base_eval.py | ||
| base_train.py | ||
| chat_cli.py | ||
| chat_eval.py | ||
| chat_rl.py | ||
| chat_sft.py | ||
| chat_web.py | ||
| mup_coord_check.py | ||
| mup_transfer_check.py | ||
| tok_eval.py | ||
| tok_train.py | ||