Mirror of https://github.com/karpathy/nanochat.git (synced 2026-03-26 06:35:15 +00:00)
Same bug as scaling_laws.sh: TOKENS_TRAINED was computed as NUM_ITERS * 524288, hardcoding the default total batch size. When base_train auto-computes a different batch size, the value is wrong. Fix by reading "Total number of training tokens:" directly from the training log.
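The fix described above can be sketched in shell. The log path and the exact line format are assumptions based on the commit message; the actual script may locate the log differently.

```shell
#!/bin/sh
# Sketch: read the true token count from the training log instead of
# hardcoding TOKENS_TRAINED=$((NUM_ITERS * 524288)), which breaks when
# base_train auto-computes a different total batch size.

# Hypothetical log file standing in for base_train's output.
LOG_FILE="train.log"
printf 'step 100/100\nTotal number of training tokens: 11010048\n' > "$LOG_FILE"

# Extract the number following the "Total number of training tokens:" marker.
TOKENS_TRAINED=$(sed -n 's/.*Total number of training tokens: \([0-9][0-9]*\).*/\1/p' "$LOG_FILE")
echo "$TOKENS_TRAINED"
```

Parsing the log keeps TOKENS_TRAINED consistent with whatever batch size the training run actually used.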
Files in this directory:

- miniseries.sh
- runcpu.sh
- scaling_laws.sh
- speedrun.sh