Mirror of https://github.com/karpathy/nanochat.git (synced 2026-03-26 06:35:15 +00:00)
Same bug as scaling_laws.sh: TOKENS_TRAINED was computed as NUM_ITERS * 524288, hardcoding the default total batch size. When base_train auto-computes a different batch size, the value is wrong. Fix by reading "Total number of training tokens:" directly from the training log.
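The fix described above can be sketched in shell. The log path and the exact line format are assumptions based on the commit message; the actual script may locate the log differently.

```shell
#!/bin/sh
# Sketch: read the true token count from the training log instead of
# hardcoding TOKENS_TRAINED=$((NUM_ITERS * 524288)), which breaks when
# base_train auto-computes a different total batch size.

# Hypothetical log file standing in for base_train's output.
LOG_FILE="train.log"
printf 'step 100/100\nTotal number of training tokens: 11010048\n' > "$LOG_FILE"

# Extract the number following the "Total number of training tokens:" marker.
TOKENS_TRAINED=$(sed -n 's/.*Total number of training tokens: \([0-9][0-9]*\).*/\1/p' "$LOG_FILE")
echo "$TOKENS_TRAINED"
```

Parsing the log keeps TOKENS_TRAINED consistent with whatever batch size the training run actually used.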
Files in this directory:

- miniseries.sh
- runcpu.sh
- scaling_laws.sh
- speedrun.sh