nanochat/runs
geopti fb2be07e17 fix: correct CSV extraction in scaling_laws.sh
Two bugs left every parameter column silently empty and tokens_trained
potentially wrong in the results CSV:

1. Parameter grep patterns did not account for the padded key format.
   base_train.py prints parameters as `{key:24s}: {value:,}`, e.g.
   `wte                     : 33,554,432`, so patterns like `grep "wte:"`
   never matched. Fixed by using `grep -P "wte\s+:"` to handle the spaces.

2. tokens_trained was hardcoded as `NUM_ITERS * 524288`, but the batch
   size is auto-computed by base_train.py and may differ from 524288
   depending on the FLOPs budget and model size. Fixed by extracting the
   actual value from the log line "Total number of training tokens: X".
2026-02-28 16:37:04 +00:00
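
The two extraction fixes described in the commit message can be sketched in Python as follows. This is a minimal illustration, not the code in scaling_laws.sh, which uses grep. The function names, the `lm_head` key, and the token count in the sample log are hypothetical. Only the `{key:24s}: {value:,}` format, the `wte` example line, and the "Total number of training tokens:" label come from the commit message. Whether the token count is printed with thousands separators is an assumption, so the pattern accepts both forms.

```python
import re
from typing import Optional

# Sample log text built for illustration. The `wte` line follows the
# `{key:24s}: {value:,}` format quoted in the commit message; the
# `lm_head` key and the token count are made-up placeholders.
SAMPLE_LOG = "\n".join([
    f"{'wte':24s}: {33554432:,}",
    f"{'lm_head':24s}: {33554432:,}",
    "Total number of training tokens: 1,048,576,000",
])


def _to_int(text: str) -> int:
    """Convert a number such as '33,554,432' to an int."""
    return int(text.replace(",", ""))


def extract_param(log: str, key: str) -> Optional[int]:
    """Return the count printed for `key`, tolerating padding before the colon."""
    # \s* absorbs the spaces that pad the key to 24 characters.
    match = re.search(rf"\b{re.escape(key)}\s*:\s*([\d,]+)", log)
    return _to_int(match.group(1)) if match else None


def extract_tokens_trained(log: str) -> Optional[int]:
    """Return the token count reported by the log instead of a hardcoded product."""
    match = re.search(r"Total number of training tokens:\s*([\d,]+)", log)
    return _to_int(match.group(1)) if match else None


if __name__ == "__main__":
    # The unpadded pattern from the first bug never appears in the log.
    print("wte:" in SAMPLE_LOG)
    print(extract_param(SAMPLE_LOG, "wte"))
    print(extract_tokens_trained(SAMPLE_LOG))
```

Returning `None` for a missing key makes a failed match visible to the caller, rather than writing an empty CSV field without warning.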
miniseries.sh      at 28 and above we start to need batch size 8      2026-02-08 18:26:34 +00:00
runcpu.sh          merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both      2026-02-01 02:36:43 +00:00
scaling_laws.sh    fix: correct CSV extraction in scaling_laws.sh      2026-02-28 16:37:04 +00:00
speedrun.sh        fix comment      2026-02-18 23:26:22 +00:00