4 Commits

Author SHA1 Message Date
Jin Xu
00932d1955 Run 4: d26 0.5M batch, ratio 7.25 — 2.49h (9.6% faster)
Revert d26 batch size from 1M to 0.5M and lower param-data ratio from
8.25 to 7.25. In the speedrun's undertraining regime, smaller batch with
more optimization steps (12,700 vs 7,226) is more efficient than larger
batch with fewer steps.
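
The two step counts quoted above can be cross-checked with a little arithmetic. The sketch below assumes (the commit does not say this) that the number of optimizer steps equals ratio × parameter count / batch size, and that the parameter count is the same in both runs. The function name `scaled_steps` is hypothetical and used only for illustration.

```python
# Sketch under stated assumptions: steps = ratio * params / batch_size,
# with the same model (parameter count) in both runs. Not taken from the repo.

def scaled_steps(old_steps, old_batch, old_ratio, new_batch, new_ratio):
    """Estimate the new step count when only batch size and ratio change."""
    # The token budget scales with the ratio; steps scale inversely with batch size.
    return old_steps * (new_ratio / old_ratio) * (old_batch / new_batch)

# Previous config: 1M batch, ratio 8.25, 7,226 steps.
# This commit:     0.5M batch, ratio 7.25.
estimate = scaled_steps(7226, 1.0, 8.25, 0.5, 7.25)
print(round(estimate))  # 12700

# Wall-clock check: 8967 s expressed in hours.
print(round(8967 / 3600, 2))  # 2.49
```

Under these assumptions, halving the batch doubles the step count and the lower ratio trims it by a factor of 7.25/8.25, which lands on the 12,700 steps reported in the message.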

Result: CORE 0.2626, time 8967s (2.49h), val_bpb 0.750008
Reproduced: CORE 0.2729/0.2626 across two runs, both pass.

AI disclosure: experimental design and hyperparameter search were
conducted using Claude Code.
2026-02-08 23:05:13 +00:00
Andrej Karpathy
5fdd5cdb24 new leaderboard record via new auto-calculated optimal batch size. for d26 it is 1M, up from 0.5M that was default earlier
2026-02-05 20:11:32 +00:00
Sofie Van Landeghem
012da1a78b Typo fixes (#480)
* small typo

* few more small fixes

* small fixes in leaderboard.md
2026-02-05 19:12:50 +01:00
Andrej Karpathy
16b8ac7da3 oops forgot to attach leaderboard file too
2026-02-03 21:06:12 +00:00