nanochat/runs (directory listing as of 2026-03-05 10:29:53 +01:00)
miniseries.sh (2026-02-28 20:43:34 +00:00)
    fix(miniseries): extract tokens_trained from log instead of hardcoding batch size

runcpu.sh (2026-02-01 02:36:43 +00:00)
    merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both

scaling_laws.sh (2026-02-28 16:37:04 +00:00)
    fix: correct CSV extraction in scaling_laws.sh

speedrun.sh (2026-03-04 19:47:12 +00:00)
    big, breaking change but large upside: swap previous FineWeb-EDU dataset to NVIDIA ClimbMix dataset. Requires people to download the data shards. The upside is that training a GPT-2 capability model now only takes ~2 hours, down from 2.76 hours, so this is a huge win data-wise