Commit Graph

13 Commits

Author SHA1 Message Date
Andrej Karpathy
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way 2026-01-13 22:45:27 +00:00
Andrej Karpathy
eb7bbc1b66 delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
Andrej
088726aa7d
clean up model_tag handling across scripts a bit more. 2025-12-27 20:01:09 -08:00
Andrej Karpathy
2874eda59a update to new os env var to get rid of deprecation warning 2025-12-28 03:32:46 +00:00
duwenjie
92c6654b95 bugfix save and load ckpt from model_tag dir 2025-12-21 15:07:04 +08:00
Eric Silberstein
f37d45c21f remove unneeded iter() 2025-11-20 15:14:56 -05:00
svlandeg
70319851fc fix typo 2025-10-29 19:48:34 +01:00
Andrej Karpathy
8892470f29 add the SpellingBee task so that nanochat can count r in strawberry etc. along the way we had to add a bunch of new functionality, e.g. extend the calculator to support the count function of python. possibly the current TaskMixture uses way too many synthetic examples of SpellingBee because the eval gives us exactly 100% performance on spelling. We can tune this later to reclaim some wall clock time here I think 2025-10-24 14:02:48 +00:00
Andrej Karpathy
5bdc99abfb merge and resolve conflict 2025-10-21 17:19:10 +00:00
Andrej Karpathy
fe5aed940b add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok 2025-10-21 15:04:58 +00:00
karpathy
2e9669e03a upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming 2025-10-20 10:15:17 -07:00
Andrej Karpathy
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports 2025-10-15 16:42:23 +00:00
karpathy
3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00