Andrej Karpathy
|
7312ec9898
|
fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes, variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
|
2026-01-13 22:45:27 +00:00 |
|
Andrej Karpathy
|
eb7bbc1b66
|
delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts
|
2026-01-04 19:14:23 +00:00 |
|
Andrej
|
088726aa7d
|
clean up model_tag handling across scripts a bit more.
|
2025-12-27 20:01:09 -08:00 |
|
Andrej Karpathy
|
2874eda59a
|
update to new os env var to get rid of deprecation warning
|
2025-12-28 03:32:46 +00:00 |
|
duwenjie
|
92c6654b95
|
bugfix save and load ckpt from model_tag dir
|
2025-12-21 15:07:04 +08:00 |
|
Eric Silberstein
|
f37d45c21f
|
remove unneeded iter()
|
2025-11-20 15:14:56 -05:00 |
|
svlandeg
|
70319851fc
|
fix typo
|
2025-10-29 19:48:34 +01:00 |
|
Andrej Karpathy
|
8892470f29
|
add the SpellingBee task so that nanochat can count r in strawberry etc. along the way we had to add a bunch of new functionality, e.g. extend the calculator to support the count function of python. possibly the current TaskMixture uses way too many synthetic examples of SpellingBee because the eval gives us exactly 100% performance on spelling. We can tune this later to reclaim some wall clock time here I think
|
2025-10-24 14:02:48 +00:00 |
|
Andrej Karpathy
|
5bdc99abfb
|
merge and resolve conflict
|
2025-10-21 17:19:10 +00:00 |
|
Andrej Karpathy
|
fe5aed940b
|
add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully it's ok
|
2025-10-21 15:04:58 +00:00 |
|
karpathy
|
2e9669e03a
|
upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming
|
2025-10-20 10:15:17 -07:00 |
|
Andrej Karpathy
|
190d9515d0
|
don't evaluate the sampling evals during SFT, they are too slow. keep the multiple choice evals. delete unused imports
|
2025-10-15 16:42:23 +00:00 |
|
karpathy
|
3a5e0bc50b
|
initial commit
|
2025-10-13 06:49:24 -07:00 |
|