Commit Graph

10 Commits

Author | SHA1 | Message | Date
svlandeg | a58bbbaf59 | Merge branch 'master' into mps-support | 2026-01-18 14:16:54 +01:00
karpathy | f9a7e0f111 | update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption | 2026-01-17 12:27:30 -08:00
Andrej Karpathy | 7312ec9898 | fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way | 2026-01-13 22:45:27 +00:00
Andrej Karpathy | 3b50b77ed3 | fix base_loss to report correct loss by switching the dataloader to the new default | 2026-01-13 22:09:36 +00:00
Andrej Karpathy | 21608ec51e | allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway | 2026-01-12 03:10:13 +00:00
Andrej Karpathy | eb7bbc1b66 | delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts | 2026-01-04 19:14:23 +00:00
bedwards | a430ed5a63 | enable autocast for mps device | 2025-11-20 19:58:29 -06:00
karpathy | df600b6ed5 | many small tweaks. base, eval, core work now i think | 2025-10-16 15:46:18 -07:00
karpathy | 786119d593 | add autodetect of device and related stuff. getting weird warnings/errors still, so wip | 2025-10-16 10:26:19 -07:00
karpathy | 3a5e0bc50b | initial commit | 2025-10-13 06:49:24 -07:00