nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-06-20 21:19:37 +00:00

Aarushi Singh ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458) * allow top_k=0 in web api to disable filtering * adding a comment for clear reasoning * adding change to docstring		2026-01-30 09:21:41 -08:00
base_eval.py	bugfix	2025-12-26 19:02:12 +08:00
base_loss.py	update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption	2026-01-17 12:27:30 -08:00
base_train.py	Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help	2026-01-29 00:52:08 +00:00
chat_cli.py	upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming	2025-10-20 10:15:17 -07:00
chat_eval.py	Fix args in readme (#438)	2026-01-15 16:26:38 -08:00
chat_rl.py	Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help	2026-01-29 00:52:08 +00:00
chat_sft.py	Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help	2026-01-29 00:52:08 +00:00
chat_web.py	feat: allow top_k=0 in web api to disable filtering (#458)	2026-01-30 09:21:41 -08:00
mid_train.py	Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help	2026-01-29 00:52:08 +00:00
tok_eval.py	initial commit	2025-10-13 06:49:24 -07:00
tok_train.py	quick fix to not OOM main speedrun script	2026-01-26 22:31:42 +00:00