nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-05-19 22:27:37 +00:00

Amrit Bulusu 8b9a23aa92 Merge upstream master into mup_nanochat branch	2026-03-12 20:34:02 -04:00

Resolve conflicts in gpt.py and base_train.py:
- gpt.py: SP path uses Andrej's exact code, muP path layers our changes on top
- base_train.py: adopt Andrej's new defaults (warmup-steps, warmdown-ratio, etc.), keep --use-mup/--base-width args

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
base_eval.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
base_train.py	Merge upstream master into mup_nanochat branch	2026-03-12 20:34:02 -04:00
chat_cli.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
chat_eval.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
chat_rl.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
chat_sft.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
chat_web.py	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
mup_coord_check.py	adding muP considerations for muon	2026-03-12 00:45:48 -04:00
mup_transfer_check.py	adding muP considerations for muon	2026-03-12 00:45:48 -04:00
tok_eval.py	initial commit	2025-10-13 06:49:24 -07:00
tok_train.py	quick fix to not OOM main speedrun script	2026-01-26 22:31:42 +00:00