nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-07-02 19:09:15 +00:00

Andrej Karpathy 4ddc803797 fix adamw slight bug. this chunk was copy pasted originally from modded-nanogpt, which still seems to have the bug		2026-01-08 18:18:42 +00:00
__init__.py	initial commit	2025-10-13 06:49:24 -07:00
adamw.py	fix adamw slight bug. this chunk was copy pasted originally from modded-nanogpt, which still seems to have the bug	2026-01-08 18:18:42 +00:00
checkpoint_manager.py	rename checkpoint_dir to checkpoints_dir for consistency.	2025-12-08 18:32:12 -08:00
common.py	fix: safe DDP cleanup (check initialized PG, not just env) (#256)	2025-12-27 20:27:40 -08:00
core_eval.py	initial commit	2025-10-13 06:49:24 -07:00
dataloader.py	feat: pad vocab size to 64 for DDP optimizers and efficiency	2025-12-09 12:38:18 +01:00
dataset.py	initial commit	2025-10-13 06:49:24 -07:00
engine.py	delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts	2026-01-04 19:14:23 +00:00
execution.py	nit delete redundant catch/raise in execute	2025-10-29 08:10:03 -07:00
gpt.py	nudge hyperparameters of the base script with the results of the sweeps and miniseries. vocab size down to 32K. D:N ratio from 20 to 8. add miniseries script	2026-01-07 22:11:59 +00:00
logo.svg	initial commit	2025-10-13 06:49:24 -07:00
loss_eval.py	fix typos	2025-11-14 11:20:25 +01:00
muon.py	initial commit	2025-10-13 06:49:24 -07:00
report.py	fix small bug where this would break if git stage has deleted files	2026-01-04 19:11:43 +00:00
tokenizer.py	alright add transformers as a dep of the repo because it should be easy to evaluate the CORE score of HF models. Not super happy about it but i tried it and the uv.lock doesn't get bloated as much as i expected	2026-01-04 20:37:28 +00:00
ui.html	Fix conversation scroll to bottom on some browsers + remove duplicated padding (#348)	2025-12-31 13:03:22 -08:00