nanochat/nanochat
2026-01-08 18:18:42 +00:00
__init__.py initial commit 2025-10-13 06:49:24 -07:00
adamw.py fix adamw slight bug. this chunk was copy pasted originally from modded-nanogpt, which still seems to have the bug 2026-01-08 18:18:42 +00:00
checkpoint_manager.py rename checkpoint_dir to checkpoints_dir for consistency. 2025-12-08 18:32:12 -08:00
common.py fix: safe DDP cleanup (check initialized PG, not just env) (#256) 2025-12-27 20:27:40 -08:00
core_eval.py initial commit 2025-10-13 06:49:24 -07:00
dataloader.py feat: pad vocab size to 64 for DDP optimizers and efficiency 2025-12-09 12:38:18 +01:00
dataset.py initial commit 2025-10-13 06:49:24 -07:00
engine.py delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
execution.py nit delete redundant catch/raise in execute 2025-10-29 08:10:03 -07:00
gpt.py nudge hyperparameters of the base script with the results of the sweeps and miniseries. vocab size down to 32K. D:N ratio from 20 to 8. add miniseries script 2026-01-07 22:11:59 +00:00
logo.svg initial commit 2025-10-13 06:49:24 -07:00
loss_eval.py fix typos 2025-11-14 11:20:25 +01:00
muon.py initial commit 2025-10-13 06:49:24 -07:00
report.py fix small bug where this would break if git stage has deleted files 2026-01-04 19:11:43 +00:00
tokenizer.py alright add transformers as a dep of the repo because it should be easy to evaluate the CORE score of HF models. Not super happy about it but i tried it and the uv.lock doesn't get bloated as much as i expected 2026-01-04 20:37:28 +00:00
ui.html Fix conversation scroll to bottom on some browsers + remove duplicated padding (#348) 2025-12-31 13:03:22 -08:00