nanochat/nanochat
2025-10-16 15:46:18 -07:00
__init__.py initial commit 2025-10-13 06:49:24 -07:00
adamw.py initial commit 2025-10-13 06:49:24 -07:00
checkpoint_manager.py Fix: Handle missing d<number> model tags in find_largest_model 2025-10-14 00:24:07 +03:00
common.py many small tweaks. base, eval, core work now i think 2025-10-16 15:46:18 -07:00
configurator.py initial commit 2025-10-13 06:49:24 -07:00
core_eval.py initial commit 2025-10-13 06:49:24 -07:00
dataloader.py trying to add basic cpu support, will try mps too 2025-10-16 16:14:38 +00:00
dataset.py initial commit 2025-10-13 06:49:24 -07:00
engine.py initial commit 2025-10-13 06:49:24 -07:00
execution.py initial commit 2025-10-13 06:49:24 -07:00
gpt.py add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights 2025-10-16 10:04:43 -07:00
logo.svg initial commit 2025-10-13 06:49:24 -07:00
loss_eval.py add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights 2025-10-16 10:04:43 -07:00
muon.py initial commit 2025-10-13 06:49:24 -07:00
report.py many small tweaks. base, eval, core work now i think 2025-10-16 15:46:18 -07:00
tokenizer.py initial commit 2025-10-13 06:49:24 -07:00
ui.html also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate 2025-10-16 01:28:37 +00:00