nanochat/nanochat
2026-02-20 08:29:12 -05:00
__init__.py initial commit 2025-10-13 06:49:24 -07:00
checkpoint_manager.py tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft 2026-02-18 15:49:18 +00:00
common.py Fix bug in setting precision (#538) 2026-02-18 15:49:18 +00:00
core_eval.py initial commit 2025-10-13 06:49:24 -07:00
dataloader.py slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector 2026-02-02 01:17:30 +00:00
dataset.py initial commit 2025-10-13 06:49:24 -07:00
engine.py fix: pass device_type to compute_init in engine.__main__ (#451) 2026-01-19 17:19:51 -08:00
execution.py nit delete redundant catch/raise in execute 2025-10-29 08:10:03 -07:00
flash_attention.py Add Blackwell (SM100) GPU support via SDPA fallback (#475) 2026-01-31 19:42:58 -08:00
fp8.py Removed redundant quantization of gradients 2026-02-15 15:41:33 +00:00
gpt.py fix RoPE cache overflow with kv-cache by growing rope buffers 2026-02-20 08:29:12 -05:00
logo.svg initial commit 2025-10-13 06:49:24 -07:00
loss_eval.py fix typos 2025-11-14 11:20:25 +01:00
optim.py bring back an assert guarding against bad param sizing 2026-02-05 18:14:30 +00:00
report.py remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
tokenizer.py adjust the comment on the regex pattern per recent experiment, see dev/LOG.md 2026-01-13 17:50:39 +00:00
ui.html Fix conversation scroll to bottom on some browsers + remove duplicated padding (#348) 2025-12-31 13:03:22 -08:00