nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-01-26 05:14:13 +00:00

svlandeg 38e4e0dd7b Merge branch 'master' into fix/fa3-fallback-mps		2026-01-16 09:59:03 +01:00
__init__.py
adamw.py	fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent	2026-01-15 23:30:44 +00:00
checkpoint_manager.py	minor helpful message	2026-01-15 03:20:21 +00:00
common.py	fix: safe DDP cleanup (check initialized PG, not just env) (#256)	2025-12-27 20:27:40 -08:00
core_eval.py
dataloader.py	Big DataLoader refactor: BOS-aligned dataloaders with epoch tracking for pre/mid-training	2026-01-13 20:05:47 +00:00
dataset.py
engine.py	integrate Flash Attention 3. +9% tok_per_sec for d12 with ctx even as low as 2048 out of the box nice. also, ready to tune windows huge	2026-01-11 20:33:19 +00:00
execution.py	nit delete redundant catch/raise in execute	2025-10-29 08:10:03 -07:00
gpt.py	feat: restrict FA3 loading to Hopper+ GPUs (SM90+) to fix crashes on consumer hardware	2026-01-14 22:14:42 +01:00
logo.svg
loss_eval.py	fix typos	2025-11-14 11:20:25 +01:00
muon.py	changes and optimizations to muon, making it more efficient and simpler/cleaner a bit	2026-01-15 03:20:48 +00:00
report.py	fix small bug where this would break if git stage has deleted files	2026-01-04 19:11:43 +00:00
tokenizer.py	adjust the comment on the regex pattern per recent experiment see dev/LOG.md	2026-01-13 17:50:39 +00:00
ui.html	Fix conversation scroll to bottom on some browsers + remove duplicated padding (#348)	2025-12-31 13:03:22 -08:00