nanochat/nanochat
2025-11-01 16:04:38 +00:00
__init__.py initial commit 2025-10-13 06:49:24 -07:00
adamw.py fix: remove unnecessary tensor allocation in DistAdamW optimizer 2025-10-20 12:03:26 +03:00
checkpoint_manager.py Fix: Handle missing d<number> model tags in find_largest_model 2025-10-14 00:24:07 +03:00
common.py move eval bundle download to be lazy and inside the python code so that we can substantially simplify the run bash scripts 2025-11-01 16:04:38 +00:00
configurator.py initial commit 2025-10-13 06:49:24 -07:00
core_eval.py initial commit 2025-10-13 06:49:24 -07:00
dataloader.py Fix Torch crash caused by pinning on CPU 2025-10-22 16:25:36 +00:00
dataset.py initial commit 2025-10-13 06:49:24 -07:00
engine.py tiny fix to comment 2025-11-01 07:43:57 -07:00
execution.py nit delete redundant catch/raise in execute 2025-10-29 08:10:03 -07:00
gpt.py use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available 2025-10-21 18:07:33 +00:00
logo.svg initial commit 2025-10-13 06:49:24 -07:00
loss_eval.py Merge pull request #35 from bhaskar0210s/master 2025-10-29 08:06:24 -07:00
muon.py initial commit 2025-10-13 06:49:24 -07:00
report.py many small tweaks. base, eval, core work now i think 2025-10-16 15:46:18 -07:00
tokenizer.py allow the tokenizer visualize_tokenization to also print the exact token id. you can never be paranoid enough 2025-10-24 13:27:05 +00:00
ui.html fix(ui): prevent iOS Safari toolbar from covering input on initial load 2025-10-21 17:34:40 -07:00