nanochat/nanochat
xiayan0118 6a477eedbd
fix: pass device_type to compute_init in engine.__main__ (#451)
When running engine.py directly on non-GPU devices (CPU, MPS),
compute_init() needs the device_type parameter to initialize correctly.
This fixes failures on machines without CUDA support.
2026-01-19 17:19:51 -08:00
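A minimal sketch of the pattern this commit describes, assuming a CUDA, then MPS, then CPU preference order. Both `pick_device_type` and the `compute_init` shown here are hypothetical stand-ins for illustration. The real `compute_init` and its exact signature are not shown in this listing.

```python
# Hypothetical stand-ins: these are NOT the real nanochat functions.

def pick_device_type(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def compute_init(device_type: str = "cuda") -> str:
    """Stand-in initializer: validates the device type and returns it."""
    if device_type not in ("cuda", "mps", "cpu"):
        raise ValueError(f"unsupported device type: {device_type}")
    return device_type


if __name__ == "__main__":
    # Pass the detected type explicitly instead of relying on a CUDA default,
    # which is the kind of failure the commit message describes.
    device_type = pick_device_type(cuda_available=False, mps_available=True)
    print(compute_init(device_type))
```

The point of the sketch is only that a caller which omits the argument inherits a CUDA default, so a script entry point on a CPU or MPS machine has to forward the detected device type itself.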
__init__.py
adamw.py fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent 2026-01-15 23:30:44 +00:00
checkpoint_manager.py minor helpful message 2026-01-15 03:20:21 +00:00
common.py more GPU types from PR 147 thanks @Qubitium 2026-01-17 03:22:20 +00:00
core_eval.py
dataloader.py Reduce token waste in BOS bestfit by cropping shortest doc (#445) 2026-01-16 18:50:34 -08:00
dataset.py
engine.py fix: pass device_type to compute_init in engine.__main__ (#451) 2026-01-19 17:19:51 -08:00
execution.py nit delete redundant catch/raise in execute 2025-10-29 08:10:03 -07:00
flash_attention.py naturally i failed to include the actual code in the previous commit facepalm 2026-01-16 17:39:41 +00:00
gpt.py update the default GPTConfig kwargs otherwise they are confusing 2026-01-17 21:16:46 +00:00
logo.svg
loss_eval.py fix typos 2025-11-14 11:20:25 +01:00
muon.py changes and optimizations to muon, making it more efficient and simpler/cleaner a bit 2026-01-15 03:20:48 +00:00
report.py fix small bug where this would break if git stage has deleted files 2026-01-04 19:11:43 +00:00
tokenizer.py adjust the comment on the regex pattern per recent experiment see dev/LOG.md 2026-01-13 17:50:39 +00:00
ui.html Fix conversation scroll to bottom on some browsers + remove duplicated padding (#348) 2025-12-31 13:03:22 -08:00