Commit Graph

12 Commits

Author SHA1 Message Date
CadBane
77593b77d4 Final Commit after extensive Mamba Block Architecture feature add-ons 2025-10-15 11:19:07 +02:00
CadBane
d7c1db6408 Added Mamba architecture support
On branch feature-add-mamba-arch-support
 Changes to be committed:
	new file:   IMPLEMENTATION_SUMMARY.md
	new file:   MAMBA_INTEGRATION.md
	new file:   QUICKSTART_MAMBA.md
	new file:   configs/README.md
	new file:   configs/hybrid_alternating_d20.py
	new file:   configs/hybrid_early_t_late_m_d20.py
	new file:   configs/mamba_d20.py
	new file:   configs/rtx3070_d16.py
	new file:   configs/transformer_d20.py
	new file:   nanochat/blocks/__init__.py
	new file:   nanochat/blocks/mamba_block.py
	new file:   nanochat/blocks/transformer_block.py
	modified:   nanochat/checkpoint_manager.py
	modified:   nanochat/gpt.py
	new file:   tests/test_hybrid_blocks.py
2025-10-15 10:32:22 +02:00
Andrej
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
Export the cache directory so that users can use their own cache location
2025-10-14 16:01:28 -07:00
Zach Mueller
f0855cbcc7 Update speedrun.sh 2025-10-14 14:12:01 -04:00
Andrej
dd6ff9a1cc fix bug in fallback case of find_largest_model
Fix: Handle missing d<number> model tags in find_largest_model
ty
2025-10-13 14:38:34 -07:00
Mirza-Samad-Ahmed-Baig
afaa5b4c90 Fix: Handle missing d<number> model tags in find_largest_model 2025-10-14 00:24:07 +03:00
Andrej
5fd0b13886 Merge pull request #2 from epoyraz/patch-1
Update README.md
2025-10-13 10:10:15 -07:00
Enes Poyraz
6a795baf27 Update README.md
fix typos
2025-10-13 18:40:12 +02:00
Andrej
626bd3e260 Add image of the WebUI to readme 2025-10-13 08:03:00 -07:00
karpathy
da96b46565 update link to the new discussion 2025-10-13 07:42:09 -07:00
karpathy
a53833d04f add nanochat logo png 2025-10-13 06:59:59 -07:00
karpathy
3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00