• Joined on 2024-05-31
tacit synced commits to refs/pull/449/merge at tacit/nanochat from mirror 2026-01-31 23:59:50 +00:00
1ddaad1c1c nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtrianing is not yet fully properly erased across the board, but good enough for step 1
348fbb301b fix dataloader for midtrain to never crop data. we can't just throw it away like we do in pretraining
3c3a3d7042 warmdown of 0.5 is slightly better:
Compare 4 commits »
tacit synced commits to refs/pull/475/head at tacit/nanochat from mirror 2026-01-31 23:59:50 +00:00
4e70a2b678 Update nanochat/flash_attention.py
tacit synced commits to master at tacit/nanochat from mirror 2026-01-31 23:59:50 +00:00
1ddaad1c1c nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtrianing is not yet fully properly erased across the board, but good enough for step 1
348fbb301b fix dataloader for midtrain to never crop data. we can't just throw it away like we do in pretraining
Compare 2 commits »
tacit synced commits to refs/pull/475/merge at tacit/nanochat from mirror 2026-01-31 23:59:50 +00:00
4e70a2b678 Update nanochat/flash_attention.py
1ddaad1c1c nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtrianing is not yet fully properly erased across the board, but good enough for step 1
348fbb301b fix dataloader for midtrain to never crop data. we can't just throw it away like we do in pretraining
Compare 4 commits »
tacit synced commits to refs/pull/393/merge at tacit/nanochat from mirror 2026-01-31 23:59:50 +00:00
1ddaad1c1c nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtrianing is not yet fully properly erased across the board, but good enough for step 1
348fbb301b fix dataloader for midtrain to never crop data. we can't just throw it away like we do in pretraining
3c3a3d7042 warmdown of 0.5 is slightly better:
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
Compare 10 commits »
tacit synced commits to refs/pull/59/merge at tacit/nanochat from mirror 2026-01-31 15:49:50 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
Compare 11 commits »
tacit synced commits to refs/pull/455/merge at tacit/nanochat from mirror 2026-01-31 15:49:50 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
Compare 2 commits »
tacit synced commits to refs/pull/400/merge at tacit/nanochat from mirror 2026-01-31 15:49:49 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
Compare 8 commits »
tacit synced commits to refs/pull/475/merge at tacit/nanochat from mirror 2026-01-31 07:39:50 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
Compare 2 commits »
tacit synced commits to refs/pull/312/merge at tacit/nanochat from mirror 2026-01-31 07:39:50 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
Compare 8 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-01-31 07:39:49 +00:00
3c3a3d7042 warmdown of 0.5 is slightly better:
tacit synced commits to refs/pull/455/merge at tacit/nanochat from mirror 2026-01-30 23:29:55 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 10 commits »
tacit synced commits to refs/pull/449/merge at tacit/nanochat from mirror 2026-01-30 23:29:53 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 7 commits »
tacit synced commits to refs/pull/442/merge at tacit/nanochat from mirror 2026-01-30 23:29:52 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 10 commits »
tacit synced commits to refs/pull/437/merge at tacit/nanochat from mirror 2026-01-30 23:29:52 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 7 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-01-30 23:29:51 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
02baa15405 i am feeling in a delete mood today. i need to delete a lot of code. there is too much code and surface area and complexity. ew
Compare 6 commits »
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-01-30 23:29:51 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 7 commits »
tacit synced commits to refs/pull/399/merge at tacit/nanochat from mirror 2026-01-30 23:29:51 +00:00
4d8dbaf6e0 Fix escape character in README bibtex entry (#454)
3ba42e8135 Fix SDPA KV-cache decode to respect sliding window (#456)
ace6740bdd feat: allow top_k=0 in web api to disable filtering (#458)
2e17723817 Fix generate() crash when top_k=0 (#467)
Compare 10 commits »
tacit synced commits to refs/pull/311/merge at tacit/nanochat from mirror 2026-01-30 23:29:51 +00:00
067daa7758 small fix cpu script ty PR #474
6a341f2ecf contiguous views and single HtoD transfer for inputs/targets much cleaner
ebd4d9bbf5 tried muonh, appealing but didn't work out of the box
Compare 4 commits »
tacit synced commits to refs/pull/312/merge at tacit/nanochat from mirror 2026-01-30 23:29:51 +00:00
067daa7758 small fix cpu script ty PR #474
6a341f2ecf contiguous views and single HtoD transfer for inputs/targets much cleaner
ebd4d9bbf5 tried muonh, appealing but didn't work out of the box
Compare 4 commits »