• Joined on 2024-05-31
tacit synced commits to refs/pull/151/head at tacit/nanochat from mirror 2026-02-13 14:09:52 +00:00
f957c8a2ec transformers is already in the repo's deps
tacit synced commits to refs/pull/296/merge at tacit/nanochat from mirror 2026-02-13 14:09:52 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster. despite of it being significantly simpler and much less code. i don't fully understand why/how atm
1ec0a34779 at 28 and above we start to need batch size 8
ff46300720 tune miniseries just a bit, fairly cosmetic, keep to even depths where the math works out nicely in model sizing
tacit synced and deleted reference refs/tags/refs/pull/399/merge at tacit/nanochat from mirror 2026-02-13 14:09:52 +00:00
tacit synced and deleted reference refs/tags/refs/pull/59/merge at tacit/nanochat from mirror 2026-02-13 14:09:52 +00:00
tacit synced and deleted reference refs/tags/refs/pull/128/merge at tacit/nanochat from mirror 2026-02-13 14:09:51 +00:00
tacit synced and deleted reference refs/tags/refs/pull/32/merge at tacit/nanochat from mirror 2026-02-13 14:09:51 +00:00
tacit synced commits to refs/pull/498/merge at tacit/nanochat from mirror 2026-02-13 05:59:52 +00:00
330fa1188c Merge origin/master into muonh
25ec1e6c43 Merge branch 'master' into muonh-submit
116900ac16 muonh
5a965c1383 Remove runs/scaling_laws_muonh.sh
tacit synced commits to refs/pull/498/head at tacit/nanochat from mirror 2026-02-13 05:59:52 +00:00
330fa1188c Merge origin/master into muonh
25ec1e6c43 Merge branch 'master' into muonh-submit
116900ac16 muonh
5a965c1383 Remove runs/scaling_laws_muonh.sh
fe2a80badd Replace torchao with minimal custom FP8 implementation
tacit created branch main in tacit/keys 2026-02-12 10:57:28 +00:00
tacit pushed to main at tacit/keys 2026-02-12 10:57:28 +00:00
tacit created repository tacit/keys 2026-02-12 10:56:51 +00:00
tacit synced commits to refs/pull/515/merge at tacit/nanochat from mirror 2026-02-11 21:19:51 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/486/merge at tacit/nanochat from mirror 2026-02-11 21:19:51 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/425/merge at tacit/nanochat from mirror 2026-02-11 21:19:51 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster. despite of it being significantly simpler and much less code. i don't fully understand why/how atm
tacit synced commits to refs/pull/414/merge at tacit/nanochat from mirror 2026-02-11 21:19:51 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster. despite of it being significantly simpler and much less code. i don't fully understand why/how atm
1ec0a34779 at 28 and above we start to need batch size 8
ff46300720 tune miniseries just a bit, fairly cosmetic, keep to even depths where the math works out nicely in model sizing
tacit synced commits to refs/pull/511/merge at tacit/nanochat from mirror 2026-02-11 13:09:55 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/483/merge at tacit/nanochat from mirror 2026-02-11 13:09:55 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/509/merge at tacit/nanochat from mirror 2026-02-11 13:09:55 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/510/merge at tacit/nanochat from mirror 2026-02-11 13:09:55 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about
tacit synced commits to refs/pull/489/merge at tacit/nanochat from mirror 2026-02-11 13:09:55 +00:00
2f09686724 clarify that this is bf16 mfu we're talking about