Joined on 2024-05-31
tacit synced commits to refs/pull/204/merge at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to refs/pull/536/merge at tacit/nanochat from mirror 2026-02-16 16:40:27 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
3735eb9723 simplify test.yml
7686d3c7e2 Update test.yml
3185f928d7 Update .github/workflows/test.yml
Compare 5 commits »
tacit synced commits to refs/pull/535/merge at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to refs/pull/533/head at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
240a60fec2 Add informative error message to batch size assertion
0f3b6a4654 Replace cryptic assertion with descriptive ValueError for batch size alignment
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 3 commits »
tacit synced commits to refs/pull/533/merge at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
240a60fec2 Add informative error message to batch size assertion
0f3b6a4654 Replace cryptic assertion with descriptive ValueError for batch size alignment
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 4 commits »
tacit synced commits to refs/pull/536/head at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
3735eb9723 simplify test.yml
7686d3c7e2 Update test.yml
3185f928d7 Update .github/workflows/test.yml
Compare 3 commits »
tacit synced commits to refs/pull/516/merge at tacit/nanochat from mirror 2026-02-16 16:40:25 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-16 16:40:25 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/498/merge at tacit/nanochat from mirror 2026-02-16 16:40:25 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to refs/pull/520/merge at tacit/nanochat from mirror 2026-02-16 16:40:25 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced commits to refs/pull/526/merge at tacit/nanochat from mirror 2026-02-16 16:40:25 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 2 commits »
tacit synced and deleted reference refs/tags/refs/pull/151/merge at tacit/nanochat from mirror 2026-02-16 16:40:24 +00:00
tacit synced and deleted reference refs/tags/refs/pull/492/merge at tacit/nanochat from mirror 2026-02-16 00:20:29 +00:00
tacit synced and deleted reference refs/tags/refs/pull/515/merge at tacit/nanochat from mirror 2026-02-16 00:20:29 +00:00
tacit synced and deleted reference refs/tags/refs/pull/512/merge at tacit/nanochat from mirror 2026-02-16 00:20:29 +00:00
tacit synced and deleted reference refs/tags/refs/pull/477/merge at tacit/nanochat from mirror 2026-02-16 00:20:29 +00:00
tacit synced commits to refs/pull/455/merge at tacit/nanochat from mirror 2026-02-13 14:09:54 +00:00
28d5052b0e Merge branch 'master' into fix/cpu_report
2f09686724 clarify that this is bf16 mfu we're talking about
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster. despite of it being significantly simpler and much less code. i don't fully understand why/how atm
1ec0a34779 at 28 and above we start to need batch size 8
Compare 40 commits »
tacit synced commits to refs/pull/455/head at tacit/nanochat from mirror 2026-02-13 14:09:54 +00:00
28d5052b0e Merge branch 'master' into fix/cpu_report
2f09686724 clarify that this is bf16 mfu we're talking about
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster. despite of it being significantly simpler and much less code. i don't fully understand why/how atm
1ec0a34779 at 28 and above we start to need batch size 8
ff46300720 tune miniseries just a bit, fairly cosmetic, keep to even depths where the math works out nicely in model sizing
Compare 53 commits »
tacit synced commits to refs/pull/510/merge at tacit/nanochat from mirror 2026-02-13 14:09:54 +00:00
26bc859fc7 Merge branch 'master' into fix/comment
Compare 2 commits »