• Joined on 2024-05-31
tacit synced commits to refs/pull/536/merge at tacit/nanochat from mirror 2026-02-17 00:50:25 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to refs/pull/535/merge at tacit/nanochat from mirror 2026-02-17 00:50:25 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to refs/pull/531/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/521/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/510/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/516/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to refs/pull/509/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/501/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/489/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/485/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/498/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to refs/pull/486/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/533/merge at tacit/nanochat from mirror 2026-02-17 00:50:24 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to master at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/204/merge at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-02-17 00:50:23 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/536/merge at tacit/nanochat from mirror 2026-02-16 16:40:27 +00:00
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
3735eb9723 simplify test.yml
7686d3c7e2 Update test.yml
3185f928d7 Update .github/workflows/test.yml
tacit synced commits to refs/pull/533/head at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
240a60fec2 Add informative error message to batch size assertion
0f3b6a4654 Replace cryptic assertion with descriptive ValueError for batch size alignment
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
tacit synced commits to refs/pull/536/head at tacit/nanochat from mirror 2026-02-16 16:40:26 +00:00
3735eb9723 simplify test.yml
7686d3c7e2 Update test.yml
3185f928d7 Update .github/workflows/test.yml