Joined on 2024-05-31
tacit synced commits to refs/pull/510/merge at tacit/nanochat from mirror 2026-02-18 09:30:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
Compare 4 commits »
tacit synced commits to refs/pull/516/merge at tacit/nanochat from mirror 2026-02-18 09:30:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
Compare 2 commits »
tacit synced commits to refs/pull/485/merge at tacit/nanochat from mirror 2026-02-18 09:30:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
Compare 4 commits »
tacit synced commits to refs/pull/533/merge at tacit/nanochat from mirror 2026-02-18 09:30:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
4a6e47b0c6 update dev log with recent
Compare 3 commits »
tacit synced commits to refs/pull/498/merge at tacit/nanochat from mirror 2026-02-18 09:30:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
Compare 2 commits »
tacit synced commits to refs/pull/414/merge at tacit/nanochat from mirror 2026-02-18 09:30:23 +00:00
4800c62f6e Fix MockModel's device definition (#535)
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 5 commits »
tacit synced commits to refs/pull/536/merge at tacit/nanochat from mirror 2026-02-18 01:20:25 +00:00
4800c62f6e Fix MockModel's device definition (#535)
Compare 2 commits »
tacit synced commits to refs/pull/531/merge at tacit/nanochat from mirror 2026-02-18 01:20:25 +00:00
4800c62f6e Fix MockModel's device definition (#535)
Compare 2 commits »
tacit synced commits to refs/pull/455/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 4 commits »
tacit synced commits to refs/pull/511/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
788dadeb88 a number of upgrades to SFT script to bring it up to date w.r.t. pretraining and tuning some of its kwargs based on sweeps
Compare 4 commits »
tacit synced commits to refs/pull/204/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4a6e47b0c6 update dev log with recent
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
Compare 3 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
tacit synced commits to refs/pull/516/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4a6e47b0c6 update dev log with recent
Compare 2 commits »
tacit synced commits to refs/pull/526/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
4800c62f6e Fix MockModel's device definition (#535)
4a6e47b0c6 update dev log with recent
Compare 3 commits »
tacit synced and deleted reference refs/tags/refs/pull/535/merge at tacit/nanochat from mirror 2026-02-18 01:20:24 +00:00
tacit synced commits to refs/pull/536/merge at tacit/nanochat from mirror 2026-02-17 17:10:23 +00:00
4a6e47b0c6 update dev log with recent
Compare 2 commits »
tacit synced commits to refs/pull/531/merge at tacit/nanochat from mirror 2026-02-17 17:10:23 +00:00
4a6e47b0c6 update dev log with recent
Compare 2 commits »
tacit synced commits to refs/pull/501/merge at tacit/nanochat from mirror 2026-02-17 17:10:23 +00:00
4a6e47b0c6 update dev log with recent
Compare 2 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-17 17:10:23 +00:00
4a6e47b0c6 update dev log with recent
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-02-17 17:10:23 +00:00
8180e1d8c1 tune the data mixture a bit, load optimizer by default when SFT. These were confirmed to be best settings from sweeps of sft
Compare 2 commits »