• Joined on 2024-05-31
tacit synced commits to refs/pull/437/merge at tacit/nanochat from mirror 2026-02-02 08:40:00 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 14 commits »
tacit synced commits to refs/pull/393/merge at tacit/nanochat from mirror 2026-02-02 08:39:59 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 11 commits »
tacit synced commits to refs/pull/409/merge at tacit/nanochat from mirror 2026-02-02 08:39:59 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 6 commits »
tacit synced commits to refs/pull/328/merge at tacit/nanochat from mirror 2026-02-02 08:39:58 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 6 commits »
tacit synced commits to refs/pull/151/merge at tacit/nanochat from mirror 2026-02-02 08:39:57 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 33 commits »
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-02-02 08:39:56 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
Compare 6 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-02 08:39:55 +00:00
230d6cf6c6 tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3
07c4dd4cd9 manually control the over-active garbage collector, save a small few minutes from a typical run
e8fec97d4c slightly more efficient dataloader that reduces the number of python objects flying around and causing strain on runtime and garbage collector
Compare 3 commits »
tacit synced commits to refs/pull/486/merge at tacit/nanochat from mirror 2026-02-02 00:29:57 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
Compare 3 commits »
tacit synced commits to refs/pull/485/merge at tacit/nanochat from mirror 2026-02-02 00:29:57 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
Compare 3 commits »
tacit synced commits to refs/pull/484/merge at tacit/nanochat from mirror 2026-02-02 00:29:57 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
Compare 3 commits »
tacit synced commits to refs/pull/483/merge at tacit/nanochat from mirror 2026-02-02 00:29:56 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
Compare 3 commits »
tacit synced commits to refs/pull/328/merge at tacit/nanochat from mirror 2026-02-02 00:29:55 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
Compare 28 commits »
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-02-02 00:29:55 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
Compare 11 commits »
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-02-02 00:29:53 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
Compare 18 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-02 00:29:52 +00:00
8b4849d548 fix bug in chat_sft, the attention window must be preserved sigh
eaf49a33c8 fix path which i think was modified during the refactor and this is a bug introduced by claude i believe
Compare 2 commits »
tacit synced commits to refs/pull/296/merge at tacit/nanochat from mirror 2026-02-01 16:19:51 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
Compare 26 commits »
tacit synced commits to refs/pull/409/merge at tacit/nanochat from mirror 2026-02-01 16:19:51 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
Compare 7 commits »
tacit synced commits to refs/pull/480/merge at tacit/nanochat from mirror 2026-02-01 08:09:51 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
Compare 6 commits »
tacit synced commits to master at tacit/nanochat from mirror 2026-02-01 08:09:50 +00:00
31b61d2d17 fix broken import sigh
4d6415b8ef use _PEAK_FLOPS_TABLE instead of if-else structure (#479)
43078c347e clean up original tokenizing_distributed_data_loader (#478)
dc291c627f Add Blackwell (SM100) GPU support via SDPA fallback (#475)
0307997f9b merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both
Compare 5 commits »
tacit synced commits to refs/pull/477/head at tacit/nanochat from mirror 2026-02-01 08:09:50 +00:00
c9f9cc928e Reverting config to existing
232d1341be Garbage collect after step 1 and freeze
Compare 2 commits »
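Commits 232d1341be ("Garbage collect after step 1 and freeze") and 07c4dd4cd9 ("manually control the over-active garbage collector") both describe taking manual control of Python's garbage collector during a training run. The feed does not show the actual nanochat change, so the following is only a minimal sketch of that general pattern using the standard `gc` module. The function name `run_steps`, the `step_fn` callback, and the `collect_every` interval are hypothetical and chosen for illustration.

```python
import gc


def run_steps(num_steps, step_fn, collect_every=1000):
    """Run step_fn for num_steps steps with manually managed garbage collection.

    Hypothetical illustration only; not the nanochat implementation.
    Returns the number of periodic manual collections performed.
    """
    was_enabled = gc.isenabled()
    manual_collections = 0
    try:
        for step in range(1, num_steps + 1):
            step_fn(step)
            if step == 1:
                # After the first step the long-lived objects (model, optimizer,
                # dataloader state) exist. Collect setup garbage once, then move
                # the survivors to the permanent generation so later collections
                # skip them, and switch off automatic collection.
                gc.collect()
                gc.freeze()
                gc.disable()
            elif step % collect_every == 0:
                # Occasional manual collection keeps cyclic garbage bounded.
                gc.collect()
                manual_collections += 1
    finally:
        # Restore the collector so the caller sees its original state.
        gc.unfreeze()
        if was_enabled:
            gc.enable()
    return manual_collections


if __name__ == "__main__":
    count = run_steps(10, lambda step: None, collect_every=5)
    print("manual collections:", count)
```

With automatic collection disabled, reference counting still frees most objects immediately; only reference cycles wait for the periodic `gc.collect()` call, so the interval trades pause frequency against memory held by uncollected cycles.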