• Joined on 2024-05-31
tacit synced commits to refs/pull/399/merge at tacit/nanochat from mirror 2026-01-16 08:24:05 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced commits to refs/pull/400/merge at tacit/nanochat from mirror 2026-01-16 08:24:05 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced commits to refs/pull/393/merge at tacit/nanochat from mirror 2026-01-16 08:24:05 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced commits to refs/pull/204/merge at tacit/nanochat from mirror 2026-01-16 08:24:04 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced commits to refs/pull/151/merge at tacit/nanochat from mirror 2026-01-16 08:24:03 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced commits to master at tacit/nanochat from mirror 2026-01-16 08:24:02 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
d4ea28d4e2 Fix args in readme (#438)
5 commits in this sync
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-01-16 08:24:02 +00:00
50413d2d67 typo in comments: change "GAPO" to "DAPO"
fbf2bbea25 update log with a bunch of attempts
747ed4491f add negative result on olmo3 pretraining mix
7d1700c521 add zstd lib
9 commits in this sync
tacit synced and deleted reference refs/tags/refs/pull/412/merge at tacit/nanochat from mirror 2026-01-16 08:24:01 +00:00
tacit synced and deleted reference refs/tags/refs/pull/438/merge at tacit/nanochat from mirror 2026-01-16 08:24:01 +00:00
tacit synced commits to refs/pull/425/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
3 commits in this sync
tacit synced commits to refs/pull/414/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
3 commits in this sync
tacit synced commits to refs/pull/85/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
3 commits in this sync
tacit synced commits to refs/pull/429/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
745b156a0b Merge branch 'master' into fix/shard_count
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
8 commits in this sync
tacit synced commits to refs/pull/431/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
bdcc030ffa oops legacy spurious line now
22a71aa3d3 fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent
255f8b9af6 cleanly separate cpu and gpu sections
4 commits in this sync
tacit synced commits to refs/pull/438/head at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
785b214b84 add required -i flag to chat_eval example runs
a91ad6b4b1 Merge branch 'master' into fix/args
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
4 commits in this sync
tacit synced commits to refs/pull/429/head at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
745b156a0b Merge branch 'master' into fix/shard_count
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
10 commits in this sync
tacit synced commits to refs/pull/438/merge at tacit/nanochat from mirror 2026-01-16 00:14:07 +00:00
785b214b84 add required -i flag to chat_eval example runs
a91ad6b4b1 Merge branch 'master' into fix/args
3 commits in this sync
tacit synced commits to refs/pull/400/head at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
a39c303912 Merge branch 'master' into fix/grad_acc_norm
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
45 commits in this sync
tacit synced commits to refs/pull/409/merge at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
bdcc030ffa oops legacy spurious line now
22a71aa3d3 fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent
255f8b9af6 cleanly separate cpu and gpu sections
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
6 commits in this sync
tacit synced commits to refs/pull/412/merge at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
3 commits in this sync