• Joined on 2024-05-31
tacit synced commits to refs/pull/409/merge at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
bdcc030ffa oops legacy spurious line now
22a71aa3d3 fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent
255f8b9af6 cleanly separate cpu and gpu sections
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
tacit synced commits to refs/pull/400/head at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
a39c303912 Merge branch 'master' into fix/grad_acc_norm
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
tacit synced commits to refs/pull/400/merge at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
a39c303912 Merge branch 'master' into fix/grad_acc_norm
tacit synced commits to refs/pull/407/merge at tacit/nanochat from mirror 2026-01-16 00:14:06 +00:00
9de0a121c5 Merge branch 'master' into fix-wandb-for-local-run
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
tacit synced commits to refs/pull/396/merge at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/399/head at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
ff126c085e Merge branch 'master' into fix/loop
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
tacit synced commits to refs/pull/324/merge at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/312/merge at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/328/merge at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
tacit synced commits to refs/pull/393/merge at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
89d2741cba Merge branch 'master' into issue-183-nvshmem-install-fix
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
tacit synced commits to refs/pull/393/head at tacit/nanochat from mirror 2026-01-16 00:14:05 +00:00
89d2741cba Merge branch 'master' into issue-183-nvshmem-install-fix
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
tacit synced commits to refs/pull/311/merge at tacit/nanochat from mirror 2026-01-16 00:14:04 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/296/merge at tacit/nanochat from mirror 2026-01-16 00:14:04 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/258/head at tacit/nanochat from mirror 2026-01-16 00:14:04 +00:00
28b7dae0c3 restore
7cd3992f74 Merge branch 'master' into feat/add_dataset_progress_bar
d3679cd0a8 reverting change to .gitignore to prevent merge conflict
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/258/merge at tacit/nanochat from mirror 2026-01-16 00:14:04 +00:00
28b7dae0c3 restore
7cd3992f74 Merge branch 'master' into feat/add_dataset_progress_bar
d3679cd0a8 reverting change to .gitignore to prevent merge conflict
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
tacit synced commits to refs/pull/151/merge at tacit/nanochat from mirror 2026-01-16 00:14:03 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
tacit synced commits to refs/pull/204/merge at tacit/nanochat from mirror 2026-01-16 00:14:03 +00:00
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-01-16 00:14:03 +00:00
65865df300 Merge branch 'master' into master_goderr
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
tacit synced commits to master at tacit/nanochat from mirror 2026-01-16 00:14:02 +00:00
bdcc030ffa oops legacy spurious line now
22a71aa3d3 fuse adamw into a single torch compiled kernel similar to muon. it's about 1.7X faster, but overall it's so tiny that it's not making a major dent
255f8b9af6 cleanly separate cpu and gpu sections
tacit synced commits to refs/pull/141/head at tacit/nanochat from mirror 2026-01-16 00:14:02 +00:00
65865df300 Merge branch 'master' into master_goderr
6bb92403d5 changes and optimizations to muon, making it more efficient and simpler/cleaner a bit
3142ca1a28 minor helpful message
7312ec9898 fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way
3b50b77ed3 fix base_loss to report correct loss by switching the dataloader to the new default