• Joined on 2024-05-31
tacit synced and deleted reference refs/tags/refs/pull/358/merge at tacit/nanochat from mirror 2025-12-09 05:42:23 +00:00
tacit synced and deleted reference refs/tags/refs/pull/351/merge at tacit/nanochat from mirror 2025-12-09 05:42:23 +00:00
tacit synced commits to refs/pull/309/head at tacit/nanochat from mirror 2025-12-09 05:42:23 +00:00
8b1cecaa95 Apply suggestion from @svlandeg for nicer looking comparison
tacit synced commits to refs/pull/256/merge at tacit/nanochat from mirror 2025-12-09 05:42:23 +00:00
cbf30c842c apply float32 cast before logits softcapping so the tanh is in fp32. torch compile fuses this correctly with no extra memory costs.
90442de35f fix bug where any rank has to be able to create checkpoint_dir if saving optim
2fd0440355 fix: missing val_bpb on resume
16788eed3c fix(model): apply float32 cast before logits softcapping
tacit synced and deleted reference refs/tags/refs/pull/345/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/342/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/327/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/326/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/325/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/317/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/310/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/309/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/308/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/307/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced and deleted reference refs/tags/refs/pull/306/merge at tacit/nanochat from mirror 2025-12-09 05:42:22 +00:00
tacit synced commits to refs/pull/361/merge at tacit/nanochat from mirror 2025-12-08 21:32:17 +00:00
2fd0440355 fix: missing val_bpb on resume
53b3a4fb81 fix: missing val_bpb on resume
tacit synced commits to refs/pull/358/merge at tacit/nanochat from mirror 2025-12-08 21:32:17 +00:00
90442de35f fix bug where any rank has to be able to create checkpoint_dir if saving optim
2fd0440355 fix: missing val_bpb on resume
53b3a4fb81 fix: missing val_bpb on resume
tacit synced commits to refs/pull/351/merge at tacit/nanochat from mirror 2025-12-08 21:32:17 +00:00
2fd0440355 fix: missing val_bpb on resume
53b3a4fb81 fix: missing val_bpb on resume
tacit synced commits to refs/pull/333/merge at tacit/nanochat from mirror 2025-12-08 21:32:17 +00:00
90442de35f fix bug where any rank has to be able to create checkpoint_dir if saving optim
2fd0440355 fix: missing val_bpb on resume
53b3a4fb81 fix: missing val_bpb on resume
tacit synced commits to refs/pull/328/merge at tacit/nanochat from mirror 2025-12-08 21:32:16 +00:00
2fd0440355 fix: missing val_bpb on resume
53b3a4fb81 fix: missing val_bpb on resume