• Joined on 2024-05-31
tacit synced and deleted reference refs/tags/refs/pull/51/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
tacit synced commits to refs/pull/19/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
tacit synced commits to refs/pull/18/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
49870bb540 Merge branch 'master' into launch-with-skypilot
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
tacit synced commits to refs/pull/18/head at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
49870bb540 Merge branch 'master' into launch-with-skypilot
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh
dd6ff9a1cc fix bug in fallback case of find_largest_model
afaa5b4c90 Fix: Handle missing d<number> model tags in find_largest_model
tacit synced commits to refs/pull/17/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
tacit synced commits to refs/pull/15/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
tacit synced commits to refs/pull/13/merge at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
tacit synced commits to master at tacit/nanochat from mirror 2025-10-15 19:02:13 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
tacit synced commits to refs/pull/51/merge at tacit/nanochat from mirror 2025-10-15 10:52:14 +00:00
bfd8d21313 Update architecture diagram in nanochat_architecture.jpg
2167425dba Merge branch 'karpathy:master' into master
9ff69c99b9 Replace architecture diagram in README with a new JPG format image and remove the old PNG file.
tacit synced commits to refs/pull/21/head at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
837b43a504 feat: support mps
1c5dd2b7ba fix: get safe autocast dtype
tacit synced commits to refs/pull/51/head at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
bfd8d21313 Update architecture diagram in nanochat_architecture.jpg
2167425dba Merge branch 'karpathy:master' into master
9ff69c99b9 Replace architecture diagram in README with a new JPG format image and remove the old PNG file.
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
tacit synced commits to refs/pull/50/merge at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
90e3bc778b indent fix, limited device options to cuda/mps
tacit synced commits to refs/pull/50/head at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
90e3bc778b indent fix, limited device options to cuda/mps
tacit synced commits to refs/pull/21/merge at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
837b43a504 feat: support mps
1c5dd2b7ba fix: get safe autocast dtype
tacit synced and deleted reference refs/tags/refs/pull/52/merge at tacit/nanochat from mirror 2025-10-15 10:52:13 +00:00
tacit synced commits to refs/pull/35/merge at tacit/nanochat from mirror 2025-10-15 02:42:15 +00:00
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh
tacit synced commits to refs/pull/6/merge at tacit/nanochat from mirror 2025-10-15 02:42:15 +00:00
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh
tacit synced commits to refs/pull/49/merge at tacit/nanochat from mirror 2025-10-15 02:42:15 +00:00
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh
tacit synced commits to refs/pull/43/merge at tacit/nanochat from mirror 2025-10-15 02:42:15 +00:00
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh
tacit synced commits to refs/pull/40/merge at tacit/nanochat from mirror 2025-10-15 02:42:15 +00:00
67aaca98f5 export NANOCHAT_BASE_DIR so child processes get it too
f0855cbcc7 Update speedrun.sh