Joined on 2024-05-31
tacit synced commits to refs/pull/31/head at tacit/nanochat from mirror 2025-10-22 14:22:14 +00:00
6641aeed1d FEAT: Allow CPU-only execution in compute_init
tacit synced commits to refs/pull/24/merge at tacit/nanochat from mirror 2025-10-22 14:22:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer
tacit synced commits to refs/pull/15/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/145/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
67d76b834a tidy up and doc simplification
e83d633179 Add training continuation script and update MacOS guide
b81d789992 Pass device batch size to base_loss script
1225ddf00e Add macOS memory-optimized training and documentation
tacit synced commits to refs/pull/145/head at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
67d76b834a tidy up and doc simplification
e83d633179 Add training continuation script and update MacOS guide
b81d789992 Pass device batch size to base_loss script
1225ddf00e Add macOS memory-optimized training and documentation
tacit synced commits to refs/pull/13/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced and deleted reference refs/tags/refs/pull/61/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/50/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/133/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/132/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/126/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced commits to refs/pull/94/merge at tacit/nanochat from mirror 2025-10-22 06:12:15 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer
tacit synced commits to refs/pull/91/head at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
067298c51b reverted back
tacit synced commits to refs/pull/90/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/64/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/46/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/18/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/142/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
7a52f9bfbb Updates lockfile with CPU package support without overwriting other architectures
760af62e11 Git ignore eval_bundle
901b075605 Fix GPU-less CPU use on Linux with specific Torch indexes
tacit synced commits to refs/pull/91/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
067298c51b reverted back
tacit synced commits to refs/pull/142/head at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
7a52f9bfbb Updates lockfile with CPU package support without overwriting other architectures
760af62e11 Git ignore eval_bundle
901b075605 Fix GPU-less CPU use on Linux with specific Torch indexes