• Joined on 2024-05-31
tacit synced and deleted reference refs/tags/refs/pull/61/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/50/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/133/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/132/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced and deleted reference refs/tags/refs/pull/126/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
tacit synced commits to refs/pull/15/merge at tacit/nanochat from mirror 2025-10-22 14:22:13 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/94/merge at tacit/nanochat from mirror 2025-10-22 06:12:15 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer
tacit synced commits to refs/pull/91/head at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
067298c51b reverted back
tacit synced commits to refs/pull/90/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/64/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/46/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/18/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/91/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
067298c51b reverted back
tacit synced commits to refs/pull/142/head at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
7a52f9bfbb Updates lockfile with CPU package support without overwriting other architectures
760af62e11 Git ignore eval_bundle
901b075605 Fix GPU-less CPU use on Linux with specific Torch indexes
tacit synced commits to refs/pull/106/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer
tacit synced commits to refs/pull/142/merge at tacit/nanochat from mirror 2025-10-22 06:12:14 +00:00
7a52f9bfbb Updates lockfile with CPU package support without overwriting other architectures
760af62e11 Git ignore eval_bundle
901b075605 Fix GPU-less CPU use on Linux with specific Torch indexes
tacit synced commits to refs/pull/97/merge at tacit/nanochat from mirror 2025-10-21 22:02:20 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/98/merge at tacit/nanochat from mirror 2025-10-21 22:02:20 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
tacit synced commits to refs/pull/94/merge at tacit/nanochat from mirror 2025-10-21 22:02:20 +00:00
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
50bea28ef9 also add readme mention of the cpu mps changes
tacit synced commits to refs/pull/61/merge at tacit/nanochat from mirror 2025-10-21 22:02:19 +00:00
c9ea7a91e2 Add customization instructions to README
03cddd9878 actually let's not brick code on git pull. change error to warning
fe5aed940b add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok