Joined on 2024-05-31
tacit synced commits to refs/pull/106/merge at tacit/nanochat from mirror 2025-10-21 22:02:15 +00:00
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
50bea28ef9 also add readme mention of the cpu mps changes
Compare 26 commits »
tacit synced commits to refs/pull/103/merge at tacit/nanochat from mirror 2025-10-21 22:02:15 +00:00
bb71c64579 fix silly issue in dataloader, this version is much faster and more portable to mps too
c9ea7a91e2 Add customization instructions to README
03cddd9878 actually let's not brick code on git pull. change error to warning
fe5aed940b add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok
Compare 5 commits »
tacit synced commits to refs/pull/105/merge at tacit/nanochat from mirror 2025-10-21 22:02:15 +00:00
c9ea7a91e2 Add customization instructions to README
03cddd9878 actually let's not brick code on git pull. change error to warning
fe5aed940b add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok
Compare 4 commits »
tacit synced commits to refs/pull/126/merge at tacit/nanochat from mirror 2025-10-21 22:02:15 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
Compare 28 commits »
tacit synced commits to refs/pull/122/merge at tacit/nanochat from mirror 2025-10-21 22:02:15 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
Compare 28 commits »
tacit synced and deleted reference refs/tags/refs/pull/127/merge at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
tacit synced commits to master at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
2e938530ce delete spurious torch.empty allocation in adamw
a088b7a6ec use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
94ee507054 quick fix base eval due to fewshot requirement
33e8a27f91 Merge karpathy/cpu-mps-dev , adding the ability to run on CPU, on MPS, or on CUDA, with autodetect. Gnarly PR, nonzero chance I broke something.
50bea28ef9 also add readme mention of the cpu mps changes
Compare 27 commits »
tacit synced commits to cpu-mps-dev at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
50bea28ef9 also add readme mention of the cpu mps changes
5bdc99abfb merge and resolve conflict
dfcb1c16f1 Merge branch 'master' into cpu-mps-dev
bb71c64579 fix silly issue in dataloader, this version is much faster and more portable to mps too
bb786c5560 i shouldnt have committed the lock file, i missed that. revert to the flagship build which is linux. sorry to pollute the repo history...
Compare 16 commits »
tacit synced and deleted reference refs/tags/refs/pull/88/merge at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
tacit synced and deleted reference refs/tags/refs/pull/137/merge at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
tacit synced and deleted reference refs/tags/refs/pull/123/merge at tacit/nanochat from mirror 2025-10-21 22:02:14 +00:00
tacit synced and deleted reference refs/tags/refs/pull/130/merge at tacit/nanochat from mirror 2025-10-21 13:52:13 +00:00
tacit synced commits to refs/pull/39/merge at tacit/nanochat from mirror 2025-10-21 13:52:13 +00:00
0f007889dd Add MIT License as a file to the project
5a879f4947 export NANOCHAT_BASE_DIR so child processes get it too
c1d2ed1c13 use orig_model in sampling, silly of me to miss this
2bc521a6de use orig_model in sampling, silly of me to miss this
Compare 7 commits »
tacit synced and deleted reference refs/tags/refs/pull/108/merge at tacit/nanochat from mirror 2025-10-21 05:42:13 +00:00
tacit synced commits to refs/pull/122/head at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
4ed203a3ab remove transformers from toml. add it to gh Workflow. copy common.py from cpu|mps branch to check if gh wf tests are passing
tacit synced commits to refs/pull/88/head at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
2e9669e03a upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming
a09ac812ed toml changes for cpu only install
0abb0fa2e3 add both sides of the source check
c7ae920a77 add check for linux on cpu
Compare 4 commits »
tacit synced commits to refs/pull/6/merge at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
0f007889dd Add MIT License as a file to the project
5a879f4947 export NANOCHAT_BASE_DIR so child processes get it too
c1d2ed1c13 use orig_model in sampling, silly of me to miss this
2bc521a6de use orig_model in sampling, silly of me to miss this
Compare 7 commits »
tacit synced commits to refs/pull/122/merge at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
4ed203a3ab remove transformers from toml. add it to gh Workflow. copy common.py from cpu|mps branch to check if gh wf tests are passing
Compare 2 commits »
tacit synced commits to refs/pull/108/merge at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
a09ac812ed toml changes for cpu only install
0abb0fa2e3 add both sides of the source check
c7ae920a77 add check for linux on cpu
Compare 4 commits »
tacit synced commits to cpu-mps-dev at tacit/nanochat from mirror 2025-10-20 21:32:15 +00:00
2e9669e03a upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming
a09ac812ed toml changes for cpu only install
0abb0fa2e3 add both sides of the source check
c7ae920a77 add check for linux on cpu
Compare 4 commits »