Commit Graph

21 Commits

Author SHA1 Message Date
Andrej Karpathy
a445144d39 create a group for dev dependencies, there is no need to install all this other stuff just for speedrun and it's exposing people to dependency chain attacks. we need to delete more dependencies. dependencies bad bad bad 2026-03-26 03:41:28 +00:00
Andrej Karpathy
03be953668 delete non-essential deps from legacy use 2026-03-26 03:41:28 +00:00
Andrej Karpathy
e569b59f92 delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the performance is not only ~identical but actually runs 3% faster, despite it being significantly simpler and much less code. i don't fully understand why/how atm 2026-02-10 18:46:39 +00:00
Andrej Karpathy
6079f78fc3 add fp8 training with torchao 2026-02-03 21:03:42 +00:00
Andrej Karpathy
7d1700c521 add zstd lib 2026-01-16 00:44:01 +00:00
Andrej Karpathy
2ff7d51252 integrate Flash Attention 3. +9% tok_per_sec for d12 with ctx even as low as 2048 out of the box nice. also, ready to tune windows huge 2026-01-11 20:33:19 +00:00
Andrej Karpathy
ccf4b7f9bf nudge hyperparameters of the base script with the results of the sweeps and miniseries. vocab size down to 32K. D:N ratio from 20 to 8. add miniseries script 2026-01-07 22:11:59 +00:00
Andrej Karpathy
eec0c79563 also add matplotlib dep so that we can have jupyter notebooks 2026-01-05 18:41:09 +00:00
Andrej Karpathy
962b6bfba3 alright add transformers as a dep of the repo because it should be easy to evaluate the CORE score of HF models. Not super happy about it but i tried it and the uv.lock doesn't get bloated as much as i expected 2026-01-04 20:37:28 +00:00
Andrej Karpathy
ed2082fbc4 sane secrets management 2026-01-04 19:29:22 +00:00
Andrej Karpathy
9c60dfb64c bump nanochat to use the latest stable pytorch that is 2.9.1. Run e.g. to re-update your local environment if you git pull 2026-01-04 18:36:36 +00:00
Andrej Karpathy
ee79f29fbd replace files-to-prompt with git ls-files for bloat metrics
files-to-prompt was including untracked files (knowledge/, dev scripts, etc.) which inflated the bloat metrics. now we use git ls-files to only count tracked source files, which is more accurate and removes an external dependency.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:38:15 +00:00
Andrej Karpathy
aa42f40e66 delete the inline rustbpe project. it was ugly to have a project within project and rustbpe is now nicely a separate repo on my github karpathy/rustbpe and it's on pypi etc., so we just add it as a dependency to uv. i think it is appropriate that this is a separate repo because 1) it doesn't have too many knobs, other than the ones that are exposed - the regex pattern and vocab size and 2) all of its complexity is not algorithmic (it's equivalent to minbpe), instead it is efficiency-related, so it is ok to hide relatively speaking 2026-01-03 23:55:28 +00:00
Ajeesh Sunil
5e0987a431 numpy isn't acting as a dependency for nanochat, so isn't it better to remove numpy from dependencies list 2025-10-28 20:05:38 +00:00
Luke Stanley
901b075605 Fix GPU-less CPU use on Linux with specific Torch indexes 2025-10-21 23:14:16 +00:00
burtenshaw
0abb0fa2e3 add both sides of the source check 2025-10-20 10:44:07 +02:00
burtenshaw
c7ae920a77 add check for linux on cpu 2025-10-20 06:51:52 +02:00
karpathy
e4f9b9c64d revert to previous pyproject.toml 2025-10-17 08:08:16 -07:00
burtenshaw
23b6351c1c add groups and source selection 2025-10-17 12:20:18 +02:00
karpathy
306bc380ab add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights 2025-10-16 10:04:43 -07:00
karpathy
3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00