Andrej Karpathy
a445144d39
create a group for dev dependencies, there is no need to install all this other stuff just for speedrun and it's exposing people to dependency chain attacks. we need to delete more dependencies. dependencies bad bad bad
2026-03-26 03:41:28 +00:00
Andrej Karpathy
03be953668
delete non-essential deps from legacy use
2026-03-26 03:41:28 +00:00
Andrej Karpathy
e569b59f92
delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, it is not only ~identical in performance but actually runs 3% faster, despite being significantly simpler and much less code. i don't fully understand why/how atm
2026-02-10 18:46:39 +00:00
Andrej Karpathy
6079f78fc3
add fp8 training with torchao
2026-02-03 21:03:42 +00:00
Andrej Karpathy
7d1700c521
add zstd lib
2026-01-16 00:44:01 +00:00
Andrej Karpathy
2ff7d51252
integrate Flash Attention 3. +9% tok_per_sec for d12 with ctx even as low as 2048 out of the box nice. also, ready to tune windows huge
2026-01-11 20:33:19 +00:00
Andrej Karpathy
ccf4b7f9bf
nudge hyperparameters of the base script with the results of the sweeps and miniseries. vocab size down to 32K. D:N ratio from 20 to 8. add miniseries script
2026-01-07 22:11:59 +00:00
Andrej Karpathy
eec0c79563
also add matplotlib dep so that we can have jupyter notebooks
2026-01-05 18:41:09 +00:00
Andrej Karpathy
962b6bfba3
alright add transformers as a dep of the repo because it should be easy to evaluate the CORE score of HF models. Not super happy about it but i tried it and the uv.lock doesn't get bloated as much as i expected
2026-01-04 20:37:28 +00:00
Andrej Karpathy
ed2082fbc4
sane secrets management
2026-01-04 19:29:22 +00:00
Andrej Karpathy
9c60dfb64c
bump nanochat to use the latest stable pytorch that is 2.9.1. Run e.g. to re-update your local environment if you git pull
2026-01-04 18:36:36 +00:00
Andrej Karpathy
ee79f29fbd
replace files-to-prompt with git ls-files for bloat metrics
files-to-prompt was including untracked files (knowledge/, dev scripts, etc.) which inflated the bloat metrics. now we use git ls-files to only count tracked source files, which is more accurate and removes an external dependency.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:38:15 +00:00
Andrej Karpathy
aa42f40e66
delete the inline rustbpe project. it was ugly to have a project within a project and rustbpe is now nicely a separate repo on my github karpathy/rustbpe and it's on pypi etc., so we just add it as a dependency to uv. i think it is appropriate that this is a separate repo because 1) it doesn't have too many knobs, other than the ones that are exposed - the regex pattern and vocab size and 2) all of its complexity is not algorithmic (it's equivalent to minbpe), instead it is efficiency-related, so it is ok to hide relatively speaking
2026-01-03 23:55:28 +00:00
Ajeesh Sunil
5e0987a431
numpy isn't acting as a dependency for nanochat, so isn't it better to remove numpy from the dependencies list?
2025-10-28 20:05:38 +00:00
Luke Stanley
901b075605
Fix GPU-less CPU use on Linux with specific Torch indexes
2025-10-21 23:14:16 +00:00
burtenshaw
0abb0fa2e3
add both sides of the source check
2025-10-20 10:44:07 +02:00
burtenshaw
c7ae920a77
add check for linux on cpu
2025-10-20 06:51:52 +02:00
karpathy
e4f9b9c64d
revert to previous pyproject.toml
2025-10-17 08:08:16 -07:00
burtenshaw
23b6351c1c
add groups and source selection
2025-10-17 12:20:18 +02:00
karpathy
306bc380ab
add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered what I think is a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights
2025-10-16 10:04:43 -07:00
karpathy
3a5e0bc50b
initial commit
2025-10-13 06:49:24 -07:00