Andrej Karpathy
a445144d39
create a group for dev dependencies, there is no need to install all this other stuff just for speedrun and it's exposing people to dependency chain attacks. we need to delete more dependencies. dependencies bad bad bad
2026-03-26 03:41:28 +00:00
Andrej
7808dc7159
Merge pull request #595 from svlandeg/fix/typo
Small fixes
2026-03-25 14:40:25 -07:00
Andrej
a4ed96687b
Merge pull request #634 from 2bitbit/fix-docs-and-comments
fix: correct minor typos in help text, README, and comments
2026-03-25 14:31:49 -07:00
svlandeg
bd6e9c8d5f
fix numbering
2026-03-15 22:18:18 +01:00
svlandeg
02e865c2ab
Merge branch 'master' into fix/typo
2026-03-15 22:18:01 +01:00
Andrej Karpathy
1b1cc3c599
submit new time to GPT-2 leaderboard entry: 99 minutes
2026-03-14 17:15:01 +00:00
svlandeg
6405b26d24
Merge branch 'master' into fix/typo
2026-03-13 13:56:50 +01:00
2bitbit
2bb93b2ae4
fix: correct minor typos in help text, README, and comments
2026-03-12 17:03:26 +08:00
Andrej Karpathy
f068604948
new leaderboard entry coming from improvements of autoresearch round 1, time to gpt-2 from 2.02 hours to 1.80 hours
2026-03-10 06:26:39 +00:00
svlandeg
f8ff0439b9
two more small typos
2026-03-06 11:03:00 +01:00
Andrej Karpathy
1076f97059
delete autocast, an unnecessary thorn in my side, manage dtypes directly
2026-03-04 23:55:30 +00:00
Andrej Karpathy
4b4077425b
Document new Leaderboard entry congrats @ddudek for pointing out ClimbMix, time to GPT-2 is now 2.01 hours, down from 2.76 previously
2026-03-04 20:02:07 +00:00
Andrej Karpathy
96522798f1
docs docs docs
2026-02-05 20:27:07 +00:00
Andrej Karpathy
5fdd5cdb24
new leaderboard record via new auto-calculated optimal batch size. for d26 it is 1M, up from 0.5M that was default earlier
2026-02-05 20:11:32 +00:00
Sofie Van Landeghem
012da1a78b
Typo fixes (#480)
* small typo
* few more small fixes
* small fixes in leaderboard.md
2026-02-05 19:12:50 +01:00
Andrej Karpathy
75b302f331
fix hash commit on leaderboard and a paragraph clarification
2026-02-05 16:14:28 +00:00
Andrej Karpathy
fe55b092b8
minor cosmetics for the table
2026-02-03 21:05:28 +00:00
Andrej Karpathy
a67eba35dc
add feb2 new leaderboard record from upgrading to fp8 training, +4.3% speedup to time to GPT-2
2026-02-03 21:03:42 +00:00
Andrej Karpathy
0307997f9b
merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both
2026-02-01 02:36:43 +00:00
Andrej Karpathy
1ddaad1c1c
nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtraining is not yet fully properly erased across the board, but good enough for step 1
2026-01-31 19:12:25 +00:00
Andrei Panferov
4d8dbaf6e0
Fix escape character in README bibtex entry (#454)
2026-01-30 09:34:02 -08:00
Andrej Karpathy
02baa15405
i am feeling in a delete mood today. i need to delete a lot of code. there is too much code and surface area and complexity. ew
2026-01-30 17:08:53 +00:00
Andrej Karpathy
41bb2eac32
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
Andrej Karpathy
63bb5831e2
something i've wanted to do for a while - move all .sh runs to their own directory so they don't pollute root dir
2026-01-18 15:27:41 +00:00
Andrej Karpathy
6460dc6382
tweaks to readme a bit
2026-01-17 02:28:31 +00:00
Sofie Van Landeghem
d4ea28d4e2
Fix args in readme (#438)
* fix commands in readme, using new arg format
* fix typo
* add required -i flag to chat_eval example runs
2026-01-15 16:26:38 -08:00
Andrej Karpathy
4cc605b940
quick pointer to miniseries post in readme for now
2026-01-07 22:14:21 +00:00
Andrej Karpathy
eb7bbc1b66
delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts
2026-01-04 19:14:23 +00:00
Andrej Karpathy
da8b7ea4cb
also delete the rustbpe test code, this now lives in rustbpe repo that is separate
2026-01-04 01:23:34 +00:00
Andrej Karpathy
aa42f40e66
delete the inline rustbpe project. it was ugly to have a project within project and rustbpe is now nicely a separate repo on my github karpathy/rustbpe and it's on pypi etc., so we just add it as a dependency to uv. i think it is appropriate that this is a separate repo because 1) it doesn't have too many knobs, other than the ones that are exposed - the regex pattern and vocab size and 2) all of its complexity is not algorithmic (it's equivalent to minbpe), instead it is efficiency-related, so it is ok to hide relatively speaking
2026-01-03 23:55:28 +00:00
Hossein-Lakzaei
8c89661465
Update README to match current d34 demo (#314) (#381)
* Update README: switch hosted model description from d32 to d34 per discussion #314
* link to discussion thread
* parameter in quotes
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2025-12-30 10:17:11 +01:00
Andrej
4763ce612a
Small fixes to typos
2025-11-14 07:25:59 -08:00
svlandeg
e5efb4b471
add test_engine.py to file structure
2025-11-14 11:13:42 +01:00
Andrej Karpathy
9a71d13688
typo oops
2025-11-13 16:08:30 +00:00
Andrej Karpathy
7b7fd0fe71
thank you Sofie for your help with nanochat
2025-11-13 16:07:54 +00:00
Andrej Karpathy
f15732524a
make deepwiki link better
2025-11-01 14:13:29 +00:00
svlandeg
0a3ce7b0ff
typo fixes in readme
2025-10-28 20:11:00 +01:00
Andrej Karpathy
9415931f85
delete czar call for help, i'm working through the inbound on that now. add current LLM policy which just asks for disclosure atm
2025-10-28 15:17:43 +00:00
Andrej Karpathy
c75fe54aa7
readme tweak, link to new discussion and add file structure
2025-10-25 19:39:16 +00:00
Andrej Karpathy
5eeb2b6ef9
experiment: looking to 'hire' a nanochat repo czar to help the repo, mentioning in readme
2025-10-22 16:55:54 +00:00
Andrej Karpathy
50bea28ef9
also add readme mention of the cpu mps changes
2025-10-21 17:24:48 +00:00
Andrej
c9ea7a91e2
Add customization instructions to README
Added a section on customization for nanochat.
2025-10-21 08:57:10 -07:00
Andrej Karpathy
d6d86cbf4c
update readme with a link to the CPU|MPS branch
2025-10-16 22:03:39 +00:00
Andrej Karpathy
ccfe7915ac
mention the current d32 chat hosted on nanochat.karpathy.ai, as an example endpoint of the repo
2025-10-16 19:32:44 +00:00
Enes Poyraz
6a795baf27
Update README.md
fix typos
2025-10-13 18:40:12 +02:00
Andrej
626bd3e260
Add image of the WebUI to readme
2025-10-13 08:03:00 -07:00
karpathy
da96b46565
update link to the new discussion
2025-10-13 07:42:09 -07:00
karpathy
a53833d04f
add nanochat logo png
2025-10-13 06:59:59 -07:00
karpathy
3a5e0bc50b
initial commit
2025-10-13 06:49:24 -07:00