Commit Graph

130 Commits

Author SHA1 Message Date
Andrej Karpathy
91f09ccd0d minor fix comment in engine 2025-11-13 15:28:18 +00:00
Andrej Karpathy
adb5d4a16c uv lock has to change when we removed numpy the other commit 2025-11-13 15:16:27 +00:00
Andrej Karpathy
c6b7ab7440 grad clip logging and printing and cosmetics 2025-11-05 21:08:30 +00:00
Andrej
885a4f25e7 Replace fcntl with filelock for Windows compatibility 2025-11-04 16:35:39 -08:00
Andrej
3a2ae631c4 Merge branch 'master' into master 2025-11-04 16:35:02 -08:00
Andrej
12d995f58c Add NPROC_PER_NODE var to speedrun.sh and run1000.sh 2025-11-04 16:26:33 -08:00
svlandeg
f1683c5b16 set nproc_per_node as var in speedrun and run1000 scripts 2025-11-04 21:36:10 +01:00
Andrej
d1558c7873 handle bf16 on MPS by casting to fp32 during load checkpoint 2025-11-04 09:42:50 -08:00
Andrej
df25293087 Add explicit UTF-8 encoding on open 2025-11-04 09:38:18 -08:00
Yasser Makram
1e89af9862 Replace fcntl with filelock for Windows compatibility 2025-11-04 07:22:34 +00:00
Dipesh Babu
7a40ee77b4 fix: cast bf16 to fp32 on MPS (like CPU) to avoid dtype issues 2025-11-03 16:00:56 -05:00
svlandeg
2ce62ec076 ensure consistency of quotes within each statement 2025-11-03 21:52:02 +01:00
svlandeg
e22fc6f2fa few more explicit UTF-8 encodings 2025-11-03 21:46:39 +01:00
svlandeg
c72b8b2309 add explicit UTF-8 encoding 2025-11-03 21:27:12 +01:00
Andrej
a83646e098 fix(eval): use UTF-8 when reading CORE JSONL and writing CSV 2025-11-03 06:38:33 -08:00
Andrej
8681922328 fix lstrip bug, make it removeprefix, TIL. 2025-11-03 06:37:48 -08:00
Dipesh Babu
226953b841 fix: open JSONL and results CSV with UTF-8 encoding for portability 2025-11-03 01:20:56 -05:00
Josh Odom
f1e15f5f4d Fixing subtle bug: lstrip removes all matching characters, including potentially required ones. Use removeprefix instead. 2025-11-02 23:40:37 -06:00
Andrej
b6da6982f6 fix nanochat logo: the t was placed too far to the right 2025-11-02 08:17:00 -08:00
Andrej
c2c4f77e22 oops small bugfix to run1000.sh missing kwarg 2025-11-02 08:14:41 -08:00
Andrej
d1ac0b2d07 when loading models on CPU, convert tensors from bfloat16 to float 2025-11-02 07:58:56 -08:00
svlandeg
5bfcd31b73 revert more formatting changes 2025-11-02 14:17:10 +01:00
svlandeg
036a3c5881 revert formatting changes to facilitate review 2025-11-02 14:16:43 +01:00
Jing Zhang
ba4f40bf58 Update run1000.sh to add missing --run=$WANDB_RUN 2025-11-01 21:27:00 -07:00
Manuel Saelices
d54c9cbf8c CPU Support, as bfloat16 params breaks inference 2025-11-01 23:38:50 +01:00
Andrej Karpathy
cf587acb1a move eval bundle download to be lazy and inside the python code so that we can substantially simplify the run bash scripts 2025-11-01 16:04:38 +00:00
Andrej Karpathy
7d2c4a3d95 delete pandas dep in base_eval use csv instead 2025-11-01 15:28:30 +00:00
Andrej
ad39db5a23 tiny fix to comment 2025-11-01 07:43:57 -07:00
    Update engine.py with correct error message on assert
Andrej
630f54ae5a use empty locals and globals in call to eval() in engine tool use 2025-11-01 07:22:59 -07:00
    harden eval: prevent the calc tool from accessing globals and locals
Andrej Karpathy
f15732524a make deepwiki link better 2025-11-01 14:13:29 +00:00
Andrej
dfc88334b6 fix tok/sec calculation bug when grad accum steps > 1 2025-10-30 08:36:32 -07:00
    Fix tok/sec metrics for base_train and mid_train when gradient accumulation is not 1
Andrej
eb11bb0e2e remove numpy as dep 2025-10-30 08:28:14 -07:00
    Remove explicit numpy dependency
Andrej
1ccbaf4416 nit delete redundant catch/raise in execute 2025-10-29 08:10:03 -07:00
    Remove redundant exception handling in chdir
Andrej
29ff38d94b Merge pull request #35 from bhaskar0210s/master 2025-10-29 08:06:24 -07:00
    fix: return inf instead of crashing when evaluate_bpb has zero total_bytes
svlandeg
b996131570 Merge branch 'master' into logo/kerning-update 2025-10-29 11:45:40 +01:00
svlandeg
3fa974f93c few more reverts 2025-10-29 11:45:02 +01:00
svlandeg
cbd560a83d revert formatting changes to minimize diff and merge conflicts 2025-10-29 11:42:56 +01:00
Andrej
a1de1f46ad Merge pull request #156 from tlepoint/fix/export-base-dir 2025-10-28 15:19:08 -07:00
    Export the base dir variable in runcpu.sh
Andrej
ee00f523d0 fixing all the typos to make the pull requests stop 2025-10-28 13:36:07 -07:00
    Batch of typo fixes
Ajeesh Sunil
5e0987a431 numpy isnt acting as a dependency for nanochat, so isnt it better to remove numpy from dependencies list 2025-10-28 20:05:38 +00:00
svlandeg
8c9b004c99 typo fixes in scripts 2025-10-28 20:17:31 +01:00
svlandeg
0a3ce7b0ff typo fixes in readme 2025-10-28 20:11:00 +01:00
Andrej Karpathy
fdda5826e3 Merge branch 'haowei01-fix_kv_cache_due_to_resize' 2025-10-28 16:54:30 +00:00
Andrej Karpathy
baf0b3fdda also add a test that failed before the fix and passes now with the fix for kv cache resize 2025-10-28 16:54:17 +00:00
Andrej Karpathy
f1db6b4712 delete czar call for help, i'm working through the inbound on that now. add current LLM policy which just asks for disclosure atm 2025-10-28 16:51:41 +00:00
Andrej Karpathy
9415931f85 delete czar call for help, i'm working through the inbound on that now. add current LLM policy which just asks for disclosure atm 2025-10-28 15:17:43 +00:00
Haowei Zhang
2b9c085559 update the kv_shape 2025-10-27 02:47:13 -07:00
Haowei Zhang
b062b422ac Fix kv cache, given resize will destroys the logical structure 2025-10-27 02:23:08 -07:00
water-vapor
a9de4b1038 Fix tok/sec metrics for base_train and mid_train when gradient accumulation is not 1 2025-10-26 01:43:49 -05:00
Andrej Karpathy
c75fe54aa7 readme tweak, link to new discussion and add file structure 2025-10-25 19:39:16 +00:00