nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-13 16:33:41 +00:00

Author	SHA1	Message	Date
William Thurston	b1d49aade5	Add scripts for running evaluations and training with W&B integration - Added `dev/runmps_evals.sh` for evaluating checkpoints and logging results to W&B. - Introduced `dev/runmps.sh` for orchestrating training stages with W&B support. - Updated `.gitignore` to include `wandb/` and `.runmps_wandb_ids`. - Changed permissions for `dev/runcpu.sh` and added executable flag. - Enhanced existing scripts to log metrics to W&B during training and evaluation processes.	2025-11-05 11:49:50 -08:00
Andrej Karpathy	8892470f29	add the SpellingBee task so that nanochat can count r in strawberry etc. along the way we had to add a bunch of new functionality, e.g. extend the calculator to support the count function of python. possibly the current TaskMixture uses way too many synthetic examples of SpellingBee because the eval gives us exactly 100% performance on spelling. We can tune this later to reclaim some wall clock time here I think	2025-10-24 14:02:48 +00:00
Andrej Karpathy	5bdc99abfb	merge and resolve conflict	2025-10-21 17:19:10 +00:00
Andrej Karpathy	fe5aed940b	add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully it's ok	2025-10-21 15:04:58 +00:00
karpathy	2e9669e03a	upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming	2025-10-20 10:15:17 -07:00
Andrej Karpathy	190d9515d0	don't evaluate the sampling evals during SFT, they are too slow. keep the multiple choice evals. delete unused imports	2025-10-15 16:42:23 +00:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00

7 Commits