nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-04-02 13:45:21 +00:00

Author	SHA1	Message	Date
Filip	70d0abe432	Fix a few cosmetic wandb issues: 1. Update the pinned `wandb` library version. The old version raises when given new `wandb` API keys! 2. Move the `step` argument to the right place in `wandb.log` calls. The signature is `wandb.log(data: dict, step: int, commit: bool)` - previously, step counts were being included in the data dict, meaning wandb metrics incorrectly had x-axes corresponding to the number of calls to `.log` instead of the number of training steps. 3. Move `wandb.init` later in `chat_sft.py` and `base_train.py` to include config values that are calculated or read from a checkpoint.	2026-03-10 15:45:11 -04:00
Andrej Karpathy	1076f97059	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
Sofie Van Landeghem	72b9064f9d	remove leftover mid references (#491)	2026-02-02 08:33:46 -08:00
Andrej Karpathy	41bb2eac32	Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help	2026-01-29 00:52:08 +00:00
Haoyu Wang	50413d2d67	typo in comments: change "GAPO" to "DAPO"	2026-01-15 22:03:42 -08:00
Andrej Karpathy	7312ec9898	fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse flags use dashes, variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way	2026-01-13 22:45:27 +00:00
Adria Blancafort	1b5de29e71	Fix undefined variable in chat_rl after recent refactor; remove unused import 're' from chat_rl.py	2026-01-07 09:08:57 -08:00
Andrej Karpathy	eb7bbc1b66	delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts	2026-01-04 19:14:23 +00:00
DU Wenjie	ea4229851b	bugfix	2025-12-26 19:02:12 +08:00
svlandeg	8c9b004c99	typo fixes in scripts	2025-10-28 20:17:31 +01:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00
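
The `wandb.log` fix described in commit 70d0abe432 can be illustrated with a small sketch. It does not import `wandb`; `FakeRun` and `log_every` are hypothetical stand-ins written for this example, and they only mimic the one behavior the commit message describes: when no `step` is passed, the x-axis value is the number of prior `.log` calls.

```python
from typing import Optional


class FakeRun:
    """Hypothetical stand-in for a wandb run; records (x, data) pairs."""

    def __init__(self) -> None:
        self._calls = 0
        self.history: list = []

    def log(self, data: dict, step: Optional[int] = None) -> None:
        # Without an explicit step, x falls back to the call counter.
        x = self._calls if step is None else step
        self.history.append((x, data))
        self._calls += 1


def log_every(run: FakeRun, n_steps: int, every: int, step_in_data: bool) -> None:
    """Log a dummy loss every `every` training steps."""
    for step in range(0, n_steps, every):
        loss = 1.0 / (step + 1)
        if step_in_data:
            # Old pattern: step is just another metric inside the data dict.
            run.log({"step": step, "loss": loss})
        else:
            # Fixed pattern: step is passed as its own argument.
            run.log({"loss": loss}, step=step)


before = FakeRun()
log_every(before, n_steps=300, every=100, step_in_data=True)
after = FakeRun()
log_every(after, n_steps=300, every=100, step_in_data=False)

before_x = [x for x, _ in before.history]
after_x = [x for x, _ in after.history]
print(before_x, after_x)
```

With the old pattern the x values count calls; with the fixed pattern they follow the training step.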

11 Commits
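
The dash/underscore convention mentioned in commit 7312ec9898 follows from how the standard-library `argparse` module derives attribute names. A minimal sketch, in which the flag names and defaults are illustrative assumptions and not taken from the repository's scripts:

```python
import argparse

parser = argparse.ArgumentParser(description="illustrative flags only")
# Flags are spelled with dashes on the command line...
parser.add_argument("--device-batch-size", type=int, default=32)
parser.add_argument("--num-iterations", type=int, default=-1)

# An explicit argv list keeps the example independent of the real command line.
args = parser.parse_args(["--device-batch-size", "16"])

# ...and argparse exposes them as attributes with underscores.
print(args.device_batch_size, args.num_iterations)
```

Because `argparse` replaces the dashes in a long option name with underscores when it builds the attribute name, the command line can use dashes while the Python code keeps ordinary identifiers.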