nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-21 12:23:13 +00:00

William Thurston 8a6d34daf7 Add kv_head_mult parameter for training and evaluation scripts	2025-11-09 14:23:45 -08:00
- Introduced `kv_head_mult` to control the number of query heads sharing a key/value head in `base_train.py`, `mid_train.py`, and `runmps.sh`.
- Updated logging to include global token per second metrics during training.
- Added assertions to ensure `kv_head_mult` is valid and properly integrated into model calculations.
gen_synthetic_data.py	add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok	2025-10-21 15:04:58 +00:00
generate_logo.html	initial commit	2025-10-13 06:49:24 -07:00
nanochat.png	add nanochat logo png	2025-10-13 06:59:59 -07:00
repackage_data_reference.py	initial commit	2025-10-13 06:49:24 -07:00
runcpu.sh	Add scripts for running evaluations and training with W&B integration	2025-11-05 11:49:50 -08:00
runmps_evals.sh	Add scripts for running evaluations and training with W&B integration	2025-11-05 11:49:50 -08:00
runmps.sh	Add kv_head_mult parameter for training and evaluation scripts	2025-11-09 14:23:45 -08:00
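
The latest commit message describes `kv_head_mult` as the number of query heads that share one key/value head, guarded by assertions. The sketch below shows one way that relationship could be computed. It is an illustration only: the function names `kv_heads_for` and `query_to_kv_head` are hypothetical, and the divisibility check is an assumption about what "valid" means, not the actual code in `base_train.py`.

```python
def kv_heads_for(num_heads: int, kv_head_mult: int) -> int:
    """Number of key/value heads when kv_head_mult query heads share each one.

    Hypothetical helper: the validity checks are assumptions, not the repo's code.
    """
    assert kv_head_mult >= 1, "kv_head_mult must be a positive integer"
    assert num_heads % kv_head_mult == 0, "num_heads must be divisible by kv_head_mult"
    return num_heads // kv_head_mult


def query_to_kv_head(query_head: int, kv_head_mult: int) -> int:
    """Index of the key/value head used by a given query head (contiguous grouping)."""
    return query_head // kv_head_mult


if __name__ == "__main__":
    # 8 query heads with 2 query heads per key/value head gives 4 key/value heads.
    print(kv_heads_for(8, 2))
    # Query heads 0..7 map onto key/value heads in contiguous pairs.
    print([query_to_kv_head(q, 2) for q in range(8)])
```

With `kv_head_mult=1` every query head keeps its own key/value head (standard multi-head attention); setting it equal to the number of query heads leaves a single shared key/value head.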