nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-20 03:43:20 +00:00

William Thurston 0c942a8c00 Add tie_embeddings support and configurable logging interval	2026-02-22 14:42:58 -08:00

Implement weight tying between token embeddings and lm_head to reduce parameter count. When enabled, logits are scaled by 1/√d_model, lm_head zeroing is skipped, and optimizer groups are deduplicated. Param counting uses unique parameters while Chinchilla ratio calculation adds back the would-be lm_head size for comparability.

Also adds boolean flag parsing (--flag without =value) to the configurator, an auto-derived log_every interval, and minor shell script fixes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
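
The boolean flag parsing mentioned in the commit message (--flag without =value) could look roughly like the sketch below. This is not the repository's configurator: the function name `parse_overrides`, the dict-based config, and the error handling are all assumptions made for illustration. Only the `--flag` / `--key=value` behavior and the `tie_embeddings` and `log_every` names come from the commit message.

```python
import ast


def parse_overrides(argv, config):
    """Return a copy of config with --key=value and bare --flag overrides applied.

    Hypothetical helper for illustration; not nanochat's actual configurator.
    """
    updated = dict(config)
    for arg in argv:
        if not arg.startswith("--"):
            raise ValueError(f"expected --key[=value], got {arg!r}")
        # partition() yields an empty separator when there is no "=" in the argument.
        key, sep, raw = arg[2:].partition("=")
        if key not in updated:
            raise KeyError(f"unknown config key: {key}")
        if not sep:
            # Bare flag: only accepted for settings whose default is a bool.
            if not isinstance(updated[key], bool):
                raise ValueError(f"--{key} requires a value (--{key}=...)")
            updated[key] = True
            continue
        try:
            # Parse Python literals such as 50, 3e-4, True, False.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a literal is kept as a plain string.
            value = raw
        updated[key] = value
    return updated


if __name__ == "__main__":
    defaults = {"tie_embeddings": False, "log_every": 10, "run": "dummy"}
    print(parse_overrides(["--tie_embeddings", "--log_every=50"], defaults))
```

With this sketch, `--tie_embeddings` and `--tie_embeddings=True` are equivalent, while a bare `--log_every` is rejected because its default is not a boolean.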
gen_synthetic_data.py	add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully its ok	2025-10-21 15:04:58 +00:00
generate_logo.html	initial commit	2025-10-13 06:49:24 -07:00
nanochat.png	add nanochat logo png	2025-10-13 06:59:59 -07:00
repackage_data_reference.py	initial commit	2025-10-13 06:49:24 -07:00
runcpu.sh	Add scripts for running evaluations and training with W&B integration	2025-11-05 11:49:50 -08:00
runmps_evals.sh	Add scripts for running evaluations and training with W&B integration	2025-11-05 11:49:50 -08:00
runmps.sh	Add tie_embeddings support and configurable logging interval	2026-02-22 14:42:58 -08:00