nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-11 19:55:31 +00:00

Andrej Karpathy 43c29dd9d5 Big DataLoader refactor: BOS-aligned dataloaders with epoch tracking for pre/mid-training	2026-01-13 20:05:47 +00:00

The new DataLoader ensures that every token sequence in train/val batches has a BOS token at the beginning. Therefore, no token streams start abruptly in the middle of a document, which could be confusing for the model. Note that this changes the loss scale because there are fewer confusing tokens in the train/val batches. The main downside is that we now waste about 35% of tokens due to cropping. This is ok because we have a lot of data. See dev/LOG.md entry for this change for a lot more information.
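To illustrate the idea in the commit message above, here is a minimal sketch of BOS-aligned row packing with cropping. It is not the repository's implementation: the `BOS` token id, the function names `bos_aligned_rows` and `wasted_fraction`, and the simple greedy packing strategy are all assumptions made for illustration. Each row starts with a fresh document (and therefore with BOS), and whatever does not fit in the current row is discarded, which is where the wasted tokens come from.

```python
from typing import Iterator, List, Sequence

BOS = 0  # hypothetical BOS token id, chosen only for this sketch


def bos_aligned_rows(docs: Sequence[List[int]], row_len: int) -> Iterator[List[int]]:
    """Yield fixed-length rows that always begin with a BOS token.

    Documents are appended greedily; a document that overflows the
    current row is cropped and its remainder is dropped (wasted).
    """
    row: List[int] = []
    for doc in docs:
        tokens = [BOS] + doc           # every document is prefixed with BOS
        space = row_len - len(row)     # remaining capacity, always >= 1 here
        row.extend(tokens[:space])     # crop: tokens beyond `space` are lost
        if len(row) == row_len:
            yield row
            row = []                   # the next row starts with a new document
    # A trailing partial row is dropped in this sketch.


def wasted_fraction(docs: Sequence[List[int]], row_len: int) -> float:
    """Fraction of tokens (including BOS) that never reach a yielded row."""
    total = sum(len(d) + 1 for d in docs)
    kept = sum(len(r) for r in bos_aligned_rows(docs, row_len))
    return 1 - kept / total


docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10], [11]]
rows = list(bos_aligned_rows(docs, row_len=4))
```

On this toy input every yielded row begins with `BOS`, at the cost of discarding the tail of the third document and the final partial row. The fraction wasted on real data depends on the document length distribution and the sequence length, so this sketch does not reproduce the roughly 35% figure quoted in the commit.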
estimate_gpt3_core.ipynb	add notebook on deriving the CORE estimates for the GPT-3 miniseries.	2026-01-05 18:40:28 +00:00
gen_synthetic_data.py	sane secrets management	2026-01-04 19:29:22 +00:00
generate_logo.html	initial commit	2025-10-13 06:49:24 -07:00
LOG.md	Big DataLoader refactor: BOS-aligned dataloaders with epoch tracking for pre/mid-training	2026-01-13 20:05:47 +00:00
nanochat.png	Update logo	2025-10-14 14:19:44 -04:00
repackage_data_reference.py	initial commit	2025-10-13 06:49:24 -07:00
runcpu.sh	remove rust compilation as rustbpe is now installed from separate package (#416)	2026-01-08 06:18:37 -08:00
scaling_analysis.ipynb	add notebook used for scaling laws analysis	2026-01-07 22:28:53 +00:00