nanochat/dev
Dustin Loring 5b27c0c59e Create convert_to_sharegpt.py
added a convert script for convert the current format of the idenitiy conversation for mid training to be compatiable with huggingface so there will be no need for the s3 one anymore
2026-03-06 11:20:10 -05:00
..
convert_to_sharegpt.py Create convert_to_sharegpt.py 2026-03-06 11:20:10 -05:00
estimate_gpt3_core.ipynb add notebook on deriving the CORE estimates for the GPT-3 miniseries. 2026-01-05 18:40:28 +00:00
gen_synthetic_data.py tune the synthetic data generation script. delete the king andrej stuff lol. also, upgrade to gemini 3 2026-02-02 01:45:59 +00:00
generate_logo.html initial commit 2025-10-13 06:49:24 -07:00
LEADERBOARD.md Document new Leaderboard entry congrats @ddudek for pointing out ClimbMix, time to GPT-2 is now 2.01 hours, down from 2.76 previously 2026-03-04 20:02:07 +00:00
LOG.md delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
nanochat.png Update logo 2025-10-14 14:19:44 -04:00
repackage_data_reference.py document the legacy fineweb100b dataset and the new climbmix400b dataset 2026-03-03 17:24:31 +00:00
scaling_analysis.ipynb add engram-lite, add log, tune scaling laws analysis scripts 2026-01-27 22:31:17 +00:00
scaling_laws_jan26.png nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtrianing is not yet fully properly erased across the board, but good enough for step 1 2026-01-31 19:12:25 +00:00