nanochat/dev
2026-01-11 21:49:54 +00:00
| File | Last commit message | Last commit date |
| --- | --- | --- |
| estimate_gpt3_core.ipynb | add notebook on deriving the CORE estimates for the GPT-3 miniseries. | 2026-01-05 18:40:28 +00:00 |
| gen_synthetic_data.py | sane secrets management | 2026-01-04 19:29:22 +00:00 |
| generate_logo.html | initial commit | 2025-10-13 06:49:24 -07:00 |
| LOG.md | add alternating window size patterns for the GPT layers, following GPT-3. Experimented a bit and found the pattern SSSL to work well - 3 short, 1 long alternating. This is now the new default and the plots look quite a bit better on flops vs. bpb | 2026-01-11 21:49:54 +00:00 |
| nanochat.png | Update logo | 2025-10-14 14:19:44 -04:00 |
| repackage_data_reference.py | initial commit | 2025-10-13 06:49:24 -07:00 |
| runcpu.sh | remove rust compilation as rustbpe is now installed from separate package (#416) | 2026-01-08 06:18:37 -08:00 |
| scaling_analysis.ipynb | add notebook used for scaling laws analysis | 2026-01-07 22:28:53 +00:00 |