nanochat/dev
2026-01-17 12:27:30 -08:00
estimate_gpt3_core.ipynb add notebook on deriving the CORE estimates for the GPT-3 miniseries. 2026-01-05 18:40:28 +00:00
gen_synthetic_data.py sane secrets management 2026-01-04 19:29:22 +00:00
generate_logo.html initial commit 2025-10-13 06:49:24 -07:00
LOG.md brief update to log 2026-01-17 00:25:50 +00:00
nanochat.png Update logo 2025-10-14 14:19:44 -04:00
repackage_data_reference.py initial commit 2025-10-13 06:49:24 -07:00
runcpu.sh update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption 2026-01-17 12:27:30 -08:00
scaling_analysis.ipynb add notebook used for scaling laws analysis 2026-01-07 22:28:53 +00:00