The best ChatGPT that $100 can buy.
2026-03-05 12:58:27 +08:00
.claude/skills/read-arxiv-paper add arxiv reading skill 2026-01-29 00:34:24 +00:00
dev document the legacy fineweb100b dataset and the new climbmix400b dataset 2026-03-03 17:24:31 +00:00
nanochat update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
runs chore: optimize training scripts 2026-03-05 12:58:27 +08:00
scripts fix: correct the section argument passed to the log function in chat_sft.py 2026-03-05 12:46:30 +08:00
tasks remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
tests Fix MockModel's device definition (#535) 2026-02-17 16:03:46 -08:00
.env update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
.gitignore update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
.python-version update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
LICENSE Add MIT License as a file to the project 2025-10-19 17:22:19 -07:00
pyproject.toml update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
README.md update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00
uv.lock update: speedrundiy.sh pipeline now runs end to end 2026-03-05 12:36:07 +08:00


  • python -m nanochat.report reset
  • python -m scripts.tok_train --max_chars=2000000000
  • python -m scripts.tok_eval
  • torchrun --standalone --nproc_per_node=1 -m scripts.base_train -- --depth=18 --device-batch-size=1
  • torchrun --standalone --nproc_per_node=1 -m scripts.base_eval -- --device-batch-size=1
  • torchrun --standalone --nproc_per_node=1 -m scripts.chat_sft -- --device-batch-size=1
  • torchrun --standalone --nproc_per_node=1 -m scripts.chat_eval -- -i sft
  • python -m scripts.chat_cli -p "Why is the sky blue?"
  • python -m scripts.chat_web
  • python -m nanochat.report generate
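
The commands above are meant to run in order, each stage consuming what the previous one produced. As a rough sketch of how they could be chained from one script, the Python below runs them one at a time and stops at the first failure. The `PIPELINE` list and the `run_pipeline` helper are hypothetical names made up for this illustration and are not part of the repository; only the command strings come from the list above. By default the helper only prints what it would run.

```python
import shlex
import subprocess
import sys

# Commands copied verbatim from the list above, in the same order.
PIPELINE = [
    "python -m nanochat.report reset",
    "python -m scripts.tok_train --max_chars=2000000000",
    "python -m scripts.tok_eval",
    "torchrun --standalone --nproc_per_node=1 -m scripts.base_train -- --depth=18 --device-batch-size=1",
    "torchrun --standalone --nproc_per_node=1 -m scripts.base_eval -- --device-batch-size=1",
    "torchrun --standalone --nproc_per_node=1 -m scripts.chat_sft -- --device-batch-size=1",
    "torchrun --standalone --nproc_per_node=1 -m scripts.chat_eval -- -i sft",
    'python -m scripts.chat_cli -p "Why is the sky blue?"',
    "python -m scripts.chat_web",
    "python -m nanochat.report generate",
]


def run_pipeline(commands, dry_run=True, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Hypothetical helper for illustration. Returns the argv lists of the
    commands that were started (or would be started, in a dry run).
    """
    started = []
    for command in commands:
        argv = shlex.split(command)  # keeps the quoted prompt as one argument
        started.append(argv)
        print("$", command)
        if dry_run:
            continue
        result = runner(argv)
        if result.returncode != 0:
            print(f"stopped: exit code {result.returncode}", file=sys.stderr)
            break
    return started


# Dry run: print the plan without executing anything.
run_pipeline(PIPELINE)
```

Calling `run_pipeline(PIPELINE, dry_run=False)` from the repository root would execute the steps for real. Note that `scripts.chat_web` presumably keeps running as a server until it is interrupted, so in a real run the final report step would only start after it exits.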