nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-06-18 20:19:08 +00:00

Author	SHA1	Message	Date
Jason Kneen	e83d633179	Add training continuation script and update macOS guide. Introduces continue_training.sh to automatically resume interrupted training stages by detecting existing checkpoints and proceeding as needed. Updates README_MACOS.md with instructions and troubleshooting for using the new script, including manual continuation steps and improved guidance for memory, architecture, and performance issues.	2025-10-22 09:37:31 +01:00
Jason Kneen	b81d789992	Pass device batch size to base_loss script. Added the --device_batch_size argument to the base_loss evaluation command in runmac_overnight.sh to ensure batch size is configurable during evaluation.	2025-10-22 09:29:46 +01:00
Jason Kneen	1225ddf00e	Add macOS memory-optimized training and documentation. Introduces automatic memory detection and batch size optimization for Apple Silicon Macs in the runcpu.sh and runmac_overnight.sh scripts. Adds a comprehensive README_MACOS.md with usage instructions, performance profiles, environment variable overrides, troubleshooting, and expected training times. Updates scripts to allow manual overrides and improve usability for various Mac configurations. Also switched Python to arm64 for a 2-3x improvement.	2025-10-22 07:35:26 +01:00
Jason Kneen	3e184d343e	Improve Mac/MPS compatibility and device handling. Added dev/runmac_overnight.sh for optimized Mac training. Updated device-specific logic throughout dataloader, GPT, Muon optimizer, and training scripts to avoid CUDA-only features on MPS/CPU (e.g., torch.compile, pin_memory, non_blocking, bfloat16). Relaxed torch version constraints in pyproject.toml and removed Linux/CUDA-specific PyTorch config for better macOS support.	2025-10-22 01:55:38 +01:00
Andrej Karpathy	5bdc99abfb	merge and resolve conflict	2025-10-21 17:19:10 +00:00
Andrej Karpathy	fe5aed940b	add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully it's ok	2025-10-21 15:04:58 +00:00
karpathy	2e9669e03a	upgrading all other files to be able to use cpu/mps as well as cuda. various other minor changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming	2025-10-20 10:15:17 -07:00
karpathy	a53833d04f	add nanochat logo png	2025-10-13 06:59:59 -07:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00

9 Commits