nanochat

Mirror of https://github.com/karpathy/nanochat.git, synced 2026-06-18 20:19:08 +00:00

Author	SHA1	Message	Date
Andrej	a1de1f46ad	Merge pull request #156 from tlepoint/fix/export-base-dir. Export the base dir variable in runcpu.sh	2025-10-28 15:19:08 -07:00
svlandeg	8c9b004c99	typo fixes in scripts	2025-10-28 20:17:31 +01:00
Tancrède Lepoint	d5cda11ab8	Export the base dir variable	2025-10-22 18:15:02 -04:00
Jason Kneen	67d76b834a	tidy up and doc simplification	2025-10-22 11:07:22 +01:00
Jason Kneen	e83d633179	Add training continuation script and update MacOS guide. Introduces continue_training.sh to automatically resume interrupted training stages by detecting existing checkpoints and proceeding as needed. Updates README_MACOS.md with instructions and troubleshooting for using the new script, including manual continuation steps and improved guidance for memory, architecture, and performance issues.	2025-10-22 09:37:31 +01:00
Jason Kneen	b81d789992	Pass device batch size to base_loss script. Added the --device_batch_size argument to the base_loss evaluation command in runmac_overnight.sh to ensure batch size is configurable during evaluation.	2025-10-22 09:29:46 +01:00
Jason Kneen	1225ddf00e	Add macOS memory-optimized training and documentation. Introduces automatic memory detection and batch size optimization for Apple Silicon Macs in runcpu.sh and runmac_overnight.sh scripts. Adds a comprehensive README_MACOS.md with usage instructions, performance profiles, environment variable overrides, troubleshooting, and expected training times. Updates scripts to allow manual overrides and improve usability for various Mac configurations. Also switched python to arm64 for 2-3x improvement.	2025-10-22 07:35:26 +01:00
Jason Kneen	3e184d343e	Improve Mac/MPS compatibility and device handling. Added dev/runmac_overnight.sh for optimized Mac training. Updated device-specific logic throughout dataloader, GPT, Muon optimizer, and training scripts to avoid CUDA-only features on MPS/CPU (e.g., torch.compile, pin_memory, non_blocking, bfloat16). Relaxed torch version constraints in pyproject.toml and removed Linux/CUDA-specific PyTorch config for better macOS support.	2025-10-22 01:55:38 +01:00
Luke Stanley	901b075605	Fix GPU-less CPU use on Linux with specific Torch indexes	2025-10-21 23:14:16 +00:00
Andrej Karpathy	94ee507054	quick fix base eval due to fewshot requirement	2025-10-21 17:56:08 +00:00
Andrej Karpathy	5bdc99abfb	merge and resolve conflict	2025-10-21 17:19:10 +00:00
Andrej Karpathy	fe5aed940b	add personality to nanochat. breaks previous code on git pull and requires download of a new file from s3, but there is a helpful error message so hopefully it's ok	2025-10-21 15:04:58 +00:00
karpathy	2e9669e03a	upgrading all other files to be able to use cpu/mps as well as cuda. various other minor changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming	2025-10-20 10:15:17 -07:00
obxium	938cb31f1a	Update logo	2025-10-14 14:19:44 -04:00
karpathy	a53833d04f	add nanochat logo png	2025-10-13 06:59:59 -07:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00

66 Commits