Commit Graph

  • e27337909d Add setup section and improve test documentation Rimom Costa 2025-10-14 14:00:22 +0100
  • 95ad16fb20 Rephrasing tests section Rimom Costa 2025-10-14 13:56:08 +0100
  • 919ea572b0 add modular imports to the init burtenshaw 2025-10-14 14:55:03 +0200
  • 4f61309a45 GitButler Workspace Commit GitButler 2025-10-14 17:29:00 +0500
  • 03e4bc8e68 docs: Remove comma before "that" in README pipeline sentence Usman 2025-10-14 17:48:07 +0500
  • 557b2d5840 feat(engine.py): Sample unique tokens per row in generation stream Azekowka 2025-10-14 17:33:47 +0500
  • 9a08bb4edb Reapply "Refactor: Improve DDP detection in common.py" Azekowka 2025-10-14 17:09:04 +0500
  • 1a45e5b78a Revert "Refactor: Improve DDP detection in common.py" Azekowka 2025-10-14 17:04:41 +0500
  • b724190f2a Refactor: Improve DDP detection in common.py Azekowka 2025-10-14 16:59:57 +0500
  • dd6812c83e Refactor: Remove pandas dependency from base_eval.py Azekowka 2025-10-14 16:52:26 +0500
  • 02440f670d fix: return inf instead of crashing when evaluate_bpb has zero total_bytes Bhaskar 2025-10-14 17:21:11 +0530
  • 8ff4dae67c correct subject-verb agreement in README.md CharlesCNorton 2025-10-14 07:47:02 -0400
  • 641b5b18d7 Also add no_sync to SFT Training Aflah 2025-10-14 13:45:15 +0200
  • c17b57fea1 Add no_sync to mid_train Aflah 2025-10-14 13:42:28 +0200
  • 6971dde830 Skip sync when not at last step of grad accum Aflah 2025-10-14 13:34:24 +0200
  • f5d35391db use pyarrow.fs to download parquet files from the huggingface hub Krisztian Szucs 2025-10-14 13:28:36 +0200
  • b70da6d907 refactor: Pre-allocate larger KVCache to improve performance SyedaAnshrahGillani 2025-10-14 16:10:48 +0500
  • 6c6c1c2e67 refactor: Harden use_calculator against potential eval exploits SyedaAnshrahGillani 2025-10-14 16:03:37 +0500
  • 144db24d5f Add unit tests for RustBPE implementation Matt Suiche 2025-10-14 12:36:59 +0200
  • 85a9e0790b export env variable in speedrun.sh burtenshaw 2025-10-14 12:29:12 +0200
  • af8f6435df Merge branch 'karpathy:master' into master Rimom Costa 2025-10-14 09:51:45 +0100
  • 33142f99d3 Mount volume to persist data broqdev 2025-10-14 00:25:04 -0700
  • 31db19ae77 Merge pull request #1 from LokiMetaSmith/fix/pytorch-memory-fragmentation Lawrence R Kincheloe III 2025-10-14 02:11:58 -0500
  • b3f662a924 Merge branch 'rocm-support' into fix/pytorch-memory-fragmentation Lawrence R Kincheloe III 2025-10-14 02:11:50 -0500
  • b09f7fc29b Set PYTORCH_CUDA_ALLOC_CONF to prevent memory fragmentation google-labs-jules[bot] 2025-10-14 06:57:18 +0000
  • 5a785854d1 feat: Add HSA_OVERRIDE_GFX_VERSION for newer AMD GPUs google-labs-jules[bot] 2025-10-14 06:48:34 +0000
  • 0f8d1289db Add script for training using Modal broqdev 2025-10-13 23:20:20 -0700
  • 19fa71d6e5 fix: Resolve HIP error and improve device detection google-labs-jules[bot] 2025-10-14 06:07:13 +0000
  • 24ed569055 fuse qkv linear and qk rotary + norm Matthew Murphy 2025-10-13 22:55:55 -0700
  • 054d903cae fix: Address runtime errors and improve configuration google-labs-jules[bot] 2025-10-14 05:47:26 +0000
  • f20d9d4d3c docs: Update README with computing environment details google-labs-jules[bot] 2025-10-14 05:18:12 +0000
  • 08c628cb83 feat: Add ROCm and device-agnostic support google-labs-jules[bot] 2025-10-14 05:07:30 +0000
  • 662ff7eb7a feat: dynamic dtype selection Kirk Lin 2025-10-14 12:22:57 +0800
  • 447567634c feat: cross-platform support for CPU and GPU environments Kirk Lin 2025-10-14 12:08:44 +0800
  • 1240e1299e Update gpt.py azuhanel 2025-10-13 23:12:33 -0400
  • 1af51511b5 update Alexander Kim 2025-10-13 21:35:06 -0400
  • 56e432e92f Merge branch 'launch-with-skypilot' of github.com:alex000kim/nanochat into launch-with-skypilot Alexander Kim 2025-10-13 21:26:38 -0400
  • 2285738ca1 update README Alexander Kim 2025-10-13 21:26:20 -0400
  • 8dc0ad92c2 Update README.md Alex Kim 2025-10-13 21:19:14 -0400
  • 4127586c33 add skypilot instructions Alexander Kim 2025-10-13 21:16:11 -0400
  • 6ef9f77789 Merge pull request #1 from guangyusong/fix/tests-urllib-skip guangyusong 2025-10-13 20:50:40 -0400
  • 24b4e79eba tests: replace requests with urllib and skip on network failure in enwik8 fixture guangyusong 2025-10-13 20:40:38 -0400
  • de6eac26be Support training with only CPU, coded with help of Devin DeepWiki planner, GPT-5 Codex execution with VS Code agent mode. Luke Stanley 2025-10-14 00:39:03 +0000
  • e27d2da3d9 Add Torch CPU package index support Makes torch index an extra in pyproject.toml, speedrun.sh selects GPU index if supported, with CPU fallback Luke Stanley 2025-10-14 00:32:50 +0000
  • 4780462e65 Add docs/ directory to .gitignore xrliAnnie 2025-10-13 16:31:33 -0700
  • c3c9f6553f Add .idea/ to gitignore xrliAnnie 2025-10-13 16:12:58 -0700
  • 07f193ab0d Prevents OOM crashes in production web server from unbounded cache growth:D jaberjaber23 2025-10-14 01:57:29 +0300
  • 8aca98777a Docs: Link directly to DeepWiki URL for repo Luke Stanley 2025-10-13 22:34:20 +0000
  • dd6ff9a1cc fix bug in fallback case of find_largest_model Andrej 2025-10-13 14:38:34 -0700
  • afaa5b4c90 Fix: Handle missing d<number> model tags in find_largest_model Mirza-Samad-Ahmed-Baig 2025-10-14 00:18:20 +0300
  • 6d51049077 introduce lr schedulers and tests Vilhelm Toivonen 2025-10-13 22:13:37 +0300
  • 47f7ffa25d wire up fused qkv and glu toggles Vilhelm Toivonen 2025-10-13 22:05:53 +0300
  • 992e73b055 Fix test_dataloader.py to test actual dataloader implementation Rimom Costa 2025-10-13 19:39:10 +0100
  • 44764ffff0 test: add comprehensive test suite with 66 passing tests Rimom Costa 2025-10-13 19:18:30 +0100
  • b230ab8a0b Add 'For Students' section with structured learning path through the codebase Rimom Costa 2025-10-13 18:50:35 +0100
  • 5fd0b13886 Merge pull request #2 from epoyraz/patch-1 Andrej 2025-10-13 10:10:15 -0700
  • 18590307ae Upgrade to pyo3 0.26, fix warnings, remove unsafe usage Alex Gaynor 2025-10-13 12:45:30 -0400
  • 6a795baf27 Update README.md Enes Poyraz 2025-10-13 18:40:12 +0200
  • 626bd3e260 Add image of the WebUI to readme Andrej 2025-10-13 08:03:00 -0700
  • da96b46565 update link to the new discussion karpathy 2025-10-13 07:42:09 -0700
  • a53833d04f add nanochat logo png karpathy 2025-10-13 06:59:59 -0700
  • 3a5e0bc50b initial commit karpathy 2025-10-13 06:49:24 -0700