Commit Graph

8 Commits

Author SHA1 Message Date
google-labs-jules[bot]
51927a9e60 feat: Add comprehensive end-to-end documentation
This commit introduces extensive documentation across the entire nanochat codebase. The goal is to make the project more accessible, educational, and easier for new contributors to understand.

Key additions include:
- A new "Codebase Overview and Data Flow" section in the main README.md, providing a high-level guide to the project structure and training pipeline.
- Detailed, educational docstrings and inline comments in all Python modules within the `nanochat/`, `scripts/`, and `tasks/` directories.
- Explanations of the rationale and implementation details for key components, including Python equivalents for non-Python code where applicable.
- A new `README.md` in the `rustbpe/` directory explaining the BPE algorithm and the decision to use Rust.
- Comprehensive comments in shell scripts and development scripts in the `dev/` directory, clarifying their purpose and usage.
2025-11-24 12:57:49 +00:00
karpathy
2e9669e03a upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming 2025-10-20 10:15:17 -07:00
Andrej Karpathy
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate 2025-10-16 01:28:37 +00:00
Andrej Karpathy
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. exampels are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints 2025-10-15 20:29:54 +00:00
Andrej Karpathy
03fa673b7d add basic logging to chat_web, which i think might be fun 2025-10-15 19:51:06 +00:00
Andrej Karpathy
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints 2025-10-15 19:42:54 +00:00
Andrej Karpathy
01fb290f53 allow multiple GPUs to do inference in a data parallel way 2025-10-15 19:12:19 +00:00
karpathy
3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00