nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-05 00:40:29 +00:00

Author	SHA1	Message	Date
Artemis Git Integration	5cd79225c4	feat(train): enable torch.compile for chat_sft with fixed shapes for 30-50% speedup	2025-11-05 16:07:54 +00:00
Artemis Git Integration	d8be015b20	feat(chat_sft): add fixed-length padding for torch.compile compatibility. Replace variable-length padding with fixed 2048-token padding to create constant batch shapes, enabling efficient torch.compile in subsequent training steps.	2025-11-05 16:04:26 +00:00
Artemis Git Integration	4d9d10abb0	feat(benchmark): add performance benchmark script for KV-cache optimizations with CLI args, GPU memory tracking, and statistical measurement across iterations	2025-11-03 10:06:02 +00:00
Andrej Karpathy	4346536ab2	also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate	2025-10-16 01:28:37 +00:00
Andrej Karpathy	4c3590c499	fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints	2025-10-15 20:29:54 +00:00
Andrej Karpathy	03fa673b7d	add basic logging to chat_web, which i think might be fun	2025-10-15 19:51:06 +00:00
Andrej Karpathy	52bfeea8bd	add very basic abuse prevention limits to chat_web so it's ok to host endpoints	2025-10-15 19:42:54 +00:00
Andrej Karpathy	01fb290f53	allow multiple GPUs to do inference in a data parallel way	2025-10-15 19:12:19 +00:00
Andrej Karpathy	190d9515d0	don't evaluate the sampling evals during SFT, they are too slow. keep the multiple choice evals. delete unused imports	2025-10-15 16:42:23 +00:00
Andrej Karpathy	b8076dd367	fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation	2025-10-15 16:35:04 +00:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00
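
The message for 4c3590c499 describes holding back streamed output until whole codepoints are available. A minimal sketch of that idea, assuming each generated token maps to a chunk of raw UTF-8 bytes: it uses Python's standard incremental decoder rather than the repository's own logic, which accumulates token sequences and may differ in detail. The name `stream_text` and the sample chunks are hypothetical.

```python
import codecs


def stream_text(byte_chunks):
    """Yield text pieces, holding back bytes until they form whole codepoints.

    `byte_chunks` stands in for the bytes of successive tokens (an assumption;
    the real tokenizer interface is not shown in the commit log).
    """
    # errors="replace" only matters if the stream ends mid-codepoint.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in byte_chunks:
        text = decoder.decode(chunk)  # returns "" while a codepoint is incomplete
        if text:
            yield text
    # Flush whatever is still buffered once the stream is finished.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# The four bytes of one emoji arrive split across two chunks.
chunks = [b"hi ", b"\xf0\x9f", b"\x98\x80", b"!"]
pieces = list(stream_text(chunks))
```

Nothing is emitted for the chunk that ends in the middle of the emoji. The emoji appears only once its last byte has arrived, so a client never sees a partial character.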

11 Commits
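
The message for d8be015b20 describes replacing variable-length padding with a fixed 2048-token length so that every batch has the same shape. A rough sketch of that step on plain Python lists, not the repository's code: the function name `pad_batch_fixed`, the pad id of 0, and the choice to truncate longer sequences are all assumptions made for illustration.

```python
def pad_batch_fixed(sequences, fixed_len=2048, pad_id=0):
    """Pad (or truncate) every token sequence to exactly `fixed_len` tokens.

    pad_id=0 and truncation of over-long sequences are illustrative
    assumptions, not details taken from the commit.
    """
    batch = []
    for seq in sequences:
        seq = list(seq)[:fixed_len]  # drop tokens beyond the fixed length
        batch.append(seq + [pad_id] * (fixed_len - len(seq)))
    return batch


# Two conversations of different lengths become rows of identical length.
small = pad_batch_fixed([[1, 2, 3], [4]], fixed_len=5)
full = pad_batch_fixed([[7] * 10, [8] * 3000])
```

Because every row has the same length regardless of the longest sequence in the batch, a tensor built from the result always has shape (batch_size, 2048). That is the constant-shape property the commit relies on for torch.compile.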