nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-03-04 16:30:28 +00:00

Artemis Git Integration a381fc406d refactor(chat_sft): use uncompiled model for eval and checkpointing to prevent recompilation. Use orig_model instead of the compiled model for engine init, MMLU/ARC-Easy eval, and checkpoint saving to avoid recompilation on variable-length inputs.	2025-11-05 16:09:43 +00:00
base_eval.py	initial commit	2025-10-13 06:49:24 -07:00
base_loss.py	initial commit	2025-10-13 06:49:24 -07:00
base_train.py	initial commit	2025-10-13 06:49:24 -07:00
benchmark_optimizations.py	feat(benchmark): add performance benchmark script for KV-cache optimizations with CLI args, GPU memory tracking, and statistical measurement across iterations	2025-11-03 10:06:02 +00:00
chat_cli.py	initial commit	2025-10-13 06:49:24 -07:00
chat_eval.py	initial commit	2025-10-13 06:49:24 -07:00
chat_rl.py	initial commit	2025-10-13 06:49:24 -07:00
chat_sft.py	refactor(chat_sft): use uncompiled model for eval and checkpointing to prevent recompilation	2025-11-05 16:09:43 +00:00
chat_web.py	also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate	2025-10-16 01:28:37 +00:00
mid_train.py	fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation	2025-10-15 16:35:04 +00:00
tok_eval.py	initial commit	2025-10-13 06:49:24 -07:00
tok_train.py	initial commit	2025-10-13 06:49:24 -07:00