mirror of
https://github.com/karpathy/nanochat.git
synced 2026-03-04 16:30:28 +00:00
Replace variable-length padding with fixed 2048-token padding to create constant batch shapes, enabling efficient torch.compile in subsequent training steps |
||
|---|---|---|
| .. | ||
| base_eval.py | ||
| base_loss.py | ||
| base_train.py | ||
| benchmark_optimizations.py | ||
| chat_cli.py | ||
| chat_eval.py | ||
| chat_rl.py | ||
| chat_sft.py | ||
| chat_web.py | ||
| mid_train.py | ||
| tok_eval.py | ||
| tok_train.py | ||