Mirror of https://github.com/karpathy/nanochat.git (synced 2026-05-07 16:30:11 +00:00)
Fixes #581. When conversation_tokens grew beyond model.config.sequence_len, engine.generate() received a zero-dimension tensor and crashed with a matmul shape error. This change adds a sliding-window guard before each generate() call: it keeps the most recent (sequence_len - max_new_tokens) tokens, re-inserts the BOS token so the sequence stays well formed, and notifies the user when truncation occurs.
| File |
|---|
| base_eval.py |
| base_train.py |
| chat_cli.py |
| chat_eval.py |
| chat_rl.py |
| chat_sft.py |
| chat_web.py |
| tok_eval.py |
| tok_train.py |
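
The sliding-window guard described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name `truncate_conversation` and the parameter `bos_id` are hypothetical, and the sketch assumes that the re-inserted BOS token counts toward the (sequence_len - max_new_tokens) budget, which the commit message does not specify.

```python
from typing import List, Tuple


def truncate_conversation(
    tokens: List[int],
    bos_id: int,
    sequence_len: int,
    max_new_tokens: int,
) -> Tuple[List[int], bool]:
    """Hypothetical helper: trim a token history so generation still fits.

    Returns (tokens, truncated). The result never exceeds
    sequence_len - max_new_tokens entries, BOS included (an assumption).
    """
    budget = sequence_len - max_new_tokens
    if budget < 2:
        # Need room for BOS plus at least one real token.
        raise ValueError("max_new_tokens leaves no room for the prompt")

    if len(tokens) <= budget:
        # History already fits; return an unchanged copy.
        return list(tokens), False

    # Keep the most recent (budget - 1) tokens and put BOS back in front
    # so the sequence still starts the way the model expects.
    tail = list(tokens[-(budget - 1):])
    return [bos_id] + tail, True


if __name__ == "__main__":
    history = list(range(10))  # stand-in for conversation_tokens, BOS id = 0
    window, truncated = truncate_conversation(
        history, bos_id=0, sequence_len=8, max_new_tokens=3
    )
    if truncated:
        print("Note: conversation truncated to fit the context window.")
    print(window)  # [0, 6, 7, 8, 9]
```

Calling a helper like this immediately before each `engine.generate()` call would keep the prompt non-empty and within the model's context length, at the cost of silently forgetting the oldest turns, which is why the user is notified.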