Mirror of https://github.com/karpathy/nanochat.git, synced 2026-05-09 09:20:04 +00:00.
Raise the default inference_default_max_tokens from 512 to 1024 in chat-api and in modal/serve.py, and raise the hard cap in modal from 2048 to 4096. Fixes mid-sentence cutoffs on longer answers (especially in thinking mode). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-level directories:

- auth
- chat-api
- frontend
- inference