nanochat/modal
Manmohan Sharma 57be688fdc
fix(serve): don't scan our own injected tokens for the loop-break check
Bug: after runtime tool injection, the post-injection break scanned gen_ids[pre_injection_len:], which included our own injected <|output_start|>…<|output_end|> tokens, so the loop-break fired immediately and stopped the turn before the model could write its final answer. This was visible on multi-turn queries, e.g. a follow-up "tell me more about him", where the model naturally issued a tool call, received real Tavily output, and was then cut off. Fix: track post_injection_start (the index AFTER the injected tokens) and only scan from there for stray markers.
2026-04-22 15:15:34 -07:00
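A minimal sketch of the fix described in the commit message above. The names (gen_ids, post_injection_start, inject_tool_output, should_break) and the marker strings are assumptions for illustration, not the actual serve.py code:

```python
def inject_tool_output(gen_ids, injected_tokens):
    """Append tool-output tokens and return the index AFTER them
    (post_injection_start), so later scans skip what WE injected."""
    gen_ids.extend(injected_tokens)
    return len(gen_ids)

def should_break(gen_ids, scan_from,
                 stray_markers=("<|output_start|>", "<|output_end|>")):
    """Loop-break check: scan only tokens generated from scan_from onward
    for stray output markers."""
    return any(tok in stray_markers for tok in gen_ids[scan_from:])

# Simulate the scenario from the commit message.
gen_ids = ["the", "<|tool_call|>"]
pre_injection_len = len(gen_ids)
injected = ["<|output_start|>", "tavily result", "<|output_end|>"]
post_injection_start = inject_tool_output(gen_ids, injected)

# Old bug: scanning from pre_injection_len sees our own injected markers,
# so the break fires immediately and the turn ends early.
assert should_break(gen_ids, pre_injection_len) is True

# Fix: scanning from post_injection_start ignores the injected tokens,
# letting the model continue and write its final answer.
assert should_break(gen_ids, post_injection_start) is False
```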
_model.py feat: deploy d24 SFT + polished UI redesign with dark mode (#39) 2026-04-16 19:55:16 -04:00
_query_classifier.py fix(tools): force web_search on tool-worthy queries + strip orphan markers in UI 2026-04-22 15:01:07 -07:00
_tokenizer.py feat: deploy d24 SFT + polished UI redesign with dark mode (#39) 2026-04-16 19:55:16 -04:00
_tools.py fix(tools): enable Tavily include_answer and fix UI overflow 2026-04-22 14:20:47 -07:00
serve.py fix(serve): don't scan our own injected tokens for the loop-break check 2026-04-22 15:15:34 -07:00