nanochat/modal
Manmohan Sharma 7a92f5b016
fix(serve): detect tool markers in text stream not token ids
The SFT loader tokenizes assistant content with .encode() (ordinary), not .encode_special(), so the model was trained to emit <|python_start|> / <|python_end|> as the 7-token ordinary sequence [60, 124, 25145, 95, 17104, 124, 62] rather than as the single special token id 32764. The prior state machine matched token_id == python_start_id, which therefore never fired: tool calls were never executed, and the model simply hallucinated fake tool results ("Official leadership page", etc.). Fix: detect the markers in the decoded text stream, parse the payload between <|python_start|> and <|python_end|>, execute the tool, and inject the real <|output_start|>…<|output_end|> tokens into both the SSE stream and the model's input_ids. Next-token prediction is now grounded on real Tavily output.
2026-04-22 14:39:36 -07:00
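The commit describes scanning the decoded text stream for marker strings instead of comparing token ids. A minimal sketch of that idea, assuming the marker strings from the commit message; the `scan_stream`/`run_tool` names and the buffering strategy are hypothetical illustrations, not nanochat's actual serve.py API:

```python
# Hypothetical sketch: detect tool markers in decoded TEXT, not token ids,
# since ordinary encoding splits "<|python_start|>" across several tokens.
PY_START = "<|python_start|>"
PY_END = "<|python_end|>"

def scan_stream(chunks, run_tool):
    """Yield plain text; when a <|python_start|>...<|python_end|> span
    completes, execute the payload via run_tool and yield its result."""
    buf = ""
    in_tool = False
    for chunk in chunks:
        buf += chunk  # markers may straddle chunk boundaries, so buffer
        while True:
            if not in_tool:
                i = buf.find(PY_START)
                if i == -1:
                    # Hold back a possible marker prefix at the buffer tail.
                    safe = len(buf) - (len(PY_START) - 1)
                    if safe > 0:
                        yield buf[:safe]
                        buf = buf[safe:]
                    break
                if i:
                    yield buf[:i]
                buf = buf[i + len(PY_START):]
                in_tool = True
            else:
                j = buf.find(PY_END)
                if j == -1:
                    break  # payload not yet complete; wait for more chunks
                payload, buf = buf[:j], buf[j + len(PY_END):]
                in_tool = False
                yield run_tool(payload)  # real tool output replaces the span
    if buf and not in_tool:
        yield buf
```

Holding back `len(PY_START) - 1` trailing characters ensures a marker split across two decoded chunks is still caught; the real fix would additionally re-tokenize the tool output as <|output_start|>…<|output_end|> and append it to input_ids.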
_model.py feat: deploy d24 SFT + polished UI redesign with dark mode (#39) 2026-04-16 19:55:16 -04:00
_tokenizer.py feat: deploy d24 SFT + polished UI redesign with dark mode (#39) 2026-04-16 19:55:16 -04:00
_tools.py fix(tools): enable Tavily include_answer and fix UI overflow 2026-04-22 14:20:47 -07:00
serve.py fix(serve): detect tool markers in text stream not token ids 2026-04-22 14:39:36 -07:00