nanochat/services
Manmohan 3ab89e7890
feat: deploy d24-sft-r6 with full reasoning mode + live tool use (Tavily)
Model R6 (97% pass rate on 33-probe eval, val_bpb 0.2635):
- modal/serve.py + modal/_tools.py: tool-aware streaming with
  TavilySearchBackend auto-detect, python_start/end state machine,
  output_start/end forcing; mount tavily secret
- modal/serve.py: MODEL_TAG=d24-sft-r6, model path points at new SFT r6
- services/chat-api/routes/messages.py: accept thinking_mode flag,
  inject samosaChaat system prompt (direct or <think> variant) into
  first user message before streaming to Modal
- services/frontend/components/chat/ChatInput.tsx: Brain toggle
  'Think' button next to send; when active, model uses think mode
- services/frontend/components/chat/ChatWindow.tsx: track
  thinkingMode state, pass through to API body as thinking_mode
- services/frontend/components/chat/MessageBubble.tsx: parse and
  render <think>...</think> as collapsible italic blocks;
  <|python_start|>...<|python_end|> as tool-call cards with icons
  per tool name; <|output_start|>...<|output_end|> as result cards
  with expandable JSON
- nanochat/tools.py: TavilySearchBackend class + auto-detect
- nanochat/ui.html: legacy UI reasoning toggle (kept for parity)
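The python_start/end state machine with output_start/end forcing described above can be sketched roughly as follows. This is a hypothetical illustration, not the code in modal/_tools.py: token names match the commit, but `run_tool` is a stand-in for the real dispatcher (e.g. TavilySearchBackend), and real streaming would operate on partially decoded chunks rather than clean tokens.

```python
def run_tool(call: str) -> str:
    # Stand-in for the real tool dispatcher (e.g. TavilySearchBackend);
    # the actual backend would execute the call and return JSON.
    return f'{{"result": "ran {call}"}}'

def stream_with_tools(tokens):
    """Yield tokens; intercept tool-call spans and force output blocks.

    Two states: "text" (pass tokens through) and "tool" (buffer the call
    body). On <|python_end|>, execute the buffered call and inject a
    forced <|output_start|>...<|output_end|> block into the stream.
    """
    state = "text"
    buf = []
    for tok in tokens:
        if state == "text":
            if tok == "<|python_start|>":
                state, buf = "tool", []
            yield tok
        else:  # inside a tool call
            if tok == "<|python_end|>":
                yield tok
                # Force the output block around the tool result.
                yield "<|output_start|>"
                yield run_tool("".join(buf))
                yield "<|output_end|>"
                state = "text"
            else:
                buf.append(tok)
                yield tok
```

The forcing step is the key design point: the model never has to emit the output delimiters itself; the server guarantees every tool call is followed by a well-formed result block the frontend can render as a card.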
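The MessageBubble parsing rule (split an assistant message into plain text, <think> blocks, tool-call spans, and result spans so each renders differently) is TypeScript in the repo; a Python sketch of the same segmentation, with rendering out of scope and `parse_segments` as a hypothetical name:

```python
import re

# One alternation per delimited span type; DOTALL so spans may contain
# newlines. Tag names match the commit message.
SEGMENT_RE = re.compile(
    r"<think>(?P<think>.*?)</think>"
    r"|<\|python_start\|>(?P<tool>.*?)<\|python_end\|>"
    r"|<\|output_start\|>(?P<output>.*?)<\|output_end\|>",
    re.DOTALL,
)

def parse_segments(text):
    """Return ordered (kind, content) pairs: text / think / tool / output."""
    segments, pos = [], 0
    for m in SEGMENT_RE.finditer(text):
        if m.start() > pos:
            segments.append(("text", text[pos:m.start()]))
        kind = next(k for k, v in m.groupdict().items() if v is not None)
        segments.append((kind, m.group(kind)))
        pos = m.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments
```

The frontend would then map `think` segments to collapsible italic blocks, `tool` segments to tool-call cards keyed by tool name, and `output` segments to result cards with expandable JSON.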

Tool execution verified live: user query -> web_search via Tavily ->
grounded answer about Macron returned.
2026-04-22 13:43:43 -07:00
auth Merge pull request #26 from manmohan659/fix/missing-models 2026-04-16 16:50:22 -04:00
chat-api feat: deploy d24-sft-r6 with full reasoning mode + live tool use (Tavily) 2026-04-22 13:43:43 -07:00
frontend feat: deploy d24-sft-r6 with full reasoning mode + live tool use (Tavily) 2026-04-22 13:43:43 -07:00
inference fix(inference): regenerate uv.lock after structlog/prometheus deps added 2026-04-16 12:49:05 -07:00