# Chat API Service

Orchestration layer for samosaChaat conversations. Manages conversation state in PostgreSQL, authenticates every request via the auth service, and proxies streaming inference requests via Server-Sent Events.

## Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/api/health` | Liveness probe (unauthenticated) |
| GET | `/api/conversations` | List the authenticated user's conversations, grouped by date |
| POST | `/api/conversations` | Create a new conversation |
| GET | `/api/conversations/{id}` | Fetch a conversation plus its full message history |
| PUT | `/api/conversations/{id}` | Update the conversation title |
| DELETE | `/api/conversations/{id}` | Delete a conversation (cascade-deletes its messages) |
| POST | `/api/conversations/{id}/messages` | Append a user message and stream the assistant response |
| POST | `/api/conversations/{id}/regenerate` | Delete the last assistant message and regenerate it |
| GET | `/api/models` | Proxy to inference `GET /models` |
| POST | `/api/models/swap` | Proxy to inference `POST /models/swap` (admin only) |
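The message and regenerate endpoints stream the assistant reply as Server-Sent Events. As a rough sketch of what a client consumes, here is a minimal SSE line parser; the payload shapes are illustrative, not the actual inference contract:

```python
def iter_sse_events(lines):
    """Yield the data payload of each SSE event.

    An event ends at a blank line; multiple `data:` fields within one
    event are joined with newlines, per the SSE format.
    """
    buffer = []
    for line in lines:
        if line == "":
            # Blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            buffer.append(line[5:].lstrip())
    if buffer:  # stream ended without a trailing blank line
        yield "\n".join(buffer)


# Example: two events, the second with a multi-line data field.
events = list(iter_sse_events(["data: hello", "", "data: a", "data: b", ""]))
# → ["hello", "a\nb"]
```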

All authenticated endpoints expect an `Authorization: Bearer <jwt>` header. The chat API validates the token by calling the auth service's `POST /auth/validate` with the shared `X-Internal-API-Key` header and caches the result for 5 minutes.
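The 5-minute, 1024-entry validation cache could be as simple as the following in-process sketch (hypothetical; the service may well use something like `cachetools.TTLCache` instead):

```python
import time


class TTLCache:
    """Minimal in-process TTL cache sketch: token -> validated user payload.

    Entries expire after `ttl` seconds; once `maxsize` entries exist, the
    oldest-inserted entry is evicted (dicts preserve insertion order).
    """

    def __init__(self, maxsize: int = 1024, ttl: float = 300.0):
        self.maxsize = maxsize
        self.ttl = ttl
        self._data: dict = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            # Evict the oldest-inserted entry to make room.
            self._data.pop(next(iter(self._data)))
        self._data[key] = (time.monotonic() + self.ttl, value)
```

On a cache miss, the auth guard would call `POST /auth/validate` and `set()` the returned user payload before proceeding.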

## Environment

| Variable | Default | Purpose |
| --- | --- | --- |
| `DATABASE_URL` | `postgresql+asyncpg://localhost/samosachaat` | PostgreSQL connection string |
| `AUTH_SERVICE_URL` | `http://auth:8001` | Base URL of the auth service |
| `INFERENCE_SERVICE_URL` | `http://inference:8000` | Base URL of the inference service |
| `INTERNAL_API_KEY` | (required, no default) | Shared key for internal service auth |
| `MAX_CONVERSATION_HISTORY` | `50` | Maximum number of messages included in each inference call |
| `MAX_TOKEN_BUDGET` | `6000` | Character budget approximating a token limit on that history |
| `FRONTEND_URL` | `http://localhost:3000` | Origin allowed by CORS |
| `LOG_LEVEL` | `INFO` | Python log level |
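A stdlib sketch of how these variables might be loaded with the defaults above (the actual service likely uses a settings library such as pydantic-settings; field names here just mirror the table):

```python
import os


def load_settings() -> dict:
    """Read service configuration from the environment.

    Defaults mirror the table above. INTERNAL_API_KEY has no default,
    so a missing value raises KeyError at startup rather than later.
    """
    env = os.environ
    return {
        "database_url": env.get(
            "DATABASE_URL", "postgresql+asyncpg://localhost/samosachaat"
        ),
        "auth_service_url": env.get("AUTH_SERVICE_URL", "http://auth:8001"),
        "inference_service_url": env.get(
            "INFERENCE_SERVICE_URL", "http://inference:8000"
        ),
        "internal_api_key": env["INTERNAL_API_KEY"],  # required, no default
        "max_conversation_history": int(env.get("MAX_CONVERSATION_HISTORY", "50")),
        "max_token_budget": int(env.get("MAX_TOKEN_BUDGET", "6000")),
        "frontend_url": env.get("FRONTEND_URL", "http://localhost:3000"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
```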

## Running locally

```shell
uv pip install -e ".[dev]"
uvicorn src.main:app --reload --port 8002
```

## Running tests

```shell
cd services/chat-api
pytest
```

Tests use SQLite with `aiosqlite` for a throwaway database, `respx` to mock the auth service, and hand-crafted `httpx` mocks for the inference SSE stream.