# Chat API Service

Orchestration layer for samosaChaat conversations. Manages conversation state in PostgreSQL, authenticates every request via the auth service, and proxies streaming inference requests via Server-Sent Events.

## Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/api/health` | Liveness probe (unauthenticated) |
| GET | `/api/conversations` | List the authenticated user's conversations, grouped by date |
| POST | `/api/conversations` | Create a new conversation |
| GET | `/api/conversations/{id}` | Fetch a conversation plus its full message history |
| PUT | `/api/conversations/{id}` | Update the conversation title |
| DELETE | `/api/conversations/{id}` | Delete a conversation (cascade-deletes its messages) |
| POST | `/api/conversations/{id}/messages` | Append a user message and stream the assistant response |
| POST | `/api/conversations/{id}/regenerate` | Delete the last assistant message and regenerate it |
| GET | `/api/models` | Proxy to inference `GET /models` |
| POST | `/api/models/swap` | Proxy to inference `POST /models/swap` (admin only) |
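The message and regenerate endpoints stream the assistant reply as Server-Sent Events. As a rough sketch of what a client consumes, here is a minimal SSE line parser; the payload shapes are illustrative, not the actual inference contract:

```python
def iter_sse_events(lines):
    """Yield the data payload of each SSE event.

    An event ends at a blank line; multiple `data:` fields within one
    event are joined with newlines, per the SSE format.
    """
    buffer = []
    for line in lines:
        if line == "":
            # Blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            buffer.append(line[5:].lstrip())
    if buffer:  # stream ended without a trailing blank line
        yield "\n".join(buffer)


# Example: two events, the second with a multi-line data field.
events = list(iter_sse_events(["data: hello", "", "data: a", "data: b", ""]))
# → ["hello", "a\nb"]
```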

All authenticated endpoints expect an `Authorization: Bearer <jwt>` header. The chat API validates the token by calling the auth service's `POST /auth/validate` with the shared `X-Internal-API-Key` header and caches the result for 5 minutes.
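The 5-minute, 1024-entry validation cache could be as simple as the following in-process sketch (hypothetical; the service may well use something like `cachetools.TTLCache` instead):

```python
import time


class TTLCache:
    """Minimal in-process TTL cache sketch: token -> validated user payload.

    Entries expire after `ttl` seconds; once `maxsize` entries exist, the
    oldest-inserted entry is evicted (dicts preserve insertion order).
    """

    def __init__(self, maxsize: int = 1024, ttl: float = 300.0):
        self.maxsize = maxsize
        self.ttl = ttl
        self._data: dict = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            # Evict the oldest-inserted entry to make room.
            self._data.pop(next(iter(self._data)))
        self._data[key] = (time.monotonic() + self.ttl, value)
```

On a cache miss, the auth guard would call `POST /auth/validate` and `set()` the returned user payload before proceeding.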

## Environment

| Variable | Default | Purpose |
| --- | --- | --- |
| `DATABASE_URL` | `postgresql+asyncpg://localhost/samosachaat` | PostgreSQL connection string |
| `AUTH_SERVICE_URL` | `http://auth:8001` | Base URL of the auth service |
| `INFERENCE_SERVICE_URL` | `http://inference:8000` | Base URL of the inference service |
| `INTERNAL_API_KEY` | (required, no default) | Shared key for internal service auth |
| `MAX_CONVERSATION_HISTORY` | `50` | Maximum number of messages included in each inference call |
| `MAX_TOKEN_BUDGET` | `6000` | Character budget approximating a token limit on that history |
| `FRONTEND_URL` | `http://localhost:3000` | Origin allowed by CORS |
| `LOG_LEVEL` | `INFO` | Python log level |
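A stdlib sketch of how these variables might be loaded with the defaults above (the actual service likely uses a settings library such as pydantic-settings; field names here just mirror the table):

```python
import os


def load_settings() -> dict:
    """Read service configuration from the environment.

    Defaults mirror the table above. INTERNAL_API_KEY has no default,
    so a missing value raises KeyError at startup rather than later.
    """
    env = os.environ
    return {
        "database_url": env.get(
            "DATABASE_URL", "postgresql+asyncpg://localhost/samosachaat"
        ),
        "auth_service_url": env.get("AUTH_SERVICE_URL", "http://auth:8001"),
        "inference_service_url": env.get(
            "INFERENCE_SERVICE_URL", "http://inference:8000"
        ),
        "internal_api_key": env["INTERNAL_API_KEY"],  # required, no default
        "max_conversation_history": int(env.get("MAX_CONVERSATION_HISTORY", "50")),
        "max_token_budget": int(env.get("MAX_TOKEN_BUDGET", "6000")),
        "frontend_url": env.get("FRONTEND_URL", "http://localhost:3000"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
```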

## Running locally

```shell
uv pip install -e ".[dev]"
uvicorn src.main:app --reload --port 8002
```

## Running tests

```shell
cd services/chat-api
pytest
```

Tests use SQLite with `aiosqlite` for a throwaway database, `respx` to mock the auth service, and hand-crafted `httpx` mocks for the inference SSE stream.