mirror of https://github.com/karpathy/nanochat.git synced 2026-05-08 16:59:59 +00:00

Manmohan Sharma 8153a4fadf	2026-04-16 11:49:51 -07:00

feat(chat-api): conversation orchestration + SSE streaming proxy (#6)

- FastAPI service that manages conversations and messages in PostgreSQL (SQLAlchemy 2.0 async + asyncpg) and streams assistant responses back to the client via sse-starlette, forwarding the inference service SSE contract unchanged.
- Auth guard validates every request against the auth service /auth/validate endpoint (X-Internal-API-Key) and caches results in an in-process TTL cache (5 min, 1024 entries) to absorb request bursts.
- Every query filters by authenticated user_id; cross-user access returns 404. Message send flow auto-titles the first message, persists the streamed assistant response after the client disconnects, and records token_count + inference_time_ms.
- /api/models{,/swap} proxies the inference admin surface; swap requires is_admin on the validated user.
- Structured JSON logging via structlog with trace_id + user_id ContextVars attached to every log line.
- Test suite (pytest + aiosqlite + respx) covers CRUD, user scoping, streaming SSE persistence, regenerate, model proxy admin gate, and the stream proxy error path. 16/16 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
src	feat(chat-api): conversation orchestration + SSE streaming proxy (#6)	2026-04-16 11:49:51 -07:00
Dockerfile	feat(chat-api): conversation orchestration + SSE streaming proxy (#6)	2026-04-16 11:49:51 -07:00
pyproject.toml	feat(chat-api): conversation orchestration + SSE streaming proxy (#6)	2026-04-16 11:49:51 -07:00
README.md	feat(chat-api): conversation orchestration + SSE streaming proxy (#6)	2026-04-16 11:49:51 -07:00

README.md

Chat API Service

Orchestration layer for samosaChaat conversations. Manages conversation state in PostgreSQL, authenticates every request via the auth service, and proxies streaming inference requests via Server-Sent Events.

Endpoints

Method	Path	Description
GET	`/api/health`	Liveness probe (unauthenticated)
GET	`/api/conversations`	List the authenticated user's conversations, grouped by date
POST	`/api/conversations`	Create a new conversation
GET	`/api/conversations/{id}`	Fetch a conversation + full message history
PUT	`/api/conversations/{id}`	Update the conversation title
DELETE	`/api/conversations/{id}`	Delete a conversation (cascade deletes messages)
POST	`/api/conversations/{id}/messages`	Append a user message and stream the assistant response
POST	`/api/conversations/{id}/regenerate`	Delete the last assistant message and regenerate it
GET	`/api/models`	Proxy to inference `GET /models`
POST	`/api/models/swap`	Proxy to inference `POST /models/swap` (admin only)
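
A client of the streaming endpoints above receives a Server-Sent Events body. The sketch below shows one way to split such a body into event payloads, following the general SSE framing rules (`data:` fields, blank line ends an event, lines starting with `:` are comments). The function name `iter_sse_data` is hypothetical, and this README does not specify the payload format, so the payloads are returned as raw strings.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    `lines` is any iterable of text lines, e.g. a streamed HTTP body
    split on newlines. Only the `data` field is handled in this sketch.
    """
    buf: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buf:
                yield "\n".join(buf)
                buf = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips a single leading space
        if field == "data":
            buf.append(value)
        # Other fields (event, id, retry) are ignored here.
    # An event not terminated by a blank line is discarded.
```

Multiple `data:` lines within one event are joined with newlines, so a consumer can decode each yielded string in whatever format the inference service actually emits.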

All authenticated endpoints expect `Authorization: Bearer <jwt>`. The chat API validates the token by calling the auth service `POST /auth/validate` with the shared `X-Internal-API-Key` header and caches the result for 5 minutes.
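
The validate-then-cache step can be sketched roughly as follows. This is an illustration only, not the service's actual code: `TTLCache`, `validate_token`, and the injected `fetch` callable (standing in for the HTTP call to `POST /auth/validate`) are hypothetical names. The 300-second TTL matches the 5 minutes stated above, and the 1024-entry bound is taken from the commit message.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Bounded in-process cache whose entries expire after `ttl` seconds."""

    def __init__(self, maxsize: int = 1024, ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock  # injectable so tests can control time
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data.pop(key, None)
        self._data[key] = (self.clock() + self.ttl, value)
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict the oldest insertion


def validate_token(token: str, cache: TTLCache,
                   fetch: Callable[[str], Any]) -> Any:
    """Return the validation result, calling `fetch` only on a cache miss."""
    cached = cache.get(token)
    if cached is not None:
        return cached
    user = fetch(token)  # hypothetical wrapper around POST /auth/validate
    cache.set(token, user)
    return user
```

With a cache like this, repeated requests carrying the same token within the TTL trigger a single call to the auth service, which is the burst-absorbing behavior the commit message describes.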

Environment

Variable	Default	Purpose
`DATABASE_URL`	`postgresql+asyncpg://localhost/samosachaat`	PostgreSQL connection string
`AUTH_SERVICE_URL`	`http://auth:8001`	Base URL of the auth service
`INFERENCE_SERVICE_URL`	`http://inference:8000`	Base URL of the inference service
`INTERNAL_API_KEY`	—	Shared key for internal service auth
`MAX_CONVERSATION_HISTORY`	`50`	Max messages included in each inference call
`MAX_TOKEN_BUDGET`	`6000`	Character budget, used as a proxy for token count, for the history included in each inference call
`FRONTEND_URL`	`http://localhost:3000`	Origin allowed by CORS
`LOG_LEVEL`	`INFO`	Python log level
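
As a rough sketch of how the variables above might be consumed, the block below loads them with the documented defaults and applies the two history limits. The `Settings` class, `load_settings`, and `trim_history` are hypothetical names, the `content` key on each message is an assumption, and the trimming strategy (keep the newest messages that fit both limits) is one plausible reading of the table, not the service's confirmed behavior.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    auth_service_url: str
    inference_service_url: str
    internal_api_key: Optional[str]
    max_conversation_history: int
    max_token_budget: int
    frontend_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from `env`, falling back to the documented defaults."""
    return Settings(
        database_url=env.get(
            "DATABASE_URL", "postgresql+asyncpg://localhost/samosachaat"),
        auth_service_url=env.get("AUTH_SERVICE_URL", "http://auth:8001"),
        inference_service_url=env.get(
            "INFERENCE_SERVICE_URL", "http://inference:8000"),
        internal_api_key=env.get("INTERNAL_API_KEY"),  # no default
        max_conversation_history=int(env.get("MAX_CONVERSATION_HISTORY", "50")),
        max_token_budget=int(env.get("MAX_TOKEN_BUDGET", "6000")),
        frontend_url=env.get("FRONTEND_URL", "http://localhost:3000"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


def trim_history(messages: List[Dict[str, str]], max_messages: int,
                 char_budget: int) -> List[Dict[str, str]]:
    """Keep the newest messages that fit both limits, preserving order.

    Assumption: the newest message is always kept, even if it alone
    exceeds the character budget.
    """
    if max_messages <= 0:
        return []
    kept: List[Dict[str, str]] = []
    used = 0
    for msg in reversed(messages[-max_messages:]):
        size = len(msg["content"])
        if kept and used + size > char_budget:
            break
        kept.append(msg)
        used += size
    return list(reversed(kept))
```

For example, `trim_history(history, settings.max_conversation_history, settings.max_token_budget)` would produce the list to forward to the inference service under these assumptions.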

Running locally

uv pip install -e ".[dev]"
uvicorn src.main:app --reload --port 8002

Running tests

cd services/chat-api
pytest

Tests use SQLite + aiosqlite for a throwaway database, respx to mock the auth service, and hand-crafted httpx mocks for the inference SSE stream.