Chat API Service

Orchestration layer for samosaChaat conversations. Manages conversation state in PostgreSQL, authenticates every request via the auth service, and proxies streaming inference requests via Server-Sent Events.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/health | Liveness probe (unauthenticated) |
| GET | /api/conversations | List the authenticated user's conversations, grouped by date |
| POST | /api/conversations | Create a new conversation |
| GET | /api/conversations/{id} | Fetch a conversation + full message history |
| PUT | /api/conversations/{id} | Update the conversation title |
| DELETE | /api/conversations/{id} | Delete a conversation (cascade deletes messages) |
| POST | /api/conversations/{id}/messages | Append a user message and stream the assistant response |
| POST | /api/conversations/{id}/regenerate | Delete the last assistant message and regenerate it |
| GET | /api/models | Proxy to inference GET /models |
| POST | /api/models/swap | Proxy to inference POST /models/swap (admin only) |
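
The messages and regenerate endpoints stream the assistant response as Server-Sent Events. The sketch below shows one way a client might pull the `data:` payloads out of such a stream. It follows the generic SSE framing (events separated by a blank line, comment lines starting with `:`); the helper name `iter_sse_data` is hypothetical, and the payload format this service puts inside each event is not documented here, so the payloads are returned as raw strings.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete Server-Sent Event.

    `lines` is any iterable of text lines, with or without trailing newlines.
    Only the `data` field is collected; other fields are ignored.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Comment line, commonly used as a keep-alive ping.
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # SSE framing strips one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event that is not closed by a blank line is dropped, as in the SSE spec.
```

With an HTTP client that exposes the response body line by line (for example the `iter_lines()` method of a streamed `httpx` response), the same function could be fed directly from the network.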

All authenticated endpoints expect an Authorization: Bearer <jwt> header. The chat API validates the token by calling the auth service's POST /auth/validate endpoint with the shared X-Internal-API-Key header, and caches the result for 5 minutes.
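
The validate-then-cache behavior can be illustrated with a small time-based cache. This is a minimal sketch, not the service's actual code: `TokenCache` and the injected `validate` callable are hypothetical names, the real service is presumably asynchronous, and whether failed validations are cached is not stated above (here an exception from `validate` simply leaves nothing cached).

```python
import time
from typing import Any, Callable, Dict, Tuple

CACHE_TTL_SECONDS = 300  # 5 minutes, matching the cache lifetime described above


class TokenCache:
    """Cache token validation results for a fixed time-to-live."""

    def __init__(
        self,
        validate: Callable[[str], Any],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # `validate` stands in for the call to POST /auth/validate (assumption).
        self._validate = validate
        self._ttl = ttl
        self._clock = clock
        # Maps token -> (expiry time, validation result).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_user(self, token: str) -> Any:
        now = self._clock()
        entry = self._entries.get(token)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the auth service round trip
        user = self._validate(token)  # raises on failure, so nothing is cached
        self._entries[token] = (now + self._ttl, user)
        return user
```

Injecting the clock keeps the expiry logic testable without sleeping. A production version would also need to bound the cache size so that stale tokens do not accumulate.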

Environment

| Variable | Default | Purpose |
| --- | --- | --- |
| DATABASE_URL | postgresql+asyncpg://localhost/samosachaat | PostgreSQL connection string |
| AUTH_SERVICE_URL | http://auth:8001 | Base URL of the auth service |
| INFERENCE_SERVICE_URL | http://inference:8000 | Base URL of the inference service |
| INTERNAL_API_KEY | | Shared key for internal service auth |
| MAX_CONVERSATION_HISTORY | 50 | Max messages included in each inference call |
| MAX_TOKEN_BUDGET | 6000 | Character budget, used as a rough proxy for token count, for the history sent in each inference call |
| FRONTEND_URL | http://localhost:3000 | Origin allowed by CORS |
| LOG_LEVEL | INFO | Python log level |
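
To make the interaction between MAX_CONVERSATION_HISTORY and MAX_TOKEN_BUDGET concrete, here is a hedged sketch of how a history window could be trimmed before an inference call. The function name `trim_history`, the `{"role", "content"}` message shape, and the newest-first trimming policy are all assumptions for illustration; the service's actual logic may differ.

```python
import os
from typing import Dict, List

# Defaults mirror the documented values; both can be overridden via the environment.
MAX_CONVERSATION_HISTORY = int(os.environ.get("MAX_CONVERSATION_HISTORY", "50"))
MAX_TOKEN_BUDGET = int(os.environ.get("MAX_TOKEN_BUDGET", "6000"))


def trim_history(
    messages: List[Dict[str, str]],
    max_messages: int = MAX_CONVERSATION_HISTORY,
    max_chars: int = MAX_TOKEN_BUDGET,
) -> List[Dict[str, str]]:
    """Keep the newest messages that fit both the count and character limits."""
    # Guard against max_messages == 0, because messages[-0:] is the whole list.
    recent = messages[-max_messages:] if max_messages > 0 else []
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk from newest to oldest so the most recent context survives.
    for message in reversed(recent):
        size = len(message["content"])
        if used + size > max_chars:
            break
        kept.append(message)
        used += size
    kept.reverse()  # restore chronological order
    return kept
```

Counting characters instead of tokens avoids a tokenizer dependency in the API layer, at the cost of being only an approximation of the model's real context usage.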

Running locally

uv pip install -e ".[dev]"
uvicorn src.main:app --reload --port 8002

Running tests

cd services/chat-api
pytest

Tests use SQLite + aiosqlite for a throwaway database, respx to mock the auth service, and hand-crafted httpx mocks for the inference SSE stream.