nanochat/services/chat-api/src/config.py
Manmohan Sharma 8153a4fadf
feat(chat-api): conversation orchestration + SSE streaming proxy (#6)
- FastAPI service that manages conversations and messages in PostgreSQL
  (SQLAlchemy 2.0 async + asyncpg) and streams assistant responses back
  to the client via sse-starlette, forwarding the inference service SSE
  contract unchanged.
- Auth guard validates every request against the auth service
  /auth/validate endpoint (X-Internal-API-Key) and caches results in an
  in-process TTL cache (5 min, 1024 entries) to absorb request bursts.
- Every query filters by authenticated user_id; cross-user access
  returns 404. Message send flow auto-titles the first message,
  persists the streamed assistant response after the client disconnects,
  and records token_count + inference_time_ms.
- /api/models{,/swap} proxies the inference admin surface; swap
  requires is_admin on the validated user.
- Structured JSON logging via structlog with trace_id + user_id
  ContextVars attached to every log line.
- Test suite (pytest + aiosqlite + respx) covers CRUD, user scoping,
  streaming SSE persistence, regenerate, model proxy admin gate,
  and the stream proxy error path. 16/16 passing.
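The in-process TTL cache described in the auth-guard bullet can be sketched with the stdlib alone. This `TTLCache` class is a hypothetical stand-in for whatever the service actually uses (a library such as cachetools would be a common choice); the 300-second TTL and 1024-entry cap mirror the defaults in `config.py` below.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Tiny in-process TTL cache: expired entries are dropped on read,
    and the oldest insertion is evicted once max_size is exceeded."""

    def __init__(self, max_size: int = 1024, ttl: float = 300.0) -> None:
        self.max_size = max_size
        self.ttl = ttl
        # key -> (absolute expiry time, cached value), in insertion order
        self._store: OrderedDict[str, tuple[float, object]] = OrderedDict()

    def get(self, key: str):
        item = self._store.get(key)
        if item is None:
            return None
        expires_at, value = item
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily expire on access
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the oldest entry
```

In the flow described above, the cache key would presumably be the caller's token and the value the validated user payload, so a burst of requests hits the cache instead of `/auth/validate`.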
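Persisting the streamed assistant response after disconnect implies accumulating the SSE `data:` payloads as they are forwarded. A minimal sketch of that accumulation, assuming an OpenAI-style contract where each event carries a `{"delta": "..."}` JSON body and the stream ends with `data: [DONE]` (the real inference contract may use different field names):

```python
import json


def accumulate_sse(lines):
    """Collect the text deltas from an iterable of raw SSE lines.

    Assumes each event line looks like `data: {"delta": "..."}` and the
    stream is terminated by `data: [DONE]`; both are assumptions about
    the inference service's contract, not confirmed by this repo.
    """
    parts: list[str] = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alives, comments, and event: lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload).get("delta", ""))
    return "".join(parts)
```

The proxy can forward each raw line to the client unchanged while feeding the same lines through an accumulator like this, then write the joined text (plus token_count and inference_time_ms) to PostgreSQL once the stream closes.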

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 11:49:51 -07:00


"""Runtime configuration for the chat API service."""
from __future__ import annotations
from functools import lru_cache
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
model_config = SettingsConfigDict(env_file=".env", extra="ignore")
database_url: str = Field(default="postgresql+asyncpg://localhost/samosachaat")
auth_service_url: str = Field(default="http://auth:8001")
inference_service_url: str = Field(default="http://inference:8000")
internal_api_key: str = Field(default="")
max_conversation_history: int = Field(default=50)
max_token_budget: int = Field(default=6000)
auth_cache_ttl_seconds: int = Field(default=300)
auth_cache_max_size: int = Field(default=1024)
inference_default_temperature: float = Field(default=0.8)
inference_default_max_tokens: int = Field(default=512)
inference_default_top_k: int = Field(default=50)
frontend_url: str = Field(default="http://localhost:3000")
log_level: str = Field(default="INFO")
@lru_cache(maxsize=1)
def get_settings() -> Settings:
return Settings()
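The `lru_cache(maxsize=1)` wrapper makes `get_settings()` a process-wide singleton, so every FastAPI dependency shares one `Settings` instance and the `.env` file is only parsed once. The same pattern can be shown with a plain dataclass as a stand-in for the pydantic model (names here are illustrative, not from the service):

```python
import os
from dataclasses import dataclass, field
from functools import lru_cache


@dataclass(frozen=True)
class FakeSettings:
    """Stand-in for the pydantic Settings above: env var overrides default."""
    database_url: str = field(
        default_factory=lambda: os.environ.get(
            "DATABASE_URL", "postgresql+asyncpg://localhost/samosachaat"
        )
    )


@lru_cache(maxsize=1)
def get_fake_settings() -> FakeSettings:
    # Constructed once per process; later env changes are NOT picked up.
    return FakeSettings()
```

One consequence worth noting: environment changes after the first call are invisible until the cache is cleared, so tests that override env vars typically call `get_settings.cache_clear()` between cases.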