nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-05-08 16:59:59 +00:00

Author	SHA1	Message	Date
Manmohan	b5fbebb63f	Merge pull request #26 from manmohan659/fix/missing-models fix: add missing SQLAlchemy models to auth and chat-api	2026-04-16 16:50:22 -04:00
Manmohan Sharma	8a95a76522	fix: add missing models/ dirs to auth and chat-api services Root .gitignore had `models/` which matched both ML weights AND SQLAlchemy model dirs. Changed to `/models/` (root only). Added auth/src/models/ (User) and chat-api/src/models/ (Conversation, Message). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:50:08 -07:00
Manmohan Sharma	2061f8848b	fix(docker): add structlog + prometheus deps to auth and chat-api Dockerfiles Auth service was crash-looping with ModuleNotFoundError for prometheus_fastapi_instrumentator. Chat-api was also missing it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:46:53 -07:00
Manmohan Sharma	aa7a907063	feat(frontend): wire frontend to real backend auth + chat-api services Remove NextAuth and replace with token-based auth against the backend auth service (OAuth + JWT). The frontend now redirects login to /api/auth/google and /api/auth/github (proxied by nginx to the auth service), captures the JWT from the redirect query param, and uses it for all API calls. Key changes: - Remove next-auth dependency and all NextAuth config/routes - Add lib/auth-client.ts (JWT token storage + auth headers) - Add hooks/useAuth.ts (client-side auth state + token capture) - Rewrite middleware.ts to pass-through (client-side auth only) - Login page uses plain <a> links to /api/auth/{provider} - Chat page captures access_token from OAuth redirect - Zustand store fetches conversations from real chat-api via JWT - API routes proxy /api/conversations/* to chat-api with auth - chat/stream route supports conversationId + auth header forwarding - useSSE hook accepts auth headers for authenticated streaming - Sidebar loads conversations from API, supports delete - Landing page (Hero, LandingNav) uses useAuth instead of useSession - Add .env.production.example and scripts/generate-jwt-keys.sh Mock echo fallback preserved when CHAT_API_URL is not set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:21:38 -07:00
Manmohan Sharma	07892c0f00	fix(inference): regenerate uv.lock after structlog/prometheus deps added The observability PR added structlog and prometheus-fastapi-instrumentator to inference pyproject.toml but did not regenerate uv.lock, causing Docker build to fail with --locked flag. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 12:49:05 -07:00
Manmohan Sharma	aa0818aae2	feat(observability): Prometheus + Grafana + Loki stack for samosaChaat (#9) Replaces the helm/observability scaffold with a real monitoring stack wired into the samosaChaat platform. Helm chart (helm/observability/) - Chart.yaml declares kube-prometheus-stack (~62.0) and loki-stack (~2.10) as subchart dependencies. - values.yaml configures Prometheus (15d retention, 50Gi PVC, ServiceMonitor + rule selector on app.kubernetes.io/part-of: samosachaat), Alertmanager (10Gi PVC), Grafana (OAuth-only via GitHub + Google, local login disabled, Prometheus + Loki datasources, dashboards auto-provisioned from a ConfigMap, email + Slack contact points with a critical route to Slack), Loki (50Gi, 30d retention, tsdb schema), and Promtail (JSON pipeline that lifts level / service / trace_id / user_id into labels, scrape config with pod labels). - Alert rules: HighCPU, HighMemory, DiskSpaceLow, High5xxRate, InferenceServiceDown, HighP99Latency. - templates/grafana-dashboards-configmap.yaml renders every file under dashboards/ into a single grafana_dashboard=1 ConfigMap. - dashboards/node-health.json, app-performance.json, inference.json - fully-formed Grafana dashboards with Prometheus datasource variable, templated app selector, thresholded gauges, and LogQL-ready labels. Scraping (helm/samosachaat/templates/servicemonitor.yaml) - ServiceMonitor CRs for auth / chat-api / inference that Prometheus picks up via the part-of=samosachaat selector; scrapes /metrics every 15s and replaces the app label so dashboards line up. Application instrumentation - services/{auth,chat-api,inference} each depend on prometheus-fastapi-instrumentator and expose /metrics (request count, latency histograms, in-progress gauges). - services/auth/src/logging_setup.py and services/inference/src/logging_setup.py mirror the canonical chat-api implementation - structlog JSON with service, trace_id, user_id context injection. - configure_logging() is called at create_app() in auth and inference; inference's main.py now uses structlog via get_logger() instead of logging.getLogger. - log_level setting added to auth + inference config (LOG_LEVEL env). Docs - contracts/logging-standard.md defines the required JSON fields, Python (structlog) + Node.js (pino) implementations, LogQL examples for cross-service queries, and the x-trace-id propagation contract. Closes #9 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 12:29:16 -07:00
Manmohan	1e2fc09ca6	Merge pull request #17 from manmohan659/feat/chat-api-service feat(chat-api): conversation orchestration + SSE streaming proxy (#6)	2026-04-16 14:57:10 -04:00
Manmohan Sharma	8153a4fadf	feat(chat-api): conversation orchestration + SSE streaming proxy (#6) - FastAPI service that manages conversations and messages in PostgreSQL (SQLAlchemy 2.0 async + asyncpg) and streams assistant responses back to the client via sse-starlette, forwarding the inference service SSE contract unchanged. - Auth guard validates every request against the auth service /auth/validate endpoint (X-Internal-API-Key) and caches results in an in-process TTL cache (5 min, 1024 entries) to absorb request bursts. - Every query filters by authenticated user_id; cross-user access returns 404. Message send flow auto-titles the first message, persists the streamed assistant response after the client disconnects, and records token_count + inference_time_ms. - /api/models{,/swap} proxies the inference admin surface; swap requires is_admin on the validated user. - Structured JSON logging via structlog with trace_id + user_id ContextVars attached to every log line. - Test suite (pytest + aiosqlite + respx) covers CRUD, user scoping, streaming SSE persistence, regenerate, model proxy admin gate, and the stream proxy error path. 16/16 passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 11:49:51 -07:00
Manmohan Sharma	4b4aca642a	feat(auth): OAuth2 + JWT auth service with Alembic migrations (#5 #7) - Alembic async migrations: users, conversations, messages, is_favorited - FastAPI auth service: Google + GitHub OAuth, RS256 JWT, refresh cookie - /auth/me, /auth/refresh, /auth/validate (service-to-service) - rate limiting 10/min on OAuth routes, CORS locked to FRONTEND_URL Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 11:47:00 -07:00
Manmohan Sharma	634be4080b	feat(frontend): Next.js 14 frontend service for samosaChaat (#2) Build services/frontend/ replacing the legacy nanochat/ui.html single-file UI. Landing, login, and chat pages ported with full design system: Devanagari + Great Vibes hero, samosa/chai/toran SVG animations, gold/cream palette. - App Router pages: / (hero + floating illustrations), /login (split-screen OAuth with mandala motif), /chat (260px collapsible sidebar, suggestion chips, markdown + code-copy, auto-expanding input, slash commands) - SSE streaming via useSSE hook and /api/chat/stream BFF route (proxies to CHAT_API_URL when set, falls back to mock echo for local dev) - NextAuth.js v5 with Google + GitHub providers; middleware gates /chat/* - Zustand store with localStorage persistence for conversations/settings - Tailwind theme carries all ui.html tokens + keyframes (pendulum, float, wobble, steamFloat, steamType); SVG assets componentized under components/svg - Multi-stage node:20-alpine Dockerfile with Next standalone output Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 11:26:57 -07:00
Manmohan Sharma	577771b890	extract standalone inference service	2026-04-16 11:19:18 -07:00
Manmohan Sharma	957f66181d	scaffold monorepo platform layout	2026-04-16 11:06:29 -07:00

12 Commits
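
Commit 8153a4fadf mentions an in-process TTL cache (5 min, 1024 entries) in front of the auth service /auth/validate call. The repository's actual implementation is not shown in this log, so the following is only a minimal sketch of how such a cache could look. The class name TTLCache, its method names, the injectable clock, and the oldest-first eviction policy are all assumptions made for illustration; only the 5-minute TTL and the 1024-entry bound come from the commit message.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical bounded cache whose entries expire after a fixed TTL."""

    def __init__(self, maxsize=1024, ttl_seconds=300.0, clock=time.monotonic):
        # Defaults mirror the "5 min, 1024 entries" figures from the commit.
        self._maxsize = maxsize
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None if the key is absent or expired."""
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            # Drop stale entries lazily, at lookup time.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        """Store a value and evict the oldest-written entries if over capacity."""
        self._entries.pop(key, None)  # re-inserting moves the key to the end
        self._entries[key] = (self._clock() + self._ttl, value)
        while len(self._entries) > self._maxsize:
            # Assumed policy: evict the entry that was written longest ago.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

Under these assumptions, an auth guard would call get() with the presented token first and fall back to the /auth/validate request only on a miss, storing the result with set(). Expired entries are removed lazily, so a bounded size is what keeps memory in check between lookups.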