mirror of
https://github.com/karpathy/nanochat.git
synced 2026-05-08 16:59:59 +00:00
Inference Service
Standalone FastAPI microservice for nanochat model serving.
Endpoints
- POST /generate streams model output as SSE
- GET /models lists registered and loaded weights
- POST /models/swap drains workers and hot-swaps the active weights
- GET /health reports readiness
- GET /stats reports worker pool state
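Since POST /generate streams its output as SSE, a client has to reassemble events from the response lines. The sketch below is a minimal standard-library client, not the service's own code. The helper names `iter_sse_data` and `stream_generate` are hypothetical, and the request body schema and any authentication header are not documented here, so the caller supplies them.

```python
import json
import urllib.request
from typing import Dict, Iterable, Iterator, Optional


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event found in `lines`."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space after the colon
        if field == "data":
            buffer.append(value)
    # An event with no terminating blank line is discarded, as in the SSE spec.


def stream_generate(
    base_url: str,
    payload: dict,
    headers: Optional[Dict[str, str]] = None,
) -> Iterator[str]:
    """POST `payload` to /generate and yield each SSE data payload.

    Assumption: the payload fields and any auth header are defined by the
    service; this README does not list them, so they are passed in as-is.
    """
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
            **(headers or {}),
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The response object iterates line by line, yielding bytes.
        text_lines = (chunk.decode("utf-8") for chunk in response)
        yield from iter_sse_data(text_lines)
```

With the service running on port 8003 as in the local run command, `stream_generate("http://localhost:8003", payload)` would yield one string per event; what each string contains (plain text, JSON, an end marker) depends on the service.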
Environment
- MODEL_STORAGE_PATH
- DEFAULT_MODEL_TAG
- HF_TOKEN
- INTERNAL_API_KEY
- NANOCHAT_DTYPE
- NUM_WORKERS
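One way to read these variables at startup is to collect them into a single typed object. This is only a sketch under stated assumptions: the `InferenceSettings` class, the `load_settings` helper, and every default value below are hypothetical and are not taken from the service's source. Only the variable names come from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the environment variables listed above."""

    model_storage_path: str
    default_model_tag: Optional[str]
    hf_token: Optional[str]
    internal_api_key: Optional[str]
    nanochat_dtype: Optional[str]
    num_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> InferenceSettings:
    """Build settings from a mapping of environment variables.

    The defaults used here ("./models", 1 worker) are assumptions made for
    illustration; the real service may use different ones or require the
    variables to be set.
    """
    num_workers = int(env.get("NUM_WORKERS", "1"))
    if num_workers < 1:
        raise ValueError("NUM_WORKERS must be a positive integer")
    return InferenceSettings(
        model_storage_path=env.get("MODEL_STORAGE_PATH", "./models"),
        default_model_tag=env.get("DEFAULT_MODEL_TAG"),
        hf_token=env.get("HF_TOKEN"),
        internal_api_key=env.get("INTERNAL_API_KEY"),
        nanochat_dtype=env.get("NANOCHAT_DTYPE"),
        num_workers=num_workers,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"NUM_WORKERS": "4"})`.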
Run locally with:
uv run --project services/inference uvicorn main:app --app-dir services/inference/src --reload --port 8003