mirror of
https://github.com/karpathy/nanochat.git
synced 2026-05-08 16:59:59 +00:00
Inference Service
Standalone FastAPI microservice for nanochat model serving.
Endpoints
- `POST /generate` streams model output as SSE
- `GET /models` lists registered and loaded weights
- `POST /models/swap` drains workers and hot-swaps the active weights
- `GET /health` reports readiness
- `GET /stats` reports worker pool state
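As a rough illustration of consuming the SSE stream from `POST /generate`, the sketch below parses `data:` fields out of an event stream using only the standard library. The request body shape (`{"prompt": ...}`) and the helper names `iter_sse_data` and `stream_generate` are assumptions made for this example, not the service's documented schema. Authentication is omitted because the README does not say how `INTERNAL_API_KEY` is presented.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the joined data payload of each complete SSE event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event left unterminated at end of stream is dropped.


def stream_generate(base_url: str, payload: dict) -> Iterator[str]:
    """POST a JSON payload to /generate and yield SSE data payloads.

    The payload schema is hypothetical; check the service's request model.
    """
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The response object iterates over raw lines as bytes.
        yield from iter_sse_data(response)


# Example call against a locally running service (not executed here):
#   for chunk in stream_generate("http://localhost:8003", {"prompt": "Hi"}):
#       print(chunk)
```

Keeping the parser separate from the HTTP call means it can be exercised against canned bytes without a running server.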
Environment
- `MODEL_STORAGE_PATH`
- `DEFAULT_MODEL_TAG`
- `HF_TOKEN`
- `INTERNAL_API_KEY`
- `NANOCHAT_DTYPE`
- `NUM_WORKERS`
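One way the variables above might be gathered into a typed settings object is sketched below. The README does not state which variables are required or what their defaults are, so everything here is an assumption: each value is treated as optional, `NUM_WORKERS` falls back to 1, and the names `InferenceSettings` and `load_settings` are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the service's environment variables."""

    model_storage_path: Optional[str]
    default_model_tag: Optional[str]
    hf_token: Optional[str]
    internal_api_key: Optional[str]
    nanochat_dtype: Optional[str]
    num_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> InferenceSettings:
    """Build settings from an environment mapping.

    The fallback of one worker is an assumption, not a documented default.
    """
    raw_workers = env.get("NUM_WORKERS", "1")
    try:
        num_workers = int(raw_workers)
    except ValueError as exc:
        raise ValueError(
            f"NUM_WORKERS must be an integer, got {raw_workers!r}"
        ) from exc
    if num_workers < 1:
        raise ValueError("NUM_WORKERS must be at least 1")

    return InferenceSettings(
        model_storage_path=env.get("MODEL_STORAGE_PATH"),
        default_model_tag=env.get("DEFAULT_MODEL_TAG"),
        hf_token=env.get("HF_TOKEN"),
        internal_api_key=env.get("INTERNAL_API_KEY"),
        nanochat_dtype=env.get("NANOCHAT_DTYPE"),
        num_workers=num_workers,
    )
```

Passing the environment as a mapping, rather than reading `os.environ` inside the function body, makes the loader easy to test with a plain dictionary.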
Run locally with:
uv run --project services/inference uvicorn main:app --app-dir services/inference/src --reload --port 8003
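After starting the server, a script may want to wait until `GET /health` responds before sending traffic. The sketch below polls with the standard library. It assumes that readiness is signalled by an HTTP 200 status, which the README does not confirm, and the helper names `probe_health` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all count as not ready.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example against the local port used in the command above (not executed here):
#   ready = wait_until_ready(lambda: probe_health("http://localhost:8003/health"))
```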