nanochat/modal
Manmohan Sharma e5b4db1eee
feat(modal): add Modal GPU inference endpoint for samosaChaat
- modal/serve.py: FastAPI endpoint running on a Modal T4 GPU; streams tokens as Server-Sent Events (SSE)
- modal/_model.py: Standalone GPT model (auto-detects the architecture from the checkpoint)
- modal/_tokenizer.py: Standalone BPE tokenizer (tiktoken-based)
- Downloads the nanochat-students/base-d20 weights from Hugging Face
- Deployed at: https://manmohan659--samosachaat-inference-inference-generate.modal.run

Deploy: modal deploy modal/serve.py
Dev:    modal serve modal/serve.py
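
The SSE token streaming mentioned for modal/serve.py can be illustrated with a minimal, dependency-free sketch. It is not taken from serve.py: the function name sse_events, the {"token": ...} payload shape, and the [DONE] sentinel are all assumptions made for illustration. It only shows the SSE wire format, in which each event is a "data:" line followed by a blank line.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Format generated tokens as Server-Sent Events (hypothetical helper).

    Each token is JSON-encoded so that any newline inside a token is
    escaped and cannot break the SSE framing.
    """
    for token in tokens:
        # One SSE event: a "data:" field terminated by a blank line.
        yield f"data: {json.dumps({'token': token})}\n\n"
    # Assumed end-of-stream marker; the real endpoint may signal this differently.
    yield "data: [DONE]\n\n"


if __name__ == "__main__":
    for event in sse_events(["Hello", ",", " world"]):
        print(event, end="")
```

In a FastAPI app, a generator like this would typically be wrapped in a StreamingResponse with media_type="text/event-stream"; whether serve.py does exactly that is not stated in the commit message.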

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:32:09 -07:00
_model.py feat(modal): add Modal GPU inference endpoint for samosaChaat 2026-04-16 14:32:09 -07:00
_tokenizer.py feat(modal): add Modal GPU inference endpoint for samosaChaat 2026-04-16 14:32:09 -07:00
serve.py feat(modal): add Modal GPU inference endpoint for samosaChaat 2026-04-16 14:32:09 -07:00