Commit Graph

2 Commits

Author SHA1 Message Date
Manmohan Sharma
40586713bd
fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of a hardcoded dtype
The KV cache was hardcoded to float32 on non-CUDA devices, but the model
weights are loaded in bfloat16 via the NANOCHAT_DTYPE env var. Queries were
therefore bfloat16 while the cached keys and values were float32, which raised
a RuntimeError in scaled_dot_product_attention. The cache now uses
COMPUTE_DTYPE from common.py, which respects the env var.
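
A minimal sketch of the idea, not the actual contents of common.py: resolve one dtype from the environment and use it for both the weights and the KV cache. The function names, the set of accepted values, and the per-device defaults are assumptions made for illustration, and dtype names are plain strings here where the real code would presumably use torch dtypes.

```python
import os

# Hypothetical set of accepted values; real code would map these to torch dtypes.
SUPPORTED_DTYPES = {"float32", "bfloat16", "float16"}


def resolve_compute_dtype(device_type, env=None):
    """Return a single dtype name to share between weights and KV cache."""
    env = os.environ if env is None else env
    override = env.get("NANOCHAT_DTYPE")
    if override:
        name = override.strip().lower()
        if name not in SUPPORTED_DTYPES:
            raise ValueError(f"unsupported NANOCHAT_DTYPE: {override!r}")
        return name
    # Assumed defaults when the env var is unset.
    return "bfloat16" if device_type == "cuda" else "float32"


def old_kv_cache_dtype(device_type):
    """Behavior described in this commit: chosen per device, env var ignored.

    The CUDA branch is an assumption; the commit only states the CPU case.
    """
    return "bfloat16" if device_type == "cuda" else "float32"


if __name__ == "__main__":
    env = {"NANOCHAT_DTYPE": "bfloat16"}
    weights = resolve_compute_dtype("cpu", env)
    print("weights:", weights)
    print("old cache:", old_kv_cache_dtype("cpu"))  # differs from weights
    print("new cache:", resolve_compute_dtype("cpu", env))  # matches weights
```

Deriving the cache dtype from the same resolver as the weights means the two cannot drift apart when the env var changes.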

Also broadened CI/CD path triggers to nanochat/**.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:04:33 -04:00
Manmohan Sharma
c3f683f3e3
add CI/CD auto-deploy workflow for samosaChaat
Deploys to EC2 on push to master when UI/server files change.
Uses appleboy/ssh-action with stored secrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:00:25 -04:00