nanochat/.github
Manmohan Sharma 40586713bd
fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of hardcoded logic
The KV cache was hardcoded to float32 on non-CUDA devices, but the model
weights are loaded in bfloat16 via NANOCHAT_DTYPE env var. This caused a
RuntimeError in scaled_dot_product_attention. Now uses COMPUTE_DTYPE from
common.py which respects the env var.

Also broadened CI/CD path triggers to nanochat/**.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:04:33 -04:00
..
workflows fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of hardcoded logic 2026-03-23 10:04:33 -04:00