nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-05-08 16:59:59 +00:00

Manmohan Sharma 40586713bd 2026-03-23 10:04:33 -04:00

fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of hardcoded logic

The KV cache was hardcoded to float32 on non-CUDA devices, but the model
weights are loaded in bfloat16 via NANOCHAT_DTYPE env var. This caused a
RuntimeError in scaled_dot_product_attention. Now uses COMPUTE_DTYPE from
common.py which respects the env var.

Also broadened CI/CD path triggers to nanochat/**.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
workflows	fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of hardcoded logic	2026-03-23 10:04:33 -04:00
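
The dtype mismatch described in the commit message above can be illustrated with a minimal sketch. It is not the project's code: dtypes are modelled as plain strings so it runs without PyTorch, and the function names, the set of supported dtypes, the float32 default, and the CUDA branch of the old behaviour are all assumptions made for illustration. Only the NANOCHAT_DTYPE variable, the float32-on-non-CUDA behaviour, and the switch to a shared compute dtype come from the commit message.

```python
import os

# Hypothetical stand-in: dtypes are plain strings here, not torch dtypes.
SUPPORTED_DTYPES = {"float32", "bfloat16", "float16"}


def resolve_compute_dtype(env=None, default="float32"):
    """Read the compute dtype from NANOCHAT_DTYPE (default is an assumption)."""
    env = os.environ if env is None else env
    name = env.get("NANOCHAT_DTYPE", default).strip().lower()
    if name not in SUPPORTED_DTYPES:
        raise ValueError(f"unsupported NANOCHAT_DTYPE: {name!r}")
    return name


def kv_cache_dtype_before(device_type):
    """Old behaviour per the commit message: float32 on any non-CUDA device."""
    # The CUDA branch is assumed; the message only describes the non-CUDA case.
    return "bfloat16" if device_type == "cuda" else "float32"


def kv_cache_dtype_after(env=None):
    """New behaviour per the commit message: follow the shared compute dtype."""
    return resolve_compute_dtype(env)


# Weights loaded as bfloat16 on a CPU machine via the env var.
env = {"NANOCHAT_DTYPE": "bfloat16"}
weights_dtype = resolve_compute_dtype(env)

# Before the fix the cache dtype ignores the env var and can disagree.
print(kv_cache_dtype_before("cpu") == weights_dtype)
# After the fix both values come from the same source, so they agree.
print(kv_cache_dtype_after(env) == weights_dtype)
```

Deriving both the weights dtype and the cache dtype from one resolver means the two cannot drift apart, which is the property the commit message attributes to using COMPUTE_DTYPE.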
