nanochat/tests
2026-02-01 18:36:07 +07:00
test_attention_fallback.py | Fix SDPA KV-cache for per-row cache_seqlens | 2026-02-01 18:36:07 +07:00
test_engine.py | update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption | 2026-01-17 12:27:30 -08:00