nanochat/tests
2026-02-01 22:31:03 -06:00
test_attention_fallback.py Merge 181e7f1c15 into 230d6cf6c6 2026-02-01 22:31:03 -06:00
test_engine.py update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption 2026-01-17 12:27:30 -08:00