Mirror of https://github.com/karpathy/nanochat.git (synced 2026-05-27 18:18:07 +00:00)
Integrates DeepSeek's Engram (N-gram hash lookup + context-aware gating + depthwise causal conv) as an optional module behind the `--engram` CLI flag. Placed at two layers per the paper's ablation findings (layer 1 and `n_layer//2-1`). Coexists with the existing Value Embeddings; disabled by default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

| Name |
|---|
| test_attention_fallback.py |
| test_engine.py |
| test_engram.py |
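
The commit message names two ideas that a small example can make concrete: the layer placement (layer 1 and `n_layer//2-1`) and the N-gram hash lookup. The following is only a rough sketch under stated assumptions, not the code in this repository: the function names `engram_layer_indices` and `ngram_hash_ids`, the rolling hash, the `table_size` and `base` values, and the 0-based reading of "layer 1" are all hypothetical and chosen for illustration.

```python
def engram_layer_indices(n_layer):
    """Return the two placements named in the commit message.

    Assumption: indices are 0-based and duplicates collapse
    (for n_layer == 4, n_layer // 2 - 1 is also 1).
    """
    return sorted({1, n_layer // 2 - 1})


def ngram_hash_ids(tokens, n=2, table_size=1024, base=1000003):
    """Map each position to a hash-table slot for the n-gram ending there.

    Hypothetical rolling hash, for illustration only. The window looks
    only at the current and earlier tokens, so the lookup is causal.
    """
    ids = []
    for t in range(len(tokens)):
        window = tokens[max(0, t - n + 1): t + 1]
        h = 0
        for tok in window:
            h = (h * base + tok + 1) % table_size
        ids.append(h)
    return ids


if __name__ == "__main__":
    print(engram_layer_indices(12))
    print(ngram_hash_ids([5, 6, 7, 5, 6]))
```

In a model, the returned slot ids would index an embedding table whose output is gated and mixed into the residual stream; that part is omitted here because the source gives no details beyond the commit message.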