Mirror of https://github.com/karpathy/nanochat.git (synced 2026-05-26 01:28:01 +00:00)
Integrates DeepSeek's Engram (N-gram hash lookup + context-aware gating + depthwise causal conv) as an optional module behind the --engram CLI flag. Placed at two layers per the paper's ablation findings (layer 1 and n_layer//2-1). Coexists with the existing Value Embeddings; disabled by default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

| Name |
|---|
| base_eval.py |
| base_train.py |
| chat_cli.py |
| chat_eval.py |
| chat_rl.py |
| chat_sft.py |
| chat_web.py |
| tok_eval.py |
| tok_train.py |
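
The commit message above states two concrete details: the module sits at layer 1 and at `n_layer//2-1`, and it is off unless `--engram` is passed. The sketch below illustrates only those two points. It is not the repository's code: `engram_layer_indices`, `build_parser`, and the `--n-layer` option are hypothetical names introduced for illustration, and the handling of very shallow models (deduplicating and dropping out-of-range indices) is an assumption.

```python
import argparse


def engram_layer_indices(n_layer: int) -> list:
    """Hypothetical helper: layer indices that would host an Engram module."""
    # Placement taken from the commit message: layer 1 and n_layer//2 - 1.
    # A set removes the duplicate when both expressions give the same index.
    candidates = {1, n_layer // 2 - 1}
    # Assumption: indices outside [0, n_layer) are silently skipped.
    return sorted(i for i in candidates if 0 <= i < n_layer)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing an opt-in flag that defaults to off."""
    parser = argparse.ArgumentParser()
    # store_true gives a default of False, matching "disabled by default".
    parser.add_argument("--engram", action="store_true")
    # Hypothetical depth option, used here only to drive the example.
    parser.add_argument("--n-layer", type=int, default=12)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    layers = engram_layer_indices(args.n_layer) if args.engram else []
    print(f"engram enabled: {args.engram}, layers: {layers}")
```

Under these assumptions, a 12-layer model with `--engram` set would get the module at layers 1 and 5, while a 4-layer model would get it only at layer 1 because both expressions resolve to the same index.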