Mirror of https://github.com/karpathy/nanochat.git, synced 2026-04-03 22:25:27 +00:00

When the BestFit-Crop algorithm crops a document to fill remaining row space, the leftover tokens are currently discarded. This change puts the remainder (with BOS prepended) back into the document buffer for future rows.

Simulation results at T=2048 with realistic document length distribution:

- Source token consumption reduced by ~15%
- Data efficiency improved by ~1.18x
- Estimated ~28 minutes saved on d24 speedrun (3.04h -> ~2.57h)

The change is minimal (6 lines in the crop branch) and preserves all existing properties: BOS-aligned rows, 100% utilization, deterministic packing order.

| Name | | |
|---|---|---|
| test_attention_fallback.py | ||
| test_dataloader_remainder.py | ||
| test_engine.py | ||
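
The remainder handling described in the commit message can be illustrated with a small, self-contained sketch. This is not the repository's dataloader code: `pack_rows`, `keep_remainder`, and the `BOS = 0` token id are hypothetical names chosen for illustration, and the crop rule used here (crop the shortest buffered document when nothing fits) is an assumption. The sketch only shows the idea of pushing the cropped remainder, with BOS prepended, back into the buffer.

```python
from typing import List

BOS = 0  # hypothetical BOS token id, for illustration only


def pack_rows(docs: List[List[int]], row_len: int,
              keep_remainder: bool = True) -> List[List[int]]:
    """Pack BOS-prefixed documents into rows of exactly row_len tokens.

    Assumed policy: place the largest document that fits in the remaining
    space; if none fits, crop the shortest buffered document to fill the row.
    """
    if row_len < 2:
        raise ValueError("row_len must be at least 2")
    buffer = [list(d) for d in docs]
    rows: List[List[int]] = []
    while buffer:
        row: List[int] = []
        while len(row) < row_len:
            if not buffer:
                return rows  # out of data: drop the incomplete final row
            space = row_len - len(row)
            fitting = [i for i, d in enumerate(buffer) if len(d) <= space]
            if fitting:
                # Best fit: the largest document that fits entirely.
                best = max(fitting, key=lambda i: len(buffer[i]))
                row.extend(buffer.pop(best))
            else:
                # Crop branch: nothing fits, so cut a document to fill the row.
                idx = min(range(len(buffer)), key=lambda i: len(buffer[i]))
                doc = buffer.pop(idx)
                row.extend(doc[:space])
                remainder = doc[space:]
                if keep_remainder and remainder:
                    # Re-queue the leftover tokens as a new BOS-prefixed
                    # document instead of discarding them.
                    buffer.append([BOS] + remainder)
        rows.append(row)
    return rows


docs = [[BOS, 1, 2, 3, 4, 5], [BOS, 6, 7]]
with_remainder = pack_rows(docs, row_len=4, keep_remainder=True)
without_remainder = pack_rows(docs, row_len=4, keep_remainder=False)
print(len(with_remainder), len(without_remainder))
```

On this toy input, re-queuing the remainder yields two full rows from the same source documents instead of one, while every row still starts with BOS and is completely filled. The figures quoted in the commit message come from the author's simulation, not from this sketch.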