nanochat/tests
Junyang Chen 767df6ef61 dataloader: reuse cropped remainders to reduce token waste ~35% -> ~23%
When the BestFit-Crop algorithm crops a document to fill remaining row space,
the leftover tokens are currently discarded. This change puts the remainder
(with BOS prepended) back into the document buffer for future rows.
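
The behaviour described above can be illustrated with a small, self-contained sketch. This is not the repository's dataloader code: the function name `pack_rows`, the `BOS` value, the in-memory list buffer, and the choice to crop the shortest document when nothing fits are all assumptions made for illustration. It only shows the idea of cropping a document to fill a row and, optionally, returning the BOS-prefixed remainder to the buffer.

```python
from typing import List

BOS = 0  # hypothetical BOS token id, chosen only for this illustration


def pack_rows(docs: List[List[int]], row_len: int,
              reuse_remainder: bool) -> List[List[int]]:
    """Pack BOS-prefixed documents into full rows of exactly row_len tokens.

    Simplified sketch: the whole corpus is the buffer (no streaming), and a
    trailing row that cannot be filled is dropped. Assumes row_len >= 2.
    """
    buffer = [list(d) for d in docs]
    rows: List[List[int]] = []
    while buffer:
        row: List[int] = []
        while len(row) < row_len and buffer:
            space = row_len - len(row)
            # Best fit: the largest buffered document that fits entirely.
            fitting = [i for i, d in enumerate(buffer) if len(d) <= space]
            if fitting:
                best = max(fitting, key=lambda i: len(buffer[i]))
                row.extend(buffer.pop(best))
            else:
                # Crop branch: nothing fits, so cut a document to fill the row.
                # (Picking the shortest document is an assumption of this sketch.)
                shortest = min(range(len(buffer)), key=lambda i: len(buffer[i]))
                doc = buffer.pop(shortest)
                row.extend(doc[:space])
                if reuse_remainder:
                    # Put the leftover tokens back, with BOS prepended, so a
                    # later row can still start or continue on a BOS boundary.
                    buffer.append([BOS] + doc[space:])
        if len(row) == row_len:
            rows.append(row)
    return rows


docs = [[BOS, 1, 2, 3, 4, 5], [BOS, 6, 7]]
rows_drop = pack_rows(docs, row_len=4, reuse_remainder=False)
rows_reuse = pack_rows(docs, row_len=4, reuse_remainder=True)
print(len(rows_drop), len(rows_reuse))  # prints "1 2"
```

On this toy input, discarding the remainder yields one full row, while reusing it yields two full rows from the same source documents, and every row still begins with BOS.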

Simulation results at T=2048 with a realistic document-length distribution:
- Source token consumption reduced by ~15%
- Data efficiency improved by ~1.18x
- Estimated ~28 minutes saved on d24 speedrun (3.04h -> ~2.57h)
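
As a rough cross-check, the figures above can be related to the waste rates in the commit title (~35% -> ~23%). The sketch below is only arithmetic on the quoted numbers. It assumes that data efficiency is the ratio of useful-token fractions and that run time scales inversely with it; neither assumption is stated in the commit, and the variable names are illustrative.

```python
# Waste fractions quoted in the commit title.
waste_before = 0.35
waste_after = 0.23

# Fraction of source tokens that end up in training rows.
useful_before = 1 - waste_before
useful_after = 1 - waste_after

# Assumption: data efficiency gain is the ratio of useful fractions.
efficiency_gain = useful_after / useful_before

# Source tokens needed for the same number of training tokens.
consumption_cut = 1 - useful_before / useful_after

# Assumption: wall-clock time scales inversely with data efficiency.
baseline_hours = 3.04
projected_hours = baseline_hours / efficiency_gain
minutes_saved = (baseline_hours - projected_hours) * 60

print(f"efficiency gain: {efficiency_gain:.2f}x")
print(f"consumption cut: {consumption_cut:.1%}")
print(f"projected time:  {projected_hours:.2f}h")
print(f"minutes saved:   {minutes_saved:.0f}")
```

Under these assumptions the numbers come out close to the quoted ones: about 1.18x efficiency, a consumption cut of roughly 15 to 16%, and about 28 minutes saved.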

The change is minimal (6 lines in the crop branch) and preserves all existing
properties: BOS-aligned rows, 100% utilization, deterministic packing order.
2026-02-18 23:04:43 -08:00
test_attention_fallback.py Fix SDPA KV-cache decode to respect sliding window (#456) 2026-01-30 17:32:12 +00:00
test_dataloader_remainder.py dataloader: reuse cropped remainders to reduce token waste ~35% -> ~23% 2026-02-18 23:04:43 -08:00
test_engine.py Fix MockModel's device definition (#535) 2026-02-17 16:03:46 -08:00