nanochat/scripts
Unsal Gokdag 4f79e750e7 CORE eval: batched forwarding by default, per-example mode for verification
Switch the cached eval path to batched=True (forwarding full collated batches)
      for a ~5-7x speedup over sequential per-example evaluation. Add a per-example
      forwarding mode (batched=False) that trims collation padding to recover
      exact per-example tensor shapes, guaranteeing results identical to the
      old sequential path. The bench script uses batched=True for speed sweeps and
      per-example mode for correctness verification against the old path.
2026-02-13 08:42:45 +00:00
base_eval.py CORE eval: batched forwarding by default, per-example mode for verification 2026-02-13 08:42:45 +00:00
base_train.py speed up CORE metric evaluation: batched GPU forward passes, threaded CPU prep, cross-call caching. The first eval pipelines tokenization on a background thread while the GPU processes the previous batch; subsequent evals skip tokenization and collation entirely, leaving only the GPU forward passes. Also adds a benchmark script to sweep the batch_size and queue_size hyperparameters. 2026-02-12 18:13:56 +01:00
bench_core_eval.py CORE eval: batched forwarding by default, per-example mode for verification 2026-02-13 08:42:45 +00:00
chat_cli.py remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
chat_eval.py remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
chat_rl.py remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
chat_sft.py fix bug in chat_sft, the attention window must be preserved sigh 2026-02-01 20:58:44 +00:00
chat_web.py remove leftover mid references (#491) 2026-02-02 08:33:46 -08:00
tok_eval.py initial commit 2025-10-13 06:49:24 -07:00
tok_train.py quick fix to not OOM main speedrun script 2026-01-26 22:31:42 +00:00
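The producer/consumer pipelining described in the base_train.py entry above can be sketched with a background thread and a bounded queue. This is a sketch under stated assumptions, not the repo's actual code: `tokenize` and `forward` are stand-ins for the real CPU prep and GPU forward steps, and `queue_size` bounds how far CPU prep may run ahead.

```python
import queue
import threading

def tokenize(text):
    """Stand-in CPU prep step; the real code tokenizes and collates."""
    return [ord(c) for c in text]

def forward(batch):
    """Stand-in for a GPU forward pass."""
    return sum(batch)

def pipelined_eval(texts, queue_size=2):
    """Prepare batch i+1 on a background thread while batch i is consumed."""
    q = queue.Queue(maxsize=queue_size)

    def producer():
        for t in texts:
            q.put(tokenize(t))  # blocks once queue_size batches are ready
        q.put(None)             # sentinel: no more work

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (batch := q.get()) is not None:
        results.append(forward(batch))
    return results

print(pipelined_eval(["ab", "cd"]))  # → [195, 199]
```

The bounded queue is the key design choice: it overlaps CPU prep with GPU compute while capping memory held by prepared-but-unconsumed batches, which is why batch_size and queue_size are the two hyperparameters the benchmark script sweeps.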