nanochat/scripts
2026-01-05 18:57:46 +00:00
base_eval.py bugfix 2025-12-26 19:02:12 +08:00
base_loss.py delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
base_train.py tune hyperparameters based on overnight sweeps. warmdown_ratio is the biggest free win, increasing 0.2 -> 0.4, and embedding lr can be larger, bumping 0.2 -> 0.3 2026-01-05 18:57:46 +00:00
chat_cli.py upgrading all other files to be able to use cpu/mps as well as cuda. various other minor changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming 2025-10-20 10:15:17 -07:00
chat_eval.py fix typos 2025-11-14 11:20:25 +01:00
chat_rl.py delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
chat_sft.py delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
chat_web.py ensure consistency of quotes within each statement 2025-11-03 21:52:02 +01:00
mid_train.py delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
tok_eval.py initial commit 2025-10-13 06:49:24 -07:00
tok_train.py small change to doc string at top of tok_train.py (#402) 2025-12-31 12:57:26 -08:00
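
Several of the commit messages above mention moving to argparse and raising warmdown_ratio from 0.2 to 0.4. The following is a minimal sketch of how such a flag might drive a learning-rate schedule. It is not taken from base_train.py: the flag names simply mirror the commit messages, and the schedule shape (constant, then linear decay to zero over the final warmdown_ratio fraction of steps) and the helper names build_parser and lr_multiplier are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags: names and defaults echo the commit messages,
    # not the actual argument list of any script in this directory.
    parser = argparse.ArgumentParser(description="warmdown schedule sketch")
    parser.add_argument("--num_iterations", type=int, default=1000)
    parser.add_argument("--warmdown_ratio", type=float, default=0.4)
    parser.add_argument("--embedding_lr", type=float, default=0.3)
    return parser


def lr_multiplier(step: int, num_iterations: int, warmdown_ratio: float) -> float:
    """Return 1.0 until warmdown begins, then decay linearly to 0.0.

    Assumed schedule shape: the last `warmdown_ratio` fraction of the
    run is spent ramping the learning rate down.
    """
    warmdown_steps = round(warmdown_ratio * num_iterations)
    if warmdown_steps == 0 or step < num_iterations - warmdown_steps:
        return 1.0
    return max(0.0, (num_iterations - step) / warmdown_steps)


def effective_lr(step: int, args: argparse.Namespace) -> float:
    # Scale the base embedding learning rate by the schedule multiplier.
    return args.embedding_lr * lr_multiplier(
        step, args.num_iterations, args.warmdown_ratio
    )
```

Under these assumptions, `build_parser().parse_args(["--warmdown_ratio", "0.2"])` would reproduce the older, shorter warmdown, and calling `effective_lr(step, args)` at each step gives the scaled embedding learning rate.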