nanochat/scripts
2025-11-05 21:08:30 +00:00
base_eval.py add explicit UTF-8 encoding 2025-11-03 21:27:12 +01:00
base_loss.py many small tweaks. base, eval, core work now I think 2025-10-16 15:46:18 -07:00
base_train.py grad clip logging and printing and cosmetics 2025-11-05 21:08:30 +00:00
chat_cli.py upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming 2025-10-20 10:15:17 -07:00
chat_eval.py typo fixes in scripts 2025-10-28 20:17:31 +01:00
chat_rl.py typo fixes in scripts 2025-10-28 20:17:31 +01:00
chat_sft.py add the SpellingBee task so that nanochat can count r in strawberry etc. along the way we had to add a bunch of new functionality, e.g. extend the calculator to support the count function of Python. possibly the current TaskMixture uses way too many synthetic examples of SpellingBee because the eval gives us exactly 100% performance on spelling. We can tune this later to reclaim some wall clock time here I think 2025-10-24 14:02:48 +00:00
chat_web.py ensure consistency of quotes within each statement 2025-11-03 21:52:02 +01:00
mid_train.py Fix tok/sec metrics for base_train and mid_train when gradient accumulation is not 1 2025-10-26 01:43:49 -05:00
tok_eval.py initial commit 2025-10-13 06:49:24 -07:00
tok_train.py initial commit 2025-10-13 06:49:24 -07:00