nanochat/scripts
2026-01-16 17:37:51 +00:00
base_eval.py bugfix 2025-12-26 19:02:12 +08:00
base_loss.py fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way 2026-01-13 22:45:27 +00:00
base_train.py implement flash attention 3 fallback to pytorch sdpa by touching as few lines of code as possible in main files and keeping all implementation to a single file. add tests. add helpful warning messages for the user. 2026-01-16 17:37:51 +00:00
chat_cli.py
chat_eval.py Fix args in readme (#438) 2026-01-15 16:26:38 -08:00
chat_rl.py typo in comments: change "GAPO" to "DAPO" 2026-01-15 22:03:42 -08:00
chat_sft.py fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way 2026-01-13 22:45:27 +00:00
chat_web.py
mid_train.py fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way 2026-01-13 22:45:27 +00:00
tok_eval.py
tok_train.py fix buggy midtrain and update all kwargs to be idiomatic. that is, argparse uses dashes variables use underscores. the underscores are just a remnant of the previous Configurator object. This is the right way 2026-01-13 22:45:27 +00:00