nanochat/scripts
2025-10-15 14:58:06 -07:00
base_eval.py initial commit 2025-10-13 06:49:24 -07:00
base_loss.py initial commit 2025-10-13 06:49:24 -07:00
base_train.py Refactor constants in training scripts and engine to improve configurability. Replace hardcoded values with constants for KV cache growth, rotary cache multiplier, and learning rate parameters. This enhances maintainability and allows for easier adjustments in future iterations. 2025-10-15 14:58:06 -07:00
chat_cli.py initial commit 2025-10-13 06:49:24 -07:00
chat_eval.py initial commit 2025-10-13 06:49:24 -07:00
chat_rl.py initial commit 2025-10-13 06:49:24 -07:00
chat_sft.py Don't evaluate the sampling evals during SFT; they are too slow. Keep the multiple-choice evals. Delete unused imports. 2025-10-15 16:42:23 +00:00
chat_web.py Fix a subtle issue in token decoding in cases where multiple UTF-8 bytes need to be emitted as a single codepoint. Examples are emoji or non-English text. Basically, we have to accumulate token sequences/text and only emit when we get full codepoints. 2025-10-15 20:29:54 +00:00
mid_train.py Refactor constants in training scripts and engine to improve configurability. Replace hardcoded values with constants for KV cache growth, rotary cache multiplier, and learning rate parameters. This enhances maintainability and allows for easier adjustments in future iterations. 2025-10-15 14:58:06 -07:00
tok_eval.py initial commit 2025-10-13 06:49:24 -07:00
tok_train.py initial commit 2025-10-13 06:49:24 -07:00
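
Note on the chat_web.py commit message: the idea of accumulating bytes and emitting text only once full codepoints are available can be sketched as follows. This is a minimal illustration using Python's standard incremental UTF-8 decoder, not the code in chat_web.py. The function name `stream_text` and the sample byte chunks are hypothetical, and it assumes each token maps to a raw byte chunk.

```python
import codecs

def stream_text(byte_chunks):
    """Yield text only once complete UTF-8 code points are available.

    Hypothetical helper: assumes each streamed token arrives as raw bytes.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in byte_chunks:
        # The incremental decoder buffers an incomplete trailing sequence
        # and returns "" until the remaining bytes arrive.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush whatever is left at end of stream (replaced if still incomplete).
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail

# Hypothetical token stream: the emoji's four bytes are split across two chunks.
emoji = "😀".encode("utf-8")
chunks = [b"hi ", emoji[:2], emoji[2:], b"!"]
pieces = list(stream_text(chunks))
```

Decoding each chunk on its own would instead produce replacement characters for the two halves of the emoji, which is the kind of symptom the commit message describes.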