nanochat/tests
William Thurston 5f13389568 Implement reset_parameters method in MoEFeedForward and update GPT to utilize it
- Added a reset_parameters method in MoEFeedForward to reinitialize expert parameters.
- Updated the GPT class to call reset_parameters for MoEFeedForward instances during weight initialization.
- Added a test in test_moe.py that validates gradient updates for MoE experts during training.
2025-11-13 17:09:11 -08:00
test_moe.py Implement reset_parameters method in MoEFeedForward and update GPT to utilize it 2025-11-13 17:09:11 -08:00
test_rustbpe.py initial commit 2025-10-13 06:49:24 -07:00