Vilhelm Toivonen | 8deb27996f | Merge 6d51049077 into 190d9515d0 | 2025-10-15 13:36:07 -04:00
Andrej Karpathy | 190d9515d0 | dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports | 2025-10-15 16:42:23 +00:00
Andrej Karpathy | b8076dd367 | fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation | 2025-10-15 16:35:04 +00:00
Vilhelm Toivonen | 6d51049077 | introduce lr schedulers and tests | 2025-10-13 22:13:37 +03:00
karpathy | 3a5e0bc50b | initial commit | 2025-10-13 06:49:24 -07:00