Default Branch

dc54a1a307 · tried and failed at DyT · Updated 2026-05-05 03:17:21 +00:00

Branches

moe

5422d3a132 · make sure to use active params in scaling laws · Updated 2026-02-19 02:46:36 +00:00

43 behind · 4 ahead

69b1ed245e · also add base_train change example for how to swap LinearFP8 · Updated 2026-01-13 17:08:10 +00:00

159 behind · 2 ahead