From 48804bff3a487e43ee1e1533b3cfa0aa5ab0028f Mon Sep 17 00:00:00 2001
From: Andrej Karpathy
Date: Wed, 18 Feb 2026 23:45:31 +0000
Subject: [PATCH] report negative result on fineweb dataset

---
 dev/LOG.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/dev/LOG.md b/dev/LOG.md
index c0d35e4..6ac027c 100644
--- a/dev/LOG.md
+++ b/dev/LOG.md
@@ -4,6 +4,16 @@ A running summary documenting some experiments and findings. Started ~Jan 7 2026
 
 ---
 
+## 2026-02-17: Pretraining Data: FineWeb (negative)
+
+Tried vanilla fineweb instead of fineweb-edu dataset. Significantly, shockingly worse results:
+
+- d26 (GPT-2): CORE 0.2602 → 0.2241
+
+This is the fifth failed attempt to beat pure FineWeb-EDU on CORE score.
+
+---
+
 ## 2026-02-17: Pretraining Data Mixture Experiment (negative)
 
 Tried [hynky/finepdfs_50BT-dclm_30BT-fineweb_edu_20BT](https://huggingface.co/datasets/hynky/finepdfs_50BT-dclm_30BT-fineweb_edu_20BT), a mixture of FinePDFs, DCLM, and FineWeb-EDU. Slightly worse on both model sizes tested: