Joined on 2024-05-31
tacit synced commits to refs/pull/93/merge at tacit/nanochat from mirror 2026-01-12 22:43:45 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/429/merge at tacit/nanochat from mirror 2026-01-12 22:43:45 +00:00
48aaa4b3df Download the minimum number of parquet shards to train the tokenizer reproducibly
Compare 2 commits »
tacit synced commits to refs/pull/414/merge at tacit/nanochat from mirror 2026-01-12 22:43:44 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/407/merge at tacit/nanochat from mirror 2026-01-12 22:43:44 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/425/merge at tacit/nanochat from mirror 2026-01-12 22:43:44 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/412/merge at tacit/nanochat from mirror 2026-01-12 22:43:44 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/409/merge at tacit/nanochat from mirror 2026-01-12 22:43:44 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/151/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/324/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/141/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/328/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
b33e394528 oops actually make SSSL the default window pattern
Compare 19 commits »
tacit synced commits to refs/pull/370/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
Compare 2 commits »
tacit synced commits to refs/pull/396/merge at tacit/nanochat from mirror 2026-01-12 22:43:43 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/433/merge at tacit/nanochat from mirror 2026-01-12 14:33:50 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/59/merge at tacit/nanochat from mirror 2026-01-12 14:33:50 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/312/merge at tacit/nanochat from mirror 2026-01-12 14:33:49 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
b33e394528 oops actually make SSSL the default window pattern
Compare 22 commits »
tacit synced commits to refs/pull/399/merge at tacit/nanochat from mirror 2026-01-12 14:33:49 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
b33e394528 oops actually make SSSL the default window pattern
Compare 10 commits »
tacit synced commits to refs/pull/429/merge at tacit/nanochat from mirror 2026-01-12 14:33:49 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
Compare 4 commits »
tacit synced commits to refs/pull/400/merge at tacit/nanochat from mirror 2026-01-12 14:33:49 +00:00
4610a838a1 record negative result on MTP
Compare 2 commits »
tacit synced commits to refs/pull/311/merge at tacit/nanochat from mirror 2026-01-12 14:33:48 +00:00
4610a838a1 record negative result on MTP
21608ec51e allow base_loss to report the loss of any arbitrary huggingface model similar to base_eval. had to change dataloader to be a lot better and just take tokenizer, not load the nanochat one. much better this way anyway
aa95fb2e03 make miniseries more generic and easier to run and less hard coded
b33e394528 oops actually make SSSL the default window pattern
Compare 22 commits »