Mirror of https://github.com/karpathy/nanochat.git (synced 2026-04-06 07:35:32 +00:00)
Tokenizer training
timestamp: 2026-02-01 14:40:20
- max_chars: 2,000,000,000
- doc_cap: 10,000
- vocab_size: 32,768
- train_time: 87.9820
- num_special_tokens: 9
- token_bytes_min: 1
- token_bytes_max: 19
- token_bytes_mean: 6.6029
- token_bytes_std: 2.8250
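
The token_bytes_* figures above summarize how many bytes each vocabulary token spans. A minimal sketch of how such a summary could be computed is shown below. The function name `token_byte_stats` and the toy vocabulary are hypothetical, and the choice of population standard deviation is an assumption; the report does not say which estimator the training script uses, or whether special tokens are included.

```python
import statistics

def token_byte_stats(tokens):
    """Summarize the byte lengths of a list of tokens (each a bytes object).

    Hypothetical helper for illustration; not taken from the repository.
    """
    lengths = [len(t) for t in tokens]
    return {
        "token_bytes_min": min(lengths),
        "token_bytes_max": max(lengths),
        "token_bytes_mean": statistics.fmean(lengths),
        # Assumption: population standard deviation (divide by N, not N - 1).
        "token_bytes_std": statistics.pstdev(lengths),
    }

# Toy vocabulary with byte lengths 1, 2, 3 and 6.
toy_vocab = [b"a", b"th", b"the", b" token"]
stats = token_byte_stats(toy_vocab)
for key, value in stats.items():
    print(f"- {key}: {value:.4f}" if isinstance(value, float) else f"- {key}: {value}")
```

With a real tokenizer, `tokens` would be the byte string for each ordinary (non-special) vocabulary entry, if that is how the report defines the population.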