Mirror of https://github.com/karpathy/nanochat.git (synced 2026-04-17 14:28:24 +00:00)
SFT
timestamp: 2026-02-02 01:13:02
- run: dummy
- device_type:
- dtype: bfloat16
- model_tag: None
- model_step: None
- num_iterations: -1
- max_seq_len: 2048
- device_batch_size: 16
- total_batch_size: 524,288
- embedding_lr: 0.2000
- unembedding_lr: 0.0040
- matrix_lr: 0.0200
- weight_decay: 0.0000
- init_lr_frac: 1.0000
- eval_every: 150
- eval_tokens: 10,485,760
- dry_run: False
- Number of iterations: 849
- DDP world size: 8
- Minimum validation bpb: 0.3478
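
The batch settings above can be cross-checked with a little arithmetic. The sketch below is not taken from the repository. It assumes that `total_batch_size` and `eval_tokens` are counted in tokens, that each rank processes `device_batch_size * max_seq_len` tokens per forward/backward pass, and that the remainder is made up by gradient accumulation. The variable names `tokens_per_rank`, `tokens_per_pass`, `grad_accum_steps`, `eval_steps`, and `total_tokens` are illustrative only.

```python
def parse_int(text: str) -> int:
    """Parse an integer printed with thousands separators, e.g. '524,288'."""
    return int(text.replace(",", ""))


# Values copied from the report above.
max_seq_len = 2048
device_batch_size = 16
total_batch_size = parse_int("524,288")
eval_tokens = parse_int("10,485,760")
world_size = 8          # "DDP world size"
num_iterations = 849    # "Number of iterations"

# Assumption: one forward/backward pass on one rank covers this many tokens.
tokens_per_rank = device_batch_size * max_seq_len

# Assumption: all ranks run in parallel, so one pass covers this many tokens.
tokens_per_pass = tokens_per_rank * world_size

# Assumption: the rest of the batch is reached by gradient accumulation.
assert total_batch_size % tokens_per_pass == 0
grad_accum_steps = total_batch_size // tokens_per_pass

# Assumption: evaluation consumes eval_tokens at the same per-pass rate.
eval_steps = eval_tokens // tokens_per_pass

# Upper bound on training tokens if every step uses a full batch.
total_tokens = num_iterations * total_batch_size

print(f"tokens per rank per pass: {tokens_per_rank:,}")
print(f"tokens per pass (all ranks): {tokens_per_pass:,}")
print(f"gradient accumulation steps: {grad_accum_steps}")
print(f"evaluation passes: {eval_steps}")
print(f"tokens over {num_iterations} iterations: {total_tokens:,}")
```

Under these assumptions the run uses 2 accumulation steps per optimizer step and sees at most 445,120,512 tokens over its 849 iterations.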