mirror of
https://github.com/karpathy/nanochat.git
synced 2026-04-15 13:28:38 +00:00
Midtraining
timestamp: 2025-11-30 21:47:41
- wandb_run_name: dummy
- vertex_experiment: nanochat-experiment
- vertex_tensorboard: projects/247010501180/locations/us-central1/tensorboards/8180826106513850368
- device_type:
- dtype: bfloat16
- num_iterations: -1
- max_seq_len: 2048
- device_batch_size: 8
- unembedding_lr: 0.0040
- embedding_lr: 0.2000
- matrix_lr: 0.0200
- init_lr_frac: 1.0000
- weight_decay: 0.0000
- eval_every: 150
- eval_tokens: 10,485,760
- total_batch_size: 524,288
- dry_run: 0
- Number of iterations: 813
- DDP world size: 1
- Minimum validation bpb: 0.4203
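
The batch-size settings above imply a few derived quantities that the report does not list. The sketch below works them out from the reported values only. It assumes that one micro-step processes `device_batch_size * max_seq_len` tokens per rank and that `total_batch_size` is reached through gradient accumulation. The names `tokens_per_micro_step`, `grad_accum_steps`, `total_train_tokens`, and `eval_steps` are illustrative and are not taken from the report.

```python
# Values copied from the report above.
max_seq_len = 2048
device_batch_size = 8
total_batch_size = 524_288   # tokens per optimizer step
ddp_world_size = 1
num_iterations = 813         # the reported "Number of iterations"
eval_tokens = 10_485_760

# Assumption: each forward/backward pass sees this many tokens across all ranks.
tokens_per_micro_step = device_batch_size * max_seq_len * ddp_world_size

# Assumption: the total batch is built up by accumulating gradients
# over several micro-steps, so it must divide evenly.
assert total_batch_size % tokens_per_micro_step == 0
grad_accum_steps = total_batch_size // tokens_per_micro_step

# Tokens consumed over the whole run, and micro-steps per evaluation pass.
total_train_tokens = num_iterations * total_batch_size
eval_steps = eval_tokens // tokens_per_micro_step

print(f"tokens per micro-step: {tokens_per_micro_step:,}")  # 16,384
print(f"grad accumulation:     {grad_accum_steps}")         # 32
print(f"total training tokens: {total_train_tokens:,}")     # 426,246,144
print(f"eval micro-steps:      {eval_steps}")               # 640
```

Under these assumptions, the single-device run accumulates 32 micro-batches per optimizer step and sees roughly 426M tokens over its 813 iterations.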