Add ToDo.md for tasks and roadmap
This commit is contained in:
parent 74b03694b1
commit 1eaaba1c64
ToDo.md | 31 | Normal file

@@ -0,0 +1,31 @@
# Remaining Tasks & Roadmap
## 🚀 Optimization & Strix Halo Specifics

- [ ] **MXFP4 Investigation**: Research and implement OCP Microscaling (MXFP4) support for inference using AMD Quark, once the ecosystem matures for APUs.
- [ ] **System Tuner Expansion**: Enhance `scripts/tune_system.py` to auto-tune:
  - Learning rates and schedules.
  - Optimizer hyperparameters (momentum, weight decay).
  - Compilation flags (`torch.compile` modes).
- [ ] **Torch Compile Dynamics**: Investigate `dynamic=True` vs. `dynamic=False` in `scripts/base_train.py` for variable sequence lengths on RDNA 3.5 (see the sketch after this list).
- [ ] **Distributed Tuning**: Benchmark the RCCL vs. Gloo backends specifically for APU-based distributed setups (if scaling to multi-node APUs).
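
For the **Torch Compile Dynamics** item above, a rough timing harness might look like the sketch below. Everything in it is a placeholder (`TinyMLP`, the batch/sequence shapes, the iteration counts), not the actual model or config in `scripts/base_train.py`; the point is only to compare the two compile modes across mixed sequence lengths.

```python
# Sketch: compare torch.compile(dynamic=True) vs dynamic=False on varying sequence lengths.
# TinyMLP and all shapes below are placeholders, not nanochat's GPT or training config.
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

class TinyMLP(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        return self.net(x)

def bench(dynamic: bool, seq_lens=(256, 384, 512, 1024), iters: int = 20) -> float:
    model = torch.compile(TinyMLP().to(device), dynamic=dynamic)
    for T in seq_lens:                      # warm-up: pay all compilation cost up front
        model(torch.randn(8, T, 512, device=device))
    if device == "cuda":
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        for T in seq_lens:                  # mixed lengths, as in variable-length batches
            model(torch.randn(8, T, 512, device=device))
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - t0

if __name__ == "__main__":
    for flag in (False, True):
        print(f"dynamic={flag}: {bench(flag):.3f}s")
```

With `dynamic=False` the compiler specializes kernels per shape (and recompiles for each new sequence length it meets), while `dynamic=True` compiles once with symbolic shapes; the harness above only measures steady-state throughput after warm-up, so recompile overhead would need to be timed separately.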
## 🛠 Codebase Maintenance & Tech Debt

- [ ] **DDP Detection**: Refactor `is_ddp()` in `nanochat/common.py` to use a more robust detection method (see the first sketch after this list).
- [ ] **Tokenizer Efficiency**: Optimize `prepend_id` insertion in `nanochat/tokenizer.py` (currently uses `list.insert(0)`, which is O(N)); see the second sketch after this list.
- [ ] **Liger Kernels**: Experiment with [Liger Kernels](https://github.com/linkedin/Liger-Kernel) or chunked cross-entropy in `nanochat/gpt.py` to reduce memory usage (see the third sketch after this list).
- [ ] **Checkpointing**:
  - Fix potentially redundant model re-initialization in `checkpoint_manager.py`.
  - Ensure optimizer state saving across ranks is robust (`scripts/base_train.py`).
- [ ] **Evaluation Cleanup**: Refactor `scripts/base_eval.py` to remove heavy dependencies (such as pandas) and simplify file handling.
- [ ] **AdamW Warmup**: Experiment with short warmup periods for the AdamW parameters (`scripts/base_train.py` TODO).
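
First sketch, for the **DDP Detection** item: one possible shape for a more robust `is_ddp()`, leaning on `torch.distributed` state plus the environment variables that `torchrun` sets. This is an assumption about the direction, not the existing `nanochat/common.py` code.

```python
# Sketch: detect a torchrun / DDP launch more robustly.
# Prefer the live process group; fall back to the env vars torchrun exports.
import os
import torch.distributed as dist

def is_ddp() -> bool:
    if dist.is_available() and dist.is_initialized():
        return True
    env_ok = all(v in os.environ for v in ("RANK", "WORLD_SIZE", "LOCAL_RANK"))
    return env_ok and int(os.environ["WORLD_SIZE"]) > 1
```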
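
Second sketch, for the **Tokenizer Efficiency** item: the O(N) shift from `list.insert(0, prepend_id)` disappears if the list is built with the special id already in front. The `encode_with_prepend` helper and the `tokenizer.encode()` call are hypothetical stand-ins; the real code in `nanochat/tokenizer.py` will differ.

```python
# Sketch: build the id list with the prepended token already in place instead of insert(0, ...).
def encode_with_prepend(tokenizer, text: str, prepend_id: int | None = None) -> list[int]:
    ids = [prepend_id] if prepend_id is not None else []
    ids.extend(tokenizer.encode(text))  # hypothetical encode() -> list[int]; extend() avoids the O(N) shift
    return ids
```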
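
Third sketch, for the **Liger Kernels** / chunked cross-entropy item: the memory win comes from never materializing the full `(tokens, vocab)` logits matrix at once, and chunking the final projection with checkpointing is one way to get most of that effect in plain PyTorch. The shapes and the absence of an `ignore_index` are simplifying assumptions, not `nanochat/gpt.py`'s actual interface.

```python
# Sketch: chunked cross-entropy over the vocab projection.
# Each chunk's logits are recomputed in backward (checkpointing), so the full
# (N, vocab_size) logits tensor never has to be resident all at once.
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

def chunked_ce_loss(hidden: torch.Tensor, lm_head_weight: torch.Tensor,
                    targets: torch.Tensor, chunk_size: int = 4096) -> torch.Tensor:
    """hidden: (N, d) flattened hidden states, lm_head_weight: (vocab, d), targets: (N,)."""
    def chunk_loss(h_chunk, t_chunk):
        logits = h_chunk @ lm_head_weight.t()                    # (chunk, vocab)
        return F.cross_entropy(logits, t_chunk, reduction="sum")

    total = hidden.new_zeros(())
    for i in range(0, hidden.size(0), chunk_size):
        h, t = hidden[i:i + chunk_size], targets[i:i + chunk_size]
        total = total + checkpoint(chunk_loss, h, t, use_reentrant=False)
    return total / targets.numel()
```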
## ✨ New Features

- [ ] **Model Export**:
  - Add a script to export checkpoints to **GGUF** format for efficient inference on the Strix Halo NPU (via llama.cpp).
  - Add HuggingFace `safetensors` export support (see the first sketch after this list).
- [ ] **Inference Server**: Create a production-ready API server (FastAPI) to serve the model, replacing the simple `chat_cli.py` (see the second sketch after this list).
- [ ] **RLHF Expansion**: Extend reinforcement learning (RL) support beyond the current GSM8K-only implementation.
- [ ] **Advanced UI**: Develop a more robust chat interface (React/web) or integrate with existing open-source UIs (e.g., Open WebUI).
- [ ] **Data Pipeline**:
  - Add data integrity verification for downloaded shards (see the third sketch after this list).
  - Optimize data loading for APU unified memory architectures.
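
First sketch, for the `safetensors` half of the **Model Export** item (GGUF would instead go through llama.cpp's conversion tooling). The checkpoint path and the `{"model": ...}` unwrapping are assumptions about the checkpoint layout, not what `checkpoint_manager.py` actually writes.

```python
# Sketch: dump a PyTorch checkpoint's weights to a HuggingFace-style .safetensors file.
import torch
from safetensors.torch import save_file

ckpt = torch.load("checkpoints/model_last.pt", map_location="cpu")        # placeholder path
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt  # unwrap nested checkpoints
# safetensors requires contiguous, non-shared tensors; keep only tensor entries.
# Tied weights (e.g. shared embedding / lm_head) may need an explicit .clone().
tensors = {k: v.contiguous() for k, v in state_dict.items() if isinstance(v, torch.Tensor)}
save_file(tensors, "model.safetensors", metadata={"format": "pt"})
```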
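
Second sketch, for the **Inference Server** item: a minimal FastAPI shape. `generate_text()` is a stub standing in for nanochat's real model loading and sampling loop, and the route name, fields, and defaults are all placeholders.

```python
# Sketch: minimal FastAPI wrapper around a stubbed generation function.
# Run with: uvicorn serve:app --host 0.0.0.0 --port 8000
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="nanochat inference server (sketch)")

class ChatRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 256
    temperature: float = 0.8

class ChatResponse(BaseModel):
    completion: str

def generate_text(prompt: str, max_new_tokens: int, temperature: float) -> str:
    # Stub: replace with loading the tokenizer/model once at startup and sampling here.
    return f"[stub completion for: {prompt[:40]}]"

@app.post("/v1/chat", response_model=ChatResponse)
def chat(req: ChatRequest) -> ChatResponse:
    return ChatResponse(completion=generate_text(req.prompt, req.max_new_tokens, req.temperature))
```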
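
Third sketch, for shard integrity verification in the **Data Pipeline** item: a plain SHA-256 manifest check. The one-digest-per-line manifest format and the shard directory layout are assumptions.

```python
# Sketch: verify downloaded data shards against a manifest of SHA-256 digests.
# Assumed manifest format: one "<sha256>  <filename>" pair per line.
import hashlib
from pathlib import Path

def sha256_of(path: Path, bufsize: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def verify_shards(shard_dir: str, manifest: str) -> list[str]:
    bad = []
    for line in Path(manifest).read_text().splitlines():
        digest, name = line.split()
        path = Path(shard_dir) / name
        if not path.exists() or sha256_of(path) != digest:
            bad.append(name)
    return bad  # an empty list means every shard checked out
```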