nanochat/.github/workflows/deploy.yml
Manmohan Sharma 40586713bd
fix KV cache dtype mismatch on CPU: use COMPUTE_DTYPE instead of hardcoded logic
The KV cache was hardcoded to float32 on non-CUDA devices, but the model
weights are loaded in bfloat16 via NANOCHAT_DTYPE env var. This caused a
RuntimeError in scaled_dot_product_attention. Now uses COMPUTE_DTYPE from
common.py which respects the env var.
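
A minimal sketch of the idea behind this fix, not the actual nanochat code: the KV cache takes its dtype from the same resolver as the weights instead of deciding per device. The helper names (`resolve_compute_dtype`, `old_kv_cache_dtype`), the fallback defaults, and the pre-fix CUDA branch are assumptions made for illustration. Only the `NANOCHAT_DTYPE` variable and the float32-on-non-CUDA behaviour come from the commit message. Dtypes are plain strings so the example runs without PyTorch.

```python
import os

# Hypothetical set of accepted names; the real common.py may accept others.
SUPPORTED_DTYPES = {"bfloat16", "float16", "float32"}


def resolve_compute_dtype(env=None, cuda_available=False):
    """Return the dtype name shared by the model weights and the KV cache."""
    env = os.environ if env is None else env
    name = env.get("NANOCHAT_DTYPE")
    if name is None:
        # Assumed fallback when the env var is unset; not taken from the source.
        name = "bfloat16" if cuda_available else "float32"
    if name not in SUPPORTED_DTYPES:
        raise ValueError(f"unsupported NANOCHAT_DTYPE: {name!r}")
    return name


def old_kv_cache_dtype(cuda_available):
    """Pre-fix behaviour per the commit message: float32 on non-CUDA devices.

    The CUDA branch is an assumption; the commit message does not state it.
    """
    return "bfloat16" if cuda_available else "float32"


# CPU host with weights loaded in bfloat16 via the env var.
env = {"NANOCHAT_DTYPE": "bfloat16"}
weights_dtype = resolve_compute_dtype(env, cuda_available=False)

# Before: the cache ignores the env var, so it disagrees with the weights.
mismatch_before = old_kv_cache_dtype(cuda_available=False) != weights_dtype

# After: the cache uses the same resolver, so the two cannot diverge.
cache_dtype = resolve_compute_dtype(env, cuda_available=False)
mismatch_after = cache_dtype != weights_dtype
```

With a single resolver, any future change to how the dtype is chosen applies to the weights and the cache together, which is what removes the mismatch reported by `scaled_dot_product_attention`.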

Also broadened CI/CD path triggers to nanochat/**.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:04:33 -04:00

30 lines
717 B
YAML

name: Deploy samosaChaat to EC2

on:
  push:
    branches: [master]
    paths:
      - 'nanochat/**'
      - 'scripts/chat_web.py'
      - 'scripts/chat_cli.py'

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Deploy to EC2
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.EC2_HOST }}
          username: ubuntu
          key: ${{ secrets.EC2_SSH_KEY }}
          script: |
            cd /home/ubuntu/nanochat
            git fetch origin master
            git reset --hard origin/master
            sudo systemctl restart samosachaat.service
            echo "Deploy complete - samosaChaat restarted"