nanochat/scripts
santhoshravindran7 3fa394c93f security: fix unsafe deserialization, XSS, HTTPS enforcement, and temp file race
Five targeted security fixes — all non-breaking, no behaviour change on the happy path.

H-1 (High) — nanochat/checkpoint_manager.py
  Add weights_only=True to all three torch.load() calls.
  torch.load() uses pickle by default; loading a malicious .pt file from an
  untrusted source allows arbitrary code execution. weights_only=True restricts
  deserialization to tensors and primitives, blocking this attack surface.
  Refs: https://pytorch.org/docs/stable/generated/torch.load.html
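
The risk described in H-1 can be illustrated with the standard library alone. The sketch below is not PyTorch's implementation; it only shows why unpickling untrusted bytes is dangerous and how an allowlist-style unpickler blocks it, which is the idea behind passing `weights_only=True` to `torch.load()`. The names `Payload`, `AllowlistUnpickler`, and `safe_loads` are hypothetical and do not appear in the repository.

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object: pickle records __reduce__ and replays it on load."""

    def __reduce__(self):
        # A real attack would return something like (os.system, ("...",)).
        # A harmless builtin is used here so the demo is safe to run.
        return (print, ("code ran during unpickling",))


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses to resolve any global, so only plain containers and primitives load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Deserialize bytes while rejecting every callable or class reference."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain data round-trips without touching find_class.
    print(safe_loads(pickle.dumps({"step": 1, "lr": 0.1})))

    # The crafted payload is rejected before its callable can run.
    try:
        safe_loads(pickle.dumps(Payload()))
    except pickle.UnpicklingError as exc:
        print("blocked:", exc)
```

With plain `pickle.loads()`, the second payload would execute its callable during loading. The restricted loader fails before that point.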

H-3 (High) — nanochat/ui.html
  Replace innerHTML injection with createElement + textContent for error display.
  error.message was interpolated directly into innerHTML, creating an XSS sink:
  a crafted server error response could inject and execute arbitrary JavaScript.
  textContent inserts the value as literal text instead of parsing it as HTML,
  closing the injection path.

L-1 (Low) — scripts/chat_web.py
  Fix misleading role validation error message.
  The error string claimed 'system' was a valid role, but the guard only accepts
  'user' and 'assistant'. Corrected to reflect the actual allowed values.
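
One way to keep a guard and its error message from drifting apart, as happened in L-1, is to build the message from the same constant the guard checks. This is a minimal sketch; `ALLOWED_ROLES` and `validate_role` are hypothetical names, and the plain `ValueError` stands in for whatever error type scripts/chat_web.py actually raises.

```python
ALLOWED_ROLES = ("user", "assistant")  # hypothetical constant, mirrors the guard described above


def validate_role(role: str) -> str:
    """Return the role unchanged if allowed, otherwise raise with an accurate message."""
    if role not in ALLOWED_ROLES:
        allowed = ", ".join(repr(r) for r in ALLOWED_ROLES)
        # The message is derived from ALLOWED_ROLES, so it cannot list a role the guard rejects.
        raise ValueError(f"Invalid role {role!r}; must be one of: {allowed}")
    return role


if __name__ == "__main__":
    print(validate_role("user"))
    try:
        validate_role("system")
    except ValueError as exc:
        print(exc)
```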

M-3 (Medium) — nanochat/common.py
  Reject non-HTTPS URLs in download_file_with_lock().
  urlopen() follows redirects including HTTPS->HTTP downgrades, enabling MITM
  attacks on downloaded model/tokenizer files. Added an explicit scheme check
  that raises ValueError for any non-HTTPS URL before the request is made.
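
A scheme check of the kind described in M-3 might look like the sketch below. `require_https` is a hypothetical helper, not the code in nanochat/common.py, and it only validates the URL passed in; it does not inspect redirect targets.

```python
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Return the URL unchanged if its scheme is https, otherwise raise ValueError."""
    scheme = urlparse(url).scheme  # urlparse already lowercases the scheme
    if scheme != "https":
        raise ValueError(f"Refusing non-HTTPS URL: {url!r}")
    return url


if __name__ == "__main__":
    print(require_https("https://example.com/tokenizer.pkl"))
    for bad in ("http://example.com/tokenizer.pkl", "file:///etc/passwd"):
        try:
            require_https(bad)
        except ValueError as exc:
            print(exc)
```

Calling the check before `urlopen()` means a rejected URL never triggers any network traffic.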

L-3 (Low) — nanochat/dataset.py
  Replace predictable .tmp suffix with tempfile.NamedTemporaryFile.
  The previous filepath + '.tmp' naming caused a TOCTOU race when multiple
  worker processes downloaded the same shard concurrently, and was vulnerable
  to symlink attacks on shared filesystems. NamedTemporaryFile generates a
  unique path; os.replace() provides an atomic rename on POSIX.
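
The pattern in L-3 can be sketched as follows. `atomic_write` is a hypothetical helper, not the code in nanochat/dataset.py. It assumes the temporary file is created in the destination directory, so that `os.replace()` stays on one filesystem, where the rename is atomic on POSIX.

```python
import os
import tempfile


def atomic_write(filepath: str, data: bytes) -> None:
    """Write data to a uniquely named temp file, then atomically move it into place."""
    directory = os.path.dirname(os.path.abspath(filepath))
    # delete=False: the file must survive close() so it can be renamed afterwards.
    tmp = tempfile.NamedTemporaryFile(dir=directory, suffix=".tmp", delete=False)
    try:
        with tmp:
            tmp.write(data)
        # Readers see either the old file or the complete new one, never a partial write.
        os.replace(tmp.name, filepath)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp.name):
            os.unlink(tmp.name)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "shard.bin")
        atomic_write(target, b"first")
        atomic_write(target, b"second")
        with open(target, "rb") as f:
            print(f.read())
        print(os.listdir(workdir))
```

Because each writer gets its own temp name, concurrent workers cannot clobber one another's partial downloads; the last `os.replace()` to complete determines the final file.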

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-08 23:12:50 -07:00
base_eval.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
base_train.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
chat_cli.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
chat_eval.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
chat_rl.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
chat_sft.py delete autocast, an unnecessary thorn in my side, manage dtypes directly 2026-03-04 23:55:30 +00:00
chat_web.py security: fix unsafe deserialization, XSS, HTTPS enforcement, and temp file race 2026-03-08 23:12:50 -07:00
tok_eval.py initial commit 2025-10-13 06:49:24 -07:00
tok_train.py quick fix to not OOM main speedrun script 2026-01-26 22:31:42 +00:00