nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-04-02 13:45:21 +00:00

Author	SHA1	Message	Date
santhoshravindran7	3fa394c93f	security: fix unsafe deserialization, XSS, HTTPS enforcement, and temp file race Five targeted security fixes — all non-breaking, no behaviour change on the happy path. H-1 (High) — nanochat/checkpoint_manager.py Add weights_only=True to all three torch.load() calls. torch.load() uses pickle by default; loading a malicious .pt file from an untrusted source allows arbitrary code execution. weights_only=True restricts deserialization to tensors and primitives, blocking this attack surface. Refs: https://pytorch.org/docs/stable/generated/torch.load.html H-3 (High) — nanochat/ui.html Replace innerHTML injection with createElement + textContent for error display. error.message was interpolated directly into innerHTML, creating an XSS sink: a crafted server error response could inject and execute arbitrary JavaScript. textContent escapes all HTML entities, closing the injection path. L-1 (Low) — scripts/chat_web.py Fix misleading role validation error message. The error string claimed 'system' was a valid role, but the guard only accepts 'user' and 'assistant'. Corrected to reflect the actual allowed values. M-3 (Medium) — nanochat/common.py Reject non-HTTPS URLs in download_file_with_lock(). urlopen() follows redirects including HTTPS->HTTP downgrades, enabling MITM attacks on downloaded model/tokenizer files. Added an explicit scheme check that raises ValueError for any non-HTTPS URL before the request is made. L-3 (Low) — nanochat/dataset.py Replace predictable .tmp suffix with tempfile.NamedTemporaryFile. The previous filepath + '.tmp' naming caused a TOCTOU race when multiple worker processes downloaded the same shard concurrently, and is vulnerable to symlink attacks on shared filesystems. NamedTemporaryFile generates a unique path; os.replace() provides an atomic rename on POSIX. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-08 23:12:50 -07:00
Andrej Karpathy	1076f97059	delete autocast, an unnecessary thorn in my side, manage dtypes directly	2026-03-04 23:55:30 +00:00
Sofie Van Landeghem	72b9064f9d	remove leftover mid references (#491 )	2026-02-02 08:33:46 -08:00
Aarushi Singh	ace6740bdd	feat: allow top_k=0 in web api to disable filtering (#458 ) * allow top_k=0 in web api to disable filtering * adding a comment for clear reasoning * adding change to docstring	2026-01-30 09:21:41 -08:00
svlandeg	2ce62ec076	ensure consistency of quotes within each statement	2025-11-03 21:52:02 +01:00
svlandeg	c72b8b2309	add explicit UTF-8 encoding	2025-11-03 21:27:12 +01:00
karpathy	2e9669e03a	upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming	2025-10-20 10:15:17 -07:00
Andrej Karpathy	4346536ab2	also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate	2025-10-16 01:28:37 +00:00
Andrej Karpathy	4c3590c499	fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. exampels are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints	2025-10-15 20:29:54 +00:00
Andrej Karpathy	03fa673b7d	add basic logging to chat_web, which i think might be fun	2025-10-15 19:51:06 +00:00
Andrej Karpathy	52bfeea8bd	add very basic abuse prevention limits to chat_web so it's ok to host endpoints	2025-10-15 19:42:54 +00:00
Andrej Karpathy	01fb290f53	allow multiple GPUs to do inference in a data parallel way	2025-10-15 19:12:19 +00:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00

13 Commits