nanochat

mirror of https://github.com/karpathy/nanochat.git synced 2026-06-18 12:09:09 +00:00

Author	SHA1	Message	Date
Sermet Pekin	539f42bf89	check with 3.9...3.13	2025-10-20 21:35:34 +03:00
Sermet Pekin	c3d633665d	check with 3.9...3.13	2025-10-20 21:33:03 +03:00
Sermet Pekin	428ccb9eb1	multi platform gh wf windows encoding problem	2025-10-20 21:03:03 +03:00
Sermet Pekin	e8b86df766	multi platform gh wf	2025-10-20 20:55:08 +03:00
Sermet Pekin	3debc92022	multi platform gh wf	2025-10-20 20:50:49 +03:00
Sermet Pekin	8d75e112b6	gh wf multi platform	2025-10-20 20:44:39 +03:00
Sermet Pekin	116b0e75fc	mac for gh	2025-10-20 20:32:56 +03:00
Sermet Pekin	aac58b51dc	for now	2025-10-20 19:39:00 +03:00
Sermet Pekin	a3f5986f19	feat: Add macOS compatibility fixes: change PyTorch dependency from CUDA to CPU version for macOS support; update Rust edition from 2024 to 2021 for stable Cargo compatibility; relax PyTorch version requirement from >=2.8.0 to >=2.0.0; update dependency lock file with compatible versions	2025-10-20 19:02:26 +03:00
Sermet Pekin	0768f67290	Add 'dev' branch to workflow triggers	2025-10-20 11:45:14 +03:00
Sermet Pekin	cdb5a455ee	fallback to cpu on compute_init function	2025-10-20 11:43:47 +03:00
Sermet Pekin	11e46b6439	Add transformers dependency to pyproject.toml	2025-10-20 11:41:27 +03:00
Sermet Pekin	e238750824	Add GitHub Actions workflow for testing Python code. GH workflow that installs with uv and tests with pytest	2025-10-20 11:40:26 +03:00
Andrej	0f007889dd	Add MIT License as a file to the project	2025-10-19 17:22:19 -07:00
Andrej	5a879f4947	export NANOCHAT_BASE_DIR so child processes get it too	2025-10-19 17:07:56 -07:00
Andrej Karpathy	c1d2ed1c13	use orig_model in sampling, silly of me to miss this	2025-10-20 00:05:09 +00:00
Andrej Karpathy	2bc521a6de	use orig_model in sampling, silly of me to miss this	2025-10-20 00:04:15 +00:00
Andrej Karpathy	9467d83cf2	fix memory leak bug in rust tokenizer ty @mitsuhiko	2025-10-19 23:54:31 +00:00
Tancrède Lepoint	b1443dc98c	export NANOCHAT_BASE_DIR so child processes get it too	2025-10-19 14:05:40 -04:00
Andrej Karpathy	d6d86cbf4c	update readme with a link to the CPU|MPS branch	2025-10-16 22:03:39 +00:00
Andrej Karpathy	ccfe7915ac	mention the current d32 chat hosted on nanochat.karpathy.ai, as an example endpoint of the repo	2025-10-16 19:32:44 +00:00
Andrej Karpathy	4346536ab2	also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate	2025-10-16 01:28:37 +00:00
Andrej Karpathy	2846999b8f	allow user to click on their message to edit them. conversation after that point is wiped	2025-10-16 01:16:22 +00:00
Andrej Karpathy	92d52ecc92	add slash commands to webui	2025-10-16 01:09:53 +00:00
Andrej Karpathy	fae3aca951	add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now	2025-10-15 20:32:22 +00:00
Andrej Karpathy	4c3590c499	fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints	2025-10-15 20:29:54 +00:00
Andrej Karpathy	03fa673b7d	add basic logging to chat_web, which i think might be fun	2025-10-15 19:51:06 +00:00
Andrej Karpathy	52bfeea8bd	add very basic abuse prevention limits to chat_web so it's ok to host endpoints	2025-10-15 19:42:54 +00:00
Andrej Karpathy	01fb290f53	allow multiple GPUs to do inference in a data parallel way	2025-10-15 19:12:19 +00:00
Andrej Karpathy	190d9515d0	dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports	2025-10-15 16:42:23 +00:00
Andrej Karpathy	b8076dd367	fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation	2025-10-15 16:35:04 +00:00
Andrej	67aaca98f5	export NANOCHAT_BASE_DIR so child processes get it too. Export the cache directory so that users can use their own cache location	2025-10-14 16:01:28 -07:00
Zach Mueller	f0855cbcc7	Update speedrun.sh	2025-10-14 14:12:01 -04:00
Andrej	dd6ff9a1cc	fix bug in fallback case of find_largest_model Fix: Handle missing d<number> model tags in find_largest_model ty	2025-10-13 14:38:34 -07:00
Mirza-Samad-Ahmed-Baig	afaa5b4c90	Fix: Handle missing d<number> model tags in find_largest_model	2025-10-14 00:24:07 +03:00
Andrej	5fd0b13886	Merge pull request #2 from epoyraz/patch-1 Update README.md	2025-10-13 10:10:15 -07:00
Enes Poyraz	6a795baf27	Update README.md fix typos	2025-10-13 18:40:12 +02:00
Andrej	626bd3e260	Add image of the WebUI to readme	2025-10-13 08:03:00 -07:00
karpathy	da96b46565	update link to the new discussion	2025-10-13 07:42:09 -07:00
karpathy	a53833d04f	add nanochat logo png	2025-10-13 06:59:59 -07:00
karpathy	3a5e0bc50b	initial commit	2025-10-13 06:49:24 -07:00

41 Commits