• Joined on 2024-05-31
tacit synced commits to refs/pull/43/merge at tacit/nanochat from mirror 2025-10-16 03:12:15 +00:00
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
Compare 6 commits »
tacit synced commits to refs/pull/39/merge at tacit/nanochat from mirror 2025-10-16 03:12:15 +00:00
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
Compare 6 commits »
tacit synced commits to refs/pull/31/merge at tacit/nanochat from mirror 2025-10-16 03:12:14 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
Compare 9 commits »
tacit synced commits to refs/pull/24/merge at tacit/nanochat from mirror 2025-10-16 03:12:14 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
Compare 9 commits »
tacit synced commits to refs/pull/3/merge at tacit/nanochat from mirror 2025-10-16 03:12:14 +00:00
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
Compare 6 commits »
tacit synced commits to refs/pull/30/merge at tacit/nanochat from mirror 2025-10-16 03:12:14 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
Compare 9 commits »
tacit synced commits to refs/pull/27/merge at tacit/nanochat from mirror 2025-10-16 03:12:14 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
Compare 9 commits »
tacit synced commits to refs/pull/15/merge at tacit/nanochat from mirror 2025-10-16 03:12:13 +00:00
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
01fb290f53 allow multiple GPUs to do inference in a data parallel way
Compare 4 commits »
tacit synced commits to refs/pull/19/merge at tacit/nanochat from mirror 2025-10-16 03:12:13 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
Compare 9 commits »
tacit synced commits to refs/pull/18/merge at tacit/nanochat from mirror 2025-10-16 03:12:13 +00:00
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
Compare 6 commits »
tacit synced commits to refs/pull/13/merge at tacit/nanochat from mirror 2025-10-16 03:12:13 +00:00
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
03fa673b7d add basic logging to chat_web, which i think might be fun
52bfeea8bd add very basic abuse prevention limits to chat_web so it's ok to host endpoints
Compare 6 commits »
tacit synced commits to master at tacit/nanochat from mirror 2025-10-16 03:12:12 +00:00
4346536ab2 also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
2846999b8f allow user to click on their message to edit them. conversation after that point is wiped
92d52ecc92 add slash commands to webui
fae3aca951 add script to train a $1000 version of nanochat. currently it's a bit more like $800 and this would run in probably around 33 hours instead of the budget of 41 hours, so we might tune it later. i think it's ok for now
4c3590c499 fix subtle issue in token decoding in cases where multiple utf8 bytes need to be emitted into a single codepoint. examples are emoji or foreign languages. basically we have to accumulate token sequences/text and only emit when we get full codepoints
Compare 8 commits »
tacit synced commits to refs/pull/63/merge at tacit/nanochat from mirror 2025-10-15 19:02:16 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »
tacit synced commits to refs/pull/59/merge at tacit/nanochat from mirror 2025-10-15 19:02:16 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
42b05eea7e Add guard against division by zero in chat_sft when num_tokens is 0
Compare 4 commits »
tacit synced commits to refs/pull/6/merge at tacit/nanochat from mirror 2025-10-15 19:02:16 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »
tacit synced commits to refs/pull/61/merge at tacit/nanochat from mirror 2025-10-15 19:02:16 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »
tacit synced commits to refs/pull/62/merge at tacit/nanochat from mirror 2025-10-15 19:02:16 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »
tacit synced commits to refs/pull/51/head at tacit/nanochat from mirror 2025-10-15 19:02:15 +00:00
f9dd11fefe Enhance error handling in dataset and training scripts
tacit synced commits to refs/pull/39/merge at tacit/nanochat from mirror 2025-10-15 19:02:15 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »
tacit synced commits to refs/pull/53/merge at tacit/nanochat from mirror 2025-10-15 19:02:15 +00:00
190d9515d0 dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports
b8076dd367 fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation
Compare 3 commits »