Simple text #528

Open
wants to merge 3 commits into
base: master

Conversation

dmahurin

These changes add support for training with tinyshakespeare (change from llama2.py), and with simple text separated by blank lines.
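As an illustration of what "separated by blank lines" could mean in practice, here is a minimal sketch that splits a text file's contents into records at blank lines. The function name `split_blank_line_records` and the sample string are hypothetical and are not taken from this pull request; the actual implementation may handle whitespace differently.

```python
import re


def split_blank_line_records(text: str) -> list[str]:
    """Split text into records separated by one or more blank lines.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    # A "blank line" here is a newline, optional whitespace, then another newline.
    chunks = re.split(r"\n\s*\n", text.strip())
    # Drop empty chunks (for example, when the input itself is empty).
    return [c.strip() for c in chunks if c.strip()]


sample = "First passage, line one.\nline two.\n\nSecond passage.\n\n\nThird passage.\n"
records = split_blank_line_records(sample)
print(len(records))
```

Each record could then be tokenized as one training sample, which is one plausible reading of the description above.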


xpww commented Jul 5, 2024

Hello! I wrote a tinytext.txt file of a few dozen lines.
When I ran

python tinyshakespeare.py pretokenize

and

python train.py --dataset=tinyshakespeare

the following error occurred:

assert num_batches > 0, "this split is way too small? investigate." 

I have only just started using LLMs. llama2.c lets me run a model on my own computer, but I don't have enough background knowledge to quickly start training a large model of my own.

Could you please provide me with an example of a related tinytext.txt file? Thank you very much!
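A likely cause, offered as a sketch rather than a confirmed diagnosis: if the loader in this pull request counts batches the way the pretokenized dataset loader in llama2.c's tinystories.py does (full sequences per shard, minus one for the partial tail), then a file of a few dozen lines may simply contain too few tokens. The helper names below are hypothetical, and the sequence length of 256 is an assumption about the training configuration, not something stated in this thread.

```python
def usable_batches(num_tokens: int, max_seq_len: int) -> int:
    """Number of full sequences in a shard, minus one dropped for the partial tail.

    Assumed to mirror the check behind the "this split is way too small" assertion.
    """
    return num_tokens // max_seq_len - 1


def min_tokens_needed(max_seq_len: int) -> int:
    """Smallest token count for which usable_batches() is at least 1."""
    return 2 * max_seq_len


MAX_SEQ_LEN = 256  # assumed value; check the max_seq_len setting actually used

for tokens in (400, 512, 5000):
    n = usable_batches(tokens, MAX_SEQ_LEN)
    status = "ok" if n > 0 else "would trip the assertion"
    print(f"{tokens} tokens -> {n} batches ({status})")
```

Under these assumptions, each split needs at least `2 * max_seq_len` tokens, so either a longer text file or a smaller `max_seq_len` should get past the assertion.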

3 participants