train-text-from-scratch oom (in tokenizer?) #4300
Comments
Yup, Ubuntu 22.04.
./train-text-from-scratch \
    --vocab-model ../models/ggml-vocab-llama.gguf \
    --ctx 64 --embd 256 --head 8 --layer 16 \
    --checkpoint-in chk-shakespeare-256x16-LATEST.gguf \
    --checkpoint-out chk-shakespeare-256x16-ITERATION.gguf \
    --model-out ggml-shakespeare-256x16-f32-ITERATION.gguf \
    --train-data "shakespeare.txt" \
    -t 6 -b 16 --seed 1 --adam-iter 256 --no-checkpointing -ngl 16
main: input_size = 131076128 bytes (125.0 MB)
Thread 1 "train-text-from" received signal SIGABRT, Aborted.
Quit anyway? (y or n) y
You can train with an older tag. It's definitely a memory issue: compare the 134 terabytes of RAM it asks for with the 669 MB it actually needs.
main: compute_size = 701759840 bytes (669.3 MB)
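As a quick sanity check on the sizes quoted in these comments, the sketch below converts the logged byte counts to the "MB" figures printed next to them. It assumes the tool divides by 1024 * 1024 (that is, it reports MiB), which is consistent with both log lines; `to_mib` is a hypothetical helper name, not part of the project.

```python
def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# Byte counts copied from the log lines quoted in this thread.
input_size = 131076128
compute_size = 701759840

print(f"input_size   = {to_mib(input_size):.1f} MB")    # 125.0 MB
print(f"compute_size = {to_mib(compute_size):.1f} MB")  # 669.3 MB
```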
This issue was closed because it has been inactive for 14 days since being marked as stale.
Expected Behavior
Running train-text-from-scratch with a 4GiB file works.
Current Behavior
Running train-text-from-scratch with a 4GiB file OOMs after allocating over 32GiB of memory.
Environment and Context
Commit 5a7d312
Failure Information (for bugs)
It seems like there's a bug (or unoptimized code) in the tokenizer that causes it to allocate far more memory than necessary.
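One way to put a number on "far more memory than necessary" is to record the peak resident set size of the training process. The following is a minimal sketch, not something from the project: `run_and_report_peak` is a hypothetical helper, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
from typing import List, Tuple


def run_and_report_peak(cmd: List[str]) -> Tuple[int, float]:
    """Run cmd to completion and return (exit code, peak child RSS in MiB).

    Assumes Linux, where ru_maxrss is in kilobytes. RUSAGE_CHILDREN reports
    the largest RSS among all children this process has waited for, so use a
    fresh interpreter for each measurement.
    """
    proc = subprocess.run(cmd)
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # A negative exit code means the child was killed by a signal
    # (for example -6 for SIGABRT or -9 for SIGKILL from the OOM killer).
    return proc.returncode, peak_kib / 1024
```

Passing the training command as a list, for example `run_and_report_peak(["./train-text-from-scratch", "--train-data", "shakespeare.txt"])` plus the remaining flags, would show how close the run gets to the machine's RAM before it aborts.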
Steps to Reproduce
./result/bin/train-text-from-scratch \
    --vocab-model ./models/ggml-vocab-llama.gguf \
    --ctx 256 --embd 256 --head 8 --layer 16 \
    --checkpoint-in chk-shakespeare-256x16-LATEST.gguf \
    --checkpoint-out chk-shakespeare-256x16-ITERATION.gguf \
    --model-out ggml-shakespeare-256x16-f32-ITERATION.gguf \
    --train-data "/path/to/large/file.txt" \
    -b 16 --seed 1337 --adam-iter 256
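The report does not say what the large file contained. If no multi-gigabyte text file is at hand, one could approximate it by repeating a small seed text until the target size is reached. This is only a sketch under that assumption: `make_corpus` is a hypothetical helper, and a repeated corpus may not trigger the same behaviour as the reporter's file.

```python
from pathlib import Path


def make_corpus(seed_path: str, out_path: str, target_bytes: int) -> int:
    """Repeat the seed file's bytes until out_path holds at least target_bytes.

    Returns the number of bytes written (a whole multiple of the seed size).
    """
    seed = Path(seed_path).read_bytes()
    if not seed:
        raise ValueError("seed file is empty")
    written = 0
    with open(out_path, "wb") as out:
        while written < target_bytes:
            out.write(seed)
            written += len(seed)
    return written
```

For a file of roughly the size mentioned above, something like `make_corpus("shakespeare.txt", "large.txt", 4 * 1024**3)` would do, with the output path then passed to `--train-data`.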
Failure Logs