T5 finetune outputting gibberish #7796
Cool! Sorry for the n00biness.
Thanks! I am rerunning with the max length (I didn't see a spot for min length). I'm still a little confused as to why this happens though. For example,
Related: is there an easy flag to change so that I could view part of the validation outputs at each epoch to keep track of when it learns to truncate? Right now I'm just waiting until end of training to look at the test generations.
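(Added sketch, not an existing flag: one stopgap is to load whatever checkpoint the run has saved so far and decode a handful of validation lines by hand. The output_dir/best_tfmr path, the val.source file name, and the generation settings below are assumptions; adjust them to your run.)

```python
# Sketch: decode a few validation examples from a saved checkpoint.
# "output_dir/best_tfmr" and "data_set/val.source" are assumed paths --
# point them at whatever your run actually wrote out.
from transformers import T5ForConditionalGeneration, T5Tokenizer

ckpt_dir = "output_dir/best_tfmr"
tokenizer = T5Tokenizer.from_pretrained(ckpt_dir)
model = T5ForConditionalGeneration.from_pretrained(ckpt_dir).eval()

with open("data_set/val.source") as f:
    sources = [line.strip() for line in f][:5]

batch = tokenizer(
    sources, return_tensors="pt", padding=True, truncation=True, max_length=100
)
generated = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    max_length=20,  # short cap, since the targets here are single words
    num_beams=4,
)
for src, out_ids in zip(sources, generated):
    print(src, "->", tokenizer.decode(out_ids, skip_special_tokens=True))
```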
Okay thanks, I will work on these. I realize these are unrelated T5 issues, but before I file other feature requests /bugs I just wanted to run them by you:
auto*: Would be nice if they worked! I probably won't fix either of these, but I would definitely accept a PR that allows clargs that currently don't work. If you can't fix them, you could also make separate issues for the clargs that don't work, label them "Help Wanted", and see what happens.
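(For concreteness, a sketch of the general idea behind such a PR, assuming "these" means the --auto_select_gpus and --save_top_k flags used in the command further down; this is not the actual wiring in finetune.py / lightning_base.py, just how the two flags map onto PyTorch Lightning.)

```python
# Sketch of the general idea only: expose the two flags and forward them to the
# Lightning Trainer. save_top_k belongs to the checkpoint callback, not Trainer.
import argparse

import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=1)
parser.add_argument("--auto_select_gpus", action="store_true")
parser.add_argument("--save_top_k", type=int, default=1)
args = parser.parse_args()

checkpoint = ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=args.save_top_k)
trainer = pl.Trainer(
    gpus=args.gpus,
    auto_select_gpus=args.auto_select_gpus,
    checkpoint_callback=checkpoint,
)
# trainer.fit(model) would follow, given a LightningModule `model`.
```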
@jsrozner did you get this working? We're also having some difficulties. Wanted to make sure it has worked for someone else, at least.
Did you pass …?
See issue #5142 for the resolution.
Environment info
transformers version: 3.3.1

Who can help
Summarization: @sshleifer
T5: @patrickvonplaten
examples/seq2seq: @sshleifer
Information
I am trying to finetune on a custom dataset. I posted about my specific use case here in the forums: https://discuss.huggingface.co/t/t5-tips-for-finetuning-on-crossword-clues-clue-answer/1514
The problem arises when using:
The task I am working on is:
To reproduce
(Note that I have changed nothing else)
python finetune.py \
  --model_name_or_path=t5-small \
  --tokenizer_name=t5-small \
  --data_dir=${HOME}/data_set \
  --learning_rate=3e-4 \
  --output_dir=$OUTPUT_DIR \
  --max_source_length=100 \
  --max_target_length=100 \
  --num_train_epochs=300 \
  --train_batch_size=64 \
  --eval_batch_size=64 \
  --gpus=1 \
  --auto_select_gpus=True \
  --save_top_k=3 \
  --output_dir=$OUTPUT_DIR \
  --do_train \
  --do_predict \
  "$@"
As a baseline "does T5 even work" check, my inputs/outputs are of the form (one example per line):
(this is one line in train.source): This is a sentence
(this is corresponding line in train.target): This
The lines are exactly as above, with a newline after each example but no other punctuation. I have not modified the tokenizer or the model.
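(One sanity check worth doing here, my suggestion rather than something established in this thread: look at how a source/target pair is actually tokenized, in particular whether the target ends with T5's </s> token. If the training targets never carry EOS, the model has little reason to stop after the first word, which would look like the trailing gibberish reported below.)

```python
# Sketch: inspect the tokenization of one source/target pair with the stock
# t5-small tokenizer (no claim about what finetune.py does internally).
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")

source = "This is a sentence"
target = "This"

src = tokenizer(source, max_length=100, truncation=True)
tgt = tokenizer(target, max_length=100, truncation=True)

print(tokenizer.convert_ids_to_tokens(src["input_ids"]))
print(tokenizer.convert_ids_to_tokens(tgt["input_ids"]))
print("target ends with </s>:", tgt["input_ids"][-1] == tokenizer.eos_token_id)
```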
Expected behavior
Expect T5 to learn to output the first word.
Observed
T5 outputs first word followed by gibberish:
After 300 epochs, here is what we see for the first 5 lines of test.source vs test_generations (test.target is just the first word of each line in test.source):
Test.source:
We raised a bloom, a monster
I let Satan corrupt and torment
Chapter in play is an old piece
Old skin disease liable to drain confidence
Keep a riot going inside a musical academy
test_generations:
We vsahmoastuosastostassymbossa
Issahrastahmoormentostormentastoshomment
Chapter vshygie'ny-futtahraffahtaftast
Old hygienohmahrastassahuasairtia
Keep'astifiahuassaivrasastoshygiesana
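(Diagnostic sketch, added for illustration: after being fed the gold first word, how much probability does the finetuned model put on stopping? The output_dir/best_tfmr path is an assumption; if </s> sits far down the next-token distribution, the model never learned to terminate, which matches generations that run on like the ones above.)

```python
# Sketch: probability and rank of </s> right after the correct first word.
# "output_dir/best_tfmr" is an assumed path to the saved finetuned model.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

ckpt_dir = "output_dir/best_tfmr"
tokenizer = T5Tokenizer.from_pretrained(ckpt_dir)
model = T5ForConditionalGeneration.from_pretrained(ckpt_dir).eval()

source = "We raised a bloom, a monster"
first_word = "We"

enc = tokenizer(source, return_tensors="pt")
# T5's decoder starts from the pad token; append the gold first word's pieces.
word_ids = tokenizer(first_word, add_special_tokens=False, return_tensors="pt")["input_ids"]
start = torch.tensor([[model.config.decoder_start_token_id]])
decoder_input_ids = torch.cat([start, word_ids], dim=1)

with torch.no_grad():
    logits = model(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        decoder_input_ids=decoder_input_ids,
    )[0]  # first element of the output tuple is the LM logits

next_token_probs = logits[0, -1].softmax(dim=-1)
eos_id = tokenizer.eos_token_id
rank = int((next_token_probs > next_token_probs[eos_id]).sum()) + 1
print(f"P(</s> | '{first_word}') = {next_token_probs[eos_id].item():.4f}, rank {rank}")
```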
I wonder if any of the following could be affecting this: