warn if generation prompt does not start with a control code #50

Merged
keskarnitish merged 2 commits into salesforce:master
Oct 23, 2019


julien-c
Contributor

No description provided.

julien-c added a commit to huggingface/transformers that referenced this pull request Oct 22, 2019
@dimitri320

@julien-c great idea, totally get your commit, but how is the control code/token identified right now in the master branch? I don't get how control_codes.txt is being referenced from generate.py right now. Or am I missing something? I'm pretty sure I set everything up right on my V100, but the master branch just doesn't work (it runs, but copies the last word over and over, so it likely doesn't see my control code), while the lower_memory branch works just fine. Totally lost, to be honest...

@julien-c
Contributor Author

Did you try all the different models? Maybe it's an issue with a specific one?

@dimitri320

Yes, I’ve tried both the 512 and 256 models, and in both cases the master branch didn’t work, while the lower_memory branch worked. That’s why I started looking into the code, and I’m a bit lost about where and how control codes are being used.

@julien-c
Contributor Author

A control code is just a token. So if you start with `Joke A man comes into a bar`, "Joke" is just a token like the other ones. Not sure I can help with debugging your specific issue, though. Good luck!
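For readers following along, the check this PR's title describes can be sketched roughly as below. This is a minimal illustration, not the code that was merged: the `CONTROL_CODES` set is a hypothetical stand-in that only contains names mentioned in this thread, `starts_with_control_code` and `check_prompt` are made-up helper names, and the prompt is split on whitespace rather than run through the model's real tokenizer.

```python
import warnings

# Hypothetical stand-in: only names mentioned in this thread.
# The real list of control codes ships with the model and is much longer.
CONTROL_CODES = {"Links", "Books", "Wikipedia", "Joke"}


def starts_with_control_code(prompt, control_codes=CONTROL_CODES):
    """Return True if the first whitespace-separated word is a control code."""
    words = prompt.split()
    return bool(words) and words[0] in control_codes


def check_prompt(prompt):
    """Warn (but do not fail) when the prompt lacks a leading control code."""
    if not starts_with_control_code(prompt):
        warnings.warn("Prompt does not start with a control code.")
    return prompt
```

Under these assumptions, `check_prompt("Joke A man comes into a bar")` passes silently, while `check_prompt("A man comes into a bar")` emits the warning and still returns the prompt unchanged.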

@dimitri320

@julien-c thanks a lot, will try running your code, maybe it’ll work. It’s just strange that my setup doesn’t seem to see the control code on the master branch.

@dimitri320

@julien-c I’ve got an idea about what might be wrong. I think I might be patching the wrong keras.py file. Can you share the path of the file you are patching (as your setup is GCP with a V100, just like mine)? I’m also using an Anaconda Python 3.7 virtual environment.

@keskarnitish keskarnitish merged commit e900f8a into salesforce:master Oct 23, 2019
@keskarnitish
Contributor

Thanks for the PR @julien-c !

@dimitri320, you should patch wherever tensorflow_estimator is installed. You can find this by running:

>>> import tensorflow_estimator
>>> tensorflow_estimator.__file__
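
The same lookup can be done without importing the package, which helps when several environments are in play and you want to confirm which copy the active interpreter would pick up. This is only a sketch: `locate_package_dir` is a hypothetical helper name, and it relies solely on the standard-library `importlib.util.find_spec`.

```python
import importlib.util
import os


def locate_package_dir(package_name):
    """Return the directory a top-level package is installed in, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    # find_spec returns None for a missing top-level package;
    # origin is None for namespace packages, which have no single file.
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)
```

Running `print(locate_package_dir("tensorflow_estimator"))` inside the activated Anaconda environment should print the directory to patch under, or `None` if that interpreter cannot see the package at all.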

@dimitri320

dimitri320 commented Oct 23, 2019

@keskarnitish I patched the correct file, and I still get the last word copied over and over again... And I don't get the warning that no control code was used, since I am using control codes (Links, Books, Wikipedia).

And btw, the new commit works: when I start with a word that isn't a control code, it shows me the warning. Thanks for that @julien-c!

Any advice where else to look for an answer?

PS: I've already spent 3 days on this and really don't know what else to do...


3 participants