Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[TGA] Delayed nucleus #4650

Merged
merged 6 commits into from
Jul 25, 2022
Merged

[TGA] Delayed nucleus #4650

merged 6 commits into from
Jul 25, 2022

Conversation

Jsiewierski11
Copy link
Contributor

Patch description
I have a new search method for the generator agent that I wanted to share. The new method, which I call nucleusbeam search provides more variation that traditional beam search while still maintaining coherence.
The seq2seq model can be igonored, that wasn't meant to be in the pull request.

@facebook-github-bot
Copy link

Hi @Jsiewierski11!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@klshuster
Copy link
Contributor

Thank you for your contribution! Could you please remove the seqseq model from the PR, if it is not meant to be there?

How would you feel about naming it DelayedNucleusBeamSearch, as it is quite similar to the DelayedBeamSearch except for Nucleus instead of Top-K?

@Jsiewierski11
Copy link
Contributor Author

I think that name is more fitting. Just removed the seqseq agent and changed the name.

@klshuster
Copy link
Contributor

Thanks! One more note - you'll most likely need to update the command line argument here to accept delayednucleusbeam as an option

@Jsiewierski11
Copy link
Contributor Author

Added the command line argument in my latest commit. Is there anything else that needs to be done before it can be merged? I am seeing that the long_gpu_tests are failing, do you know what might be causing that?

@stephenroller stephenroller changed the title New generation method [TGA] Delayed nucleus Jul 23, 2022
Copy link
Contributor

@klshuster klshuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything looks good! the test failure is a known issue, I wouldn't worry about it - I'll go ahead and merge, thanks again for your contribution!

@klshuster klshuster merged commit 65528a0 into facebookresearch:main Jul 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants