This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[BART] Do not add start/end tokens multiple times #3714

Merged on Jun 15, 2021 (2 commits)

emilydinan

Patch description
Found an issue with BART: when the text_vec is cached, start/end tokens are added multiple times. I was relying on this caching behavior to score candidates with BART when there were too many candidates to fit in memory.

I adopted the same solution we use for BERT:

def _set_text_vec(self, *args, **kwargs):
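To illustrate the idea, here is a minimal, self-contained sketch of adding start/end tokens idempotently, so that re-processing a cached vector does not wrap it a second time. It is not the code from this PR: `add_start_end_once`, `set_text_vec`, the token ids, and the plain-list representation are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical token ids, chosen only for this sketch.
START_IDX = 1
END_IDX = 2


def add_start_end_once(
    vec: List[int], start_idx: int = START_IDX, end_idx: int = END_IDX
) -> List[int]:
    """Wrap vec in start/end tokens, skipping any token already present."""
    out = list(vec)  # copy so the caller's list is not mutated
    if not out or out[0] != start_idx:
        out.insert(0, start_idx)
    if len(out) < 2 or out[-1] != end_idx:
        out.append(end_idx)
    return out


def set_text_vec(obs: Dict, tokenize: Callable[[str], List[int]]) -> Dict:
    """Toy vectorizer: reuse a cached text_vec if one exists, then wrap it."""
    if 'text_vec' not in obs:
        obs['text_vec'] = tokenize(obs['text'])
    # Safe on both the fresh and the cached path because the wrap is idempotent.
    obs['text_vec'] = add_start_end_once(obs['text_vec'])
    return obs


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical tokenizer: one id per whitespace-separated word."""
    return [10 + i for i, _ in enumerate(text.split())]


obs = set_text_vec({'text': 'hello world'}, toy_tokenize)
first_pass = list(obs['text_vec'])
obs = set_text_vec(obs, toy_tokenize)  # second call hits the cached vector
second_pass = list(obs['text_vec'])
```

With a naive wrap that always prepends and appends, the second call would produce a doubly wrapped vector; checking the existing first and last tokens keeps the cached path stable.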

CC @adamlerer

Testing steps
I added a test.


klshuster left a comment

seems harmless enough

emilydinan merged commit 9f9121d into master on Jun 15, 2021
emilydinan deleted the bartstart branch on June 15, 2021 14:02