Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't discard entity_group when token is the last in the sequence. #5439

Merged
merged 1 commit into from
Jul 1, 2020

Conversation

mfuntowicz
Copy link
Member

Signed-off-by: Morgan Funtowicz funtowiczmo@gmail.com

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
@mfuntowicz mfuntowicz force-pushed the ner-group-entities-last-token branch from 47c645e to 3f5a220 Compare July 1, 2020 15:56
@stefan-it
Copy link
Collaborator

LGTM!

Thanks @mfuntowicz !

Before:

In [6]: nlp("My name is Wolfgang and I live in Berlin")                                                                                                                                           
Out[6]: [{'entity_group': 'I-PER', 'score': 0.9991481900215149, 'word': 'Wolfgang'}]

With this PR:

In [5]: nlp("My name is Wolfgang and I live in Berlin")                                                                                                                                           
Out[5]: 
[{'entity_group': 'I-PER', 'score': 0.9991481900215149, 'word': 'Wolfgang'},
 {'entity_group': 'I-LOC', 'score': 0.9983668327331543, 'word': 'Berlin'}]

@mfuntowicz mfuntowicz changed the title Don't discard entity_group when token is the latest in the sequence. Don't discard entity_group when token is the last in the sequence. Jul 1, 2020
Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@mfuntowicz
Copy link
Member Author

@LysandreJik CI error seems unrelated, is it ok for you if I merge?

@mfuntowicz mfuntowicz merged commit f4323db into master Jul 1, 2020
@mfuntowicz mfuntowicz deleted the ner-group-entities-last-token branch July 1, 2020 18:30
@julien-c
Copy link
Member

julien-c commented Jul 3, 2020

Did you check this, @enzoampil? Just making sure to ping you as you contributed #3957 🤗

@enzoampil
Copy link
Contributor

enzoampil commented Jul 4, 2020

@julien-c Did a few checks as well and looks great! Was planning to include this in this PR #4987 (2nd point), but this seems to solve it cleanly already, so will consider this fix for that PR 😄

UPDATE: Ended up modifying this fix in the PR above, due to cases where the last token was repeating (for the test cases set in the above PR).

enzoampil added a commit to enzoampil/transformers that referenced this pull request Jul 4, 2020
enzoampil added a commit to enzoampil/transformers that referenced this pull request Jul 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants