This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Environment info
transformers version: 4.4.1
Who can help
Library:
Information
Model I am using (Bert, XLNet ...):
Any NER model, e.g. elastic/distilbert-base-cased-finetuned-conll03-english
The problem arises when using:
The task I am working on is:
Ignoring subwords using the TokenClassificationPipeline.
To reproduce
Steps to reproduce the behavior:
This outputs:
Expected behavior
The expected behavior would be for each subword token to be merged with the preceding token, and for the subword's own prediction to be ignored, e.g.
instead of
In the current logic, the ignore_subwords flag seems to be used only in combination with the grouped_entities flag:
https://github.com/huggingface/transformers/blob/master/src/transformers/pipelines/token_classification.py#L216
The output obtained from the example input above, setting both flags to True:
while setting grouped_entities=True and ignore_subwords=False outputs:
This seems counterintuitive, as grouped entities shouldn't be fragmented by subwords, and ignoring subwords shouldn't be conditioned on grouping entities.
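To make the expected behavior concrete, here is a minimal sketch of the merging I have in mind, independent of grouped_entities. It is not the pipeline's actual implementation: merge_subwords is a hypothetical helper, the "##" prefix assumes a WordPiece tokenizer, and the token dicts below are made-up illustrative values, not real model output.

```python
from typing import Dict, List


def merge_subwords(tokens: List[Dict]) -> List[Dict]:
    """Merge WordPiece subword tokens (prefixed with '##') into the
    preceding token, keeping the preceding token's entity and score.

    Hypothetical helper for illustration only; not part of transformers.
    """
    merged: List[Dict] = []
    for tok in tokens:
        if tok["word"].startswith("##") and merged:
            # Append the subword text to the previous word and
            # ignore the subword's own prediction.
            merged[-1]["word"] += tok["word"][2:]
        else:
            # Copy so the caller's input is left untouched.
            merged.append(dict(tok))
    return merged


# Made-up token-level predictions (assumption, not real model output).
example = [
    {"word": "Sar", "entity": "B-PER", "score": 0.99},
    {"word": "##ah", "entity": "I-PER", "score": 0.80},
    {"word": "lives", "entity": "O", "score": 0.99},
]

result = merge_subwords(example)
```

With this input, "Sar" and "##ah" collapse into a single "Sarah" token that keeps the label and score of its first piece, which is what I would expect ignore_subwords=True to do on its own.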