-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sequence labeling refactoring #2361
Conversation
…moved batch loss average). running version, ready for PR.
…nce_labeling_refactoring � Conflicts: � flair/models/sequence_tagger_model.py
@whoisjones is there any status update on this? |
@helpmefindaname currently shifting the sequence labeler below the DefaultClassifier. We still need a parser for previous models, so it still takes some days, but feel free to contribute on this branch. |
…nce_labeling_refactoring
|
||
from .sequence_tagger_utils.crf import CRF | ||
from .sequence_tagger_utils.viterbi import ViterbiLoss, ViterbiDecoder | ||
from ..datasets import DataLoader, SentenceDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why these relative module paths? why not flair.datasets
?
Sequence tagger speedups
pad_start_tags = torch.cat([start, tags], 1) | ||
pad_stop_tags = torch.cat([tags, stop], 1) | ||
# filter empty sentences | ||
if isinstance(sentences[0], Sentence): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the if check souldn't be required, as sentences is always of type List[Sentence]
if typing isn't violated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as yes, right! I'll push a correction, thanks!
for i in range(len(lens_)): | ||
pad_stop_tags[i, lens_[i] :] = self.tag_dictionary.get_idx_for_item(STOP_TAG) | ||
# order by length | ||
reordered_sentences: List[Union[Sentence, str]] = sorted(sentences, key=lambda s: len(s), reverse=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Union doesn't make sense, as sentences is of type List[Sentence]
we will always have reordered_sentences: List[Sentence]
also, I think mypy is able to auto-infer the type of reordered_sentences
so the typing might not be necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also a good point, this is I think a leftover from times when str could also be passed to the predict function!
@whoisjones thanks a lot for improving this! |
Closes #2360.