Sequence labeling refactoring #2361

Merged
merged 69 commits into from
Dec 16, 2021
Commits
dd4adde
added utils folder and CRF class
whoisjones Jul 31, 2021
0419de4
added Viterbi classes
whoisjones Jul 31, 2021
95b9fa3
added the sequence labeling utils.py containing required mathematical…
whoisjones Jul 31, 2021
51d6c32
initial commit refactored sequence tagger
whoisjones Jul 31, 2021
eb3e0a3
added start / stop tags
whoisjones Jul 31, 2021
822c79e
initial adjustments to new Classifier abstract class
whoisjones Jul 31, 2021
f292931
added get sequence tensor method
whoisjones Jul 31, 2021
79ed883
added required label property for new Classifier class interface
whoisjones Jul 31, 2021
7ad6380
adjust predict method to new Classifier interface
whoisjones Jul 31, 2021
2f65eb9
initial running version
whoisjones Jul 31, 2021
3aba2e9
changes to loss function (loss averaging for trainer) and viterbi (re…
whoisjones Jul 31, 2021
e396a2c
fix integration test erros.
whoisjones Aug 3, 2021
6d41a9b
remove testing file.
whoisjones Aug 3, 2021
9bc7fbb
Compatibility changes
alanakbik Aug 17, 2021
669e2f2
Merge branch 'master' of https://github.com/flairNLP/flair into seque…
whoisjones Nov 12, 2021
d8135ba
merge
whoisjones Nov 12, 2021
b8004e4
adjustments for DefaultClassifier
whoisjones Nov 15, 2021
7106703
sequence tagger adaption for DefaultClassifier
whoisjones Nov 16, 2021
ec700eb
inference with ViterbiDecoder
whoisjones Nov 17, 2021
51ab3a6
Viterbi target formatting
whoisjones Nov 17, 2021
abd08bb
fix loss logging after each epoch
whoisjones Nov 17, 2021
93b5241
fix load and save method
whoisjones Nov 17, 2021
e2bf379
Merge branch 'master' of https://github.com/flairNLP/flair into seque…
whoisjones Nov 17, 2021
013e484
fix init method in order to load models with previous SequenceTagger
whoisjones Nov 19, 2021
17f6299
added store_embeddings to predict in order to save memory
whoisjones Nov 19, 2021
68a9138
adjustments for linear layer into tag space
whoisjones Nov 21, 2021
ff98d7d
fix linear layer into tag space predictions
whoisjones Nov 21, 2021
8fc9237
fix sequence labeling without crf predictions
whoisjones Nov 22, 2021
8f2c301
final refactorings
whoisjones Nov 22, 2021
fbe6372
Merge branch 'master' of https://github.com/flairNLP/flair into seque…
whoisjones Nov 22, 2021
013c07a
changed to torch.tensor
whoisjones Nov 24, 2021
f6e1f1d
fixes predict() method
whoisjones Nov 29, 2021
569c089
fixes sequence tagger class
whoisjones Nov 30, 2021
c01c1c8
fixes crf and decoder
whoisjones Nov 30, 2021
7913595
fixes crf and decoder
whoisjones Nov 30, 2021
0e44807
fixes crf and decoder
whoisjones Nov 30, 2021
870cdee
fixes viterbi
whoisjones Nov 30, 2021
05146d1
fixes viterbi
whoisjones Nov 30, 2021
86660cb
put tensors to cuda
whoisjones Nov 30, 2021
4452ec8
adjustments to sequence labeler
whoisjones Dec 1, 2021
a6a228d
adjustments viterbi
whoisjones Dec 1, 2021
cc1ffd8
sequence tagger adjustments
whoisjones Dec 2, 2021
6ff295e
viterbi adjustments
whoisjones Dec 2, 2021
cfcd2b2
crf fixes
whoisjones Dec 7, 2021
a15b66c
adpation of viterbi inference + crf scores for preivously trained models
whoisjones Dec 7, 2021
d6a075b
CRF working, fixing batching issues
whoisjones Dec 8, 2021
9fae78a
change transitions to be on CPU if using CUDA
whoisjones Dec 8, 2021
c01dc1d
transitions not always on same device if using CUDA
whoisjones Dec 8, 2021
0e54fa2
transitions not always on same device if using CUDA
whoisjones Dec 8, 2021
e81f7c6
fix: initialize transitions
whoisjones Dec 9, 2021
b6cedd1
fix: standard inference via softmax
whoisjones Dec 10, 2021
c3b3f5c
SequenceTagger documentation
whoisjones Dec 10, 2021
89d9f34
refactorings
whoisjones Dec 10, 2021
3acdbc0
refactorings
whoisjones Dec 10, 2021
7de0492
try different sentence tensor method
alanakbik Dec 14, 2021
46f082e
use different tensor creation method
alanakbik Dec 14, 2021
335b082
Merge pull request #2550 from flairNLP/sequence_tagger_speedups
whoisjones Dec 14, 2021
02e3a61
Merge branch 'master' into sequence_labeling_refactoring
alanakbik Dec 14, 2021
eacf136
Merge branch 'master' into sequence_labeling_refactoring
alanakbik Dec 15, 2021
312fb1c
Fix merge errors
alanakbik Dec 15, 2021
cea8a31
update formatting to 120 length
alanakbik Dec 15, 2021
9e7efe4
Update instructions for formatting
alanakbik Dec 15, 2021
3837e72
Fix empty sentence error
alanakbik Dec 15, 2021
9190c06
Remove unnecessary if-check
alanakbik Dec 15, 2021
3386533
Remove typing
alanakbik Dec 15, 2021
e637792
Unified final linear map
alanakbik Dec 15, 2021
6cd3e5a
Inherit from Classifier
alanakbik Dec 16, 2021
f13fab3
Undo error caused by moving _get_gold_labels out
alanakbik Dec 16, 2021
2a0eee5
Black formatting
alanakbik Dec 16, 2021
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -64,7 +64,7 @@ In general, it is recommended to ensure all basic tests are running through befo
To ensure a standardized code style we use the formatter [black](https://github.com/ambv/black) and for standardizing imports we use [isort](https://github.com/PyCQA/isort).
If your code is not formatted properly, the tests will fail.
simply execute
-You can automatically format the code via `black . && isort .`
+You can automatically format the code via `black --config pyproject.toml flair/ && isort flair/` in the flair root folder.

### pre-commit hook

10 changes: 10 additions & 0 deletions flair/data.py
@@ -116,6 +116,16 @@ def __len__(self) -> int:
     def get_item_for_index(self, idx):
         return self.idx2item[idx].decode("UTF-8")

+    def set_start_stop_tags(self):
+        self.add_item("<START>")
+        self.add_item("<STOP>")
+
+    def start_stop_tags_are_set(self):
+        if {"<START>".encode(), "<STOP>".encode()}.issubset(self.item2idx.keys()):
+            return True
+        else:
+            return False
+
     def save(self, savefile):
         import pickle
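
To illustrate what the two new methods in the hunk above do, here is a minimal sketch. `MiniDictionary` is a hypothetical, simplified stand-in written for this example, not flair's actual `Dictionary` class. It assumes items are stored as UTF-8 bytes (as the `decode("UTF-8")` and `.encode()` calls in the diff suggest) and that `add_item` skips items that are already present.

```python
class MiniDictionary:
    """Hypothetical stand-in for the Dictionary class touched in flair/data.py."""

    def __init__(self):
        self.item2idx = {}  # maps encoded item (bytes) -> index
        self.idx2item = []  # maps index -> encoded item (bytes)

    def add_item(self, item: str) -> int:
        # Assumption: adding an existing item is a no-op that returns its index.
        key = item.encode("utf-8")
        if key not in self.item2idx:
            self.item2idx[key] = len(self.idx2item)
            self.idx2item.append(key)
        return self.item2idx[key]

    def get_item_for_index(self, idx: int) -> str:
        return self.idx2item[idx].decode("UTF-8")

    # The two methods below mirror the ones added in the diff.
    def set_start_stop_tags(self) -> None:
        self.add_item("<START>")
        self.add_item("<STOP>")

    def start_stop_tags_are_set(self) -> bool:
        # Same check as in the diff, returned directly as a bool.
        return {"<START>".encode(), "<STOP>".encode()}.issubset(self.item2idx.keys())


tags = MiniDictionary()
for tag in ["O", "B-PER", "I-PER"]:
    tags.add_item(tag)

before = tags.start_stop_tags_are_set()  # no boundary tags registered yet
tags.set_start_stop_tags()
tags.set_start_stop_tags()  # second call adds nothing under the no-op assumption
after = tags.start_stop_tags_are_set()
```

In this sketch the boundary tags are appended after the regular labels, so the existing label indices stay unchanged. Whether flair relies on that ordering is not stated in the diff.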
