Support for Multitask Learning in Flair #2910

Merged · 23 commits · Aug 17, 2022
Conversation

@alanakbik (Collaborator) commented on Aug 17, 2022

This PR adds support for multitask learning in Flair (closes #2508 and closes #1260), with a hopefully simple syntax for defining multiple tasks that share parts of a model.

The most common part to share is the transformer, which you might want to fine-tune across several tasks. Instantiate a transformer embedding and pass it to two separate models that you instantiate as before:

# Imports (make_multitask_model_and_corpus is added by this PR in flair.nn.multitask)
from flair.datasets import SENTEVAL_SST_GRANULAR, SENTEVAL_CR
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.nn.multitask import make_multitask_model_and_corpus
from flair.trainers import ModelTrainer

# --- Embeddings that are shared by both models --- #
shared_embedding = TransformerDocumentEmbeddings("distilbert-base-uncased", fine_tune=True)

# --- Task 1: Sentiment Analysis (5-class) --- #
corpus_1 = SENTEVAL_SST_GRANULAR()

model_1 = TextClassifier(shared_embedding,
                         label_dictionary=corpus_1.make_label_dictionary("class"),
                         label_type="class")

# -- Task 2: Binary Sentiment Analysis on Customer Reviews -- #
corpus_2 = SENTEVAL_CR()

model_2 = TextClassifier(shared_embedding,
                         label_dictionary=corpus_2.make_label_dictionary("sentiment"),
                         label_type="sentiment",
                         )

# -- Define mapping (which tagger should train on which model) -- #
multitask_model, multicorpus = make_multitask_model_and_corpus(
    [
        (model_1, corpus_1),
        (model_2, corpus_2),
    ]
)

# -- Create model trainer and train -- #
trainer = ModelTrainer(multitask_model, multicorpus)
trainer.fine_tune("resources/taggers/multitask_test")

The mapping here defines which model should be trained on which corpus. Calling make_multitask_model_and_corpus with this mapping returns a multitask model and a combined corpus that you can pass to a ModelTrainer as before.
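Conceptually, the returned multitask model routes each sentence to the task it was drawn from and combines the per-task losses. A minimal pure-Python sketch of that routing idea (all class names and the task-id tagging here are illustrative stand-ins, not Flair's internals):

```python
class TaskModel:
    """Toy per-task model: pretends to compute a loss for its sentences."""
    def __init__(self, name, loss_per_sentence):
        self.name = name
        self.loss_per_sentence = loss_per_sentence

    def forward_loss(self, sentences):
        return self.loss_per_sentence * len(sentences)


class MultitaskModel:
    """Routes each (task_id, sentence) pair to its task model and sums losses."""
    def __init__(self, task_models):
        self.task_models = task_models  # dict: task_id -> TaskModel

    def forward_loss(self, tagged_sentences):
        # Group sentences by the task they belong to.
        by_task = {}
        for task_id, sentence in tagged_sentences:
            by_task.setdefault(task_id, []).append(sentence)
        # Each task model sees only its own sentences; per-task losses are summed.
        return sum(self.task_models[t].forward_loss(s) for t, s in by_task.items())


model = MultitaskModel({
    "Task_0": TaskModel("sentiment-5class", loss_per_sentence=0.5),
    "Task_1": TaskModel("sentiment-binary", loss_per_sentence=0.25),
})
batch = [("Task_0", "great movie"), ("Task_1", "bad battery"), ("Task_0", "meh")]
print(model.forward_loss(batch))  # 0.5*2 + 0.25*1 = 1.25
```

The real implementation additionally shares the embedding parameters across the task models, so fine-tuning any task updates the shared transformer.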

In addition, this PR makes some experimental changes that will likely be adapted further:

  • adds support for gradient reversal to all models inheriting from DefaultClassifier
  • changes loss reduction from 'mean' to 'sum' in DefaultClassifier
  • removes the loss calculation in the final test in training
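To see why the reduction change matters, here is a toy computation (plain Python, not Flair code): with 'mean', a batch contributes the same loss magnitude regardless of its size, while 'sum' scales the loss, and therefore the gradient, with the number of examples, which can matter when tasks contribute batches of different sizes.

```python
# Toy illustration of 'mean' vs 'sum' loss reduction (not Flair code).
per_example_losses = [1.0, 2.0, 3.0]  # e.g. one training batch

mean_loss = sum(per_example_losses) / len(per_example_losses)  # 2.0
sum_loss = sum(per_example_losses)                             # 6.0

# 'mean': gradient scale is independent of batch size.
# 'sum':  gradient scale grows with the number of examples.
print(mean_loss, sum_loss)
```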

Edit: In addition, this PR removes some model classes that were still very experimental: the DependencyParser, the DistancePredictor and the SimilarityLearner. They may be added back once the model interfaces are finalized.

@alanakbik alanakbik merged commit 4045942 into master Aug 17, 2022
@alanakbik alanakbik deleted the multitask_mod branch August 17, 2022 11:52