gh-3097: fix multitask model training #3101
Merged
This PR fixes #3097 by introducing an `evaluate_all` parameter. That way, users can call `multitask_model.evaluate(corpus.test, gold_label_type="Task_0", evaluate_all=False)` to evaluate only the first task. If `evaluate_all` is set to `True`, the value of `gold_label_type` is ignored and every task is evaluated on its respective label type.

While working on this implementation, I also noticed that the evaluation currently evaluates a sentence annotated for multiple tasks on only one randomly chosen task. Now the evaluation uses all sentences for every task they are assigned to, as one would expect.
I also added the possibility to compute the loss of a sentence on all of its assigned tasks during training. As a result, training knowledge graph construction (NER + relation extraction + NEL) with a shared transformer embedding leads to only slightly increased training time compared to NER alone, while leveraging information from all three tasks at all times.
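The idea of summing a sentence's loss over every task it is assigned to can be sketched like this. All names here (`TaskHead`, `multitask_forward_loss`, the `task_ids` field) are hypothetical stand-ins, not flair's actual API:

```python
class TaskHead:
    """Stand-in for a task-specific head sharing one embedding pass."""

    def forward_loss(self, sentences):
        # Dummy loss: pretend each sentence contributes 1.0.
        return float(len(sentences)), len(sentences)


def multitask_forward_loss(batch, tasks):
    """Sum each sentence's loss over every task it is assigned to, so a
    single shared embedding pass feeds all task heads at once."""
    total_loss, total_count = 0.0, 0
    for task_id, head in tasks.items():
        # A sentence contributes to every task it is assigned to,
        # not just one randomly chosen task.
        assigned = [s for s in batch if task_id in s["task_ids"]]
        if assigned:
            loss, count = head.forward_loss(assigned)
            total_loss += loss
            total_count += count
    return total_loss, total_count
```

Because the expensive transformer embedding is computed once per batch, adding extra task heads only adds the (cheap) head computations on top.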
I also found a bug where embeddings were not handled correctly: `identify_dynamic_embeddings` only looks at the first sentence of the batch, which may be unembedded (e.g., a sentence used for relation extraction or NEL that has no NER labels). Now the logic searches the whole batch and continues until it finds a sentence that contains ANY embedding.
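The batch-wide search can be sketched as follows; `Token` and `Sentence` here are minimal stand-ins for illustration, not flair's real classes:

```python
class Token:
    def __init__(self, embeddings=None):
        # embeddings: dict mapping embedding name -> vector
        # (a simplified stand-in for flair's embedding storage)
        self.embeddings = embeddings or {}


class Sentence:
    def __init__(self, tokens):
        self.tokens = tokens

    def __iter__(self):
        return iter(self.tokens)


def identify_dynamic_embeddings(batch):
    """Scan the WHOLE batch instead of only batch[0]: return the embedding
    names of the first sentence that carries any embedding at all."""
    for sentence in batch:
        names = sorted({name for token in sentence for name in token.embeddings})
        if names:
            return names
    return None  # no sentence in this batch is embedded yet
```

The old behavior corresponds to inspecting only `batch[0]`, which returns nothing when that particular sentence happens to be unembedded even though later sentences in the batch are.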