
Correct and clarify the handling of empty/zero-length Docs during training and inference #365

Merged — 7 commits into explosion:master, Jan 30, 2023

Conversation

shadeMe (Collaborator) commented Jan 25, 2023

Description

This PR addresses two issues:

  • When zero-length inputs are passed to the transformer pipe, the outputs yielded by an attached listener have the wrong shape in the representation width dimension. This is more a question of correctness than a practical concern, given the lack of tokens in such Docs, but we explicitly clarify why we use zero-width outputs.
  • Backprop callbacks and gradients were not correctly aligned with their corresponding model outputs (outputs from the transformer pipe) when empty documents occurred in the middle of a training batch. The existing tests for this corner case were inadvertently passing due to the behaviour of zip: the test always added the empty document to the end of the batch, and the subsequent zip operation implicitly skipped the last backprop/gradient because the latter list was shorter than the list of model outputs.
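A minimal sketch of why the width dimension matters even for a Doc with no tokens (this is an illustration with an assumed width of 768, not the actual trfs2arrays.py code): an array of shape (0, width) still composes correctly with outputs for non-empty Docs, whereas a shape of (0, 0) would not.

```python
import numpy as np

# Assumed representation width for illustration only.
width = 768

# An empty Doc yields no rows, but the width dimension remains meaningful:
empty_output = np.zeros((0, width), dtype="float32")
assert empty_output.shape == (0, width)

# Stacking with a non-empty Doc's output only works when widths agree:
other_output = np.ones((5, width), dtype="float32")
combined = np.vstack([empty_output, other_output])
assert combined.shape == (5, width)
```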

It also adds comments about the pre-conditions to clarify the circumstances under which zero-length inputs are passed to the model.
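The zip pitfall mentioned above can be reproduced in isolation (a hypothetical minimal example; these lists merely stand in for the transformer pipe's outputs and backprop callbacks):

```python
# Three model outputs, but only two backprop callbacks — e.g. the callback
# for an empty Doc at the end of the batch is missing.
outputs = ["out_doc1", "out_doc2", "out_doc3"]
backprops = ["bp_doc1", "bp_doc2"]

# zip silently stops at the shorter iterable, so the test passes even though
# the lists are misaligned: "out_doc3" is dropped without any error.
pairs = list(zip(outputs, backprops))
assert len(pairs) == 2

# An explicit length check (or zip(..., strict=True) on Python >= 3.10)
# surfaces the mismatch instead of hiding it:
assert len(outputs) != len(backprops)
```

This is why a test that appends the empty document at the end of the batch cannot catch the misalignment, while one that places it in the middle can.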

Types of change

Bug Fix

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@shadeMe shadeMe marked this pull request as draft January 25, 2023 17:31
@shadeMe shadeMe added the bug Something isn't working label Jan 26, 2023
@shadeMe shadeMe changed the title Allocate zeroed outputs with the correct width whenever possible Correct the handling of empty/zero-length docs during training and inference Jan 26, 2023
@shadeMe shadeMe changed the title Correct the handling of empty/zero-length docs during training and inference Correct the handling of empty/zero-length Docs during training and inference Jan 26, 2023
@shadeMe shadeMe marked this pull request as ready for review January 26, 2023 10:16
@shadeMe shadeMe added the feat / pipeline Feature: Pipeline components label Jan 27, 2023
Review comments on spacy_transformers/layers/trfs2arrays.py (outdated, resolved)
@shadeMe shadeMe changed the title Correct the handling of empty/zero-length Docs during training and inference Correct and clarify the handling of empty/zero-length Docs during training and inference Jan 30, 2023
@danieldk danieldk self-requested a review January 30, 2023 11:27
danieldk (Contributor) commented:

Assigned to you Sofie, since you added the handling of zero-length docs, so it would be good to get an additional sanity check.

svlandeg (Member) left a comment:

Nice digging, looks great!

Review comment on spacy_transformers/layers/trfs2arrays.py (outdated, resolved)
@svlandeg svlandeg merged commit f8bb75d into explosion:master Jan 30, 2023
@shadeMe shadeMe deleted the fix/zero-length-outputs branch January 30, 2023 17:54