Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing pass of is_training to encode_input #108

Merged
merged 3 commits into from
Mar 7, 2022

Conversation

ArneBinder
Copy link
Owner

@ArneBinder ArneBinder commented Mar 6, 2022

This was missed in #102.

EDIT: This rearranges the tested tokens in test_encode_input and test_collate to increase the readability of these tests.

@ArneBinder ArneBinder changed the title Fix missing pass of is_training to encode_input [WIP] Fix missing pass of is_training to encode_input Mar 6, 2022
@ArneBinder
Copy link
Owner Author

ArneBinder commented Mar 6, 2022

Do we also want to relax within this PR that the code does not fail if some entity, sentence or relation annotation layer is not available? Maybe just printing some warnings in these cases, e.g. "2 out of 200 documents have no relation annotations" in the case of RE or "5 of 200 documents have no sentence annotations, they will be discarded" when partition_annotation=sentences is used?

EDIT: I think this may need some discussion and should be handled in a separate PR. Related issue: #109

@ArneBinder ArneBinder changed the title [WIP] Fix missing pass of is_training to encode_input Fix missing pass of is_training to encode_input Mar 6, 2022
@ArneBinder ArneBinder requested a review from ChristophAlt March 6, 2022 15:48
@ArneBinder ArneBinder added the bug Something isn't working label Mar 6, 2022
@@ -505,8 +641,9 @@ def test_unbatch_output(prepared_taskmodule, model_output):

@pytest.mark.parametrize("inplace", [False, True])
def test_decode(prepared_taskmodule, documents, model_output, inplace):
encodings = prepared_taskmodule.encode(documents, encode_target=False)
encodings = prepared_taskmodule.encode(documents, encode_target=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Owner Author

@ArneBinder ArneBinder Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because otherwise we would need to change the remaining of this test since encode produces now 10 instead of 2 encodings when enocde_target=False (all combinations of entities). I know, it would be better to decouple the test and do not call encode here at all, but I had not yet the energy to define simple encodings manually (which then could be passed to decode, the actual method to test here).

Copy link
Owner Author

@ArneBinder ArneBinder Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same holds for test_encode_target. I created an issue for both cases: #110.

@ChristophAlt ChristophAlt merged commit 2eb6a9e into main Mar 7, 2022
@ChristophAlt ChristophAlt deleted the fix/encode_input_is_training branch April 17, 2022 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants