Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vb/document entity annotation al 5106 #987

Merged
merged 8 commits into from
Mar 14, 2023

Conversation

vbrodsky
Copy link
Contributor

Story: https://labelbox.atlassian.net/browse/AL-5106
Adding support for DocumentEntity proper annotation type

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch 2 times, most recently from 482fa99 to 8200151 Compare March 12, 2023 22:03
@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch from 6d2f6cc to 5a2c7ca Compare March 12, 2023 23:15
labelbox/data/annotation_types/ner/document_entity.py Outdated Show resolved Hide resolved
tests/integration/annotation_import/conftest.py Outdated Show resolved Hide resolved
@pytest.fixture
def configured_project_pdf_entity(client, ontology, rand_gen,
dataset_pdf_entity):
project = client.create_project(name=rand_gen(str),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we reuse configured_project_without_data_rows fixture here? Also, we should prefer QueueMode.Batch over QueueMode.Dataset, QueueMode.Dataset is an older paradigm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will look into both

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done

@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch 4 times, most recently from e5279df to cef16d0 Compare March 13, 2023 21:09
tests/integration/conftest.py Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that we're getting such a large diff. Maybe some of our commits don't have databooks running, perhaps we should add databooks as a CI job too. No action needed for now, I've made a note for myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also maybe the reason is that I ran my notebook via a jupyter server this time, and not VSC? Perhaps we should all standardize our notebook runtime

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thats a good idea, I've noticed using jupyter creates huge diffs compared to vscode

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may also be because someone working on this notebook did not have the pre-commit hooks installed.

Copy link
Contributor Author

@vbrodsky vbrodsky Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally possible as Andrea has updated this notebook

labelbox/data/annotation_types/ner/document_entity.py Outdated Show resolved Hide resolved
@whistler
Copy link
Contributor

Should we also have NDJSON serialization tests similar to other examples under tests/data/serialization/ndjson?

@vbrodsky
Copy link
Contributor Author

Should we also have NDJSON serialization tests similar to other examples under tests/data/serialization/ndjson?

good idea

def to_common(self) -> DocumentEntity:
return TextEntity(name=self.name, text_selections=self.text_selections)

return obj.from_common(annotation.value, subclasses, annotation.name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized that there are two returns here, would the second return ever be called?

text_selections: List[DocumentTextSelection]

def to_common(self) -> DocumentEntity:
return TextEntity(name=self.name, text_selections=self.text_selections)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should TextEntity be DocumentEntity since that's the return type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thats a good idea, I've noticed using jupyter creates huge diffs compared to vscode

@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch 2 times, most recently from b031fce to 3ed0e4e Compare March 14, 2023 19:07
@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch from 3ed0e4e to 59b060c Compare March 14, 2023 19:11
@vbrodsky vbrodsky force-pushed the VB/document-entity-annotation_AL-5106 branch from 59b060c to 0b4590f Compare March 14, 2023 19:36
Copy link
Contributor

@whistler whistler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks for the updates!

@vbrodsky vbrodsky merged commit 3be870d into develop Mar 14, 2023
@vbrodsky vbrodsky deleted the VB/document-entity-annotation_AL-5106 branch March 14, 2023 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants