✨ Added functionality for taking a screenshot of the original/raw page prior to tagging. Added functionality for combining the OCR annotations of the original/raw page and the tagged page. #95
…r to tagging. Added functionality for combining the OCR annotations of the original/raw page and the tagged page.
…to be compatible with supertype ITarsier.
tarsier/core.py
Outdated
@@ -77,3 +98,15 @@ async def _remove_tags(adapter: BrowserAdapter) -> None:
        script = "return window.removeTags();"

        await adapter.run_js(script)

    @staticmethod
    async def _hide_non_tag_elem(adapter: BrowserAdapter) -> None:
We should probably use the same name as the TS side here (and do the same for all similar methods).
-    async def _hide_non_tag_elem(adapter: BrowserAdapter) -> None:
+    async def _hide_non_tag_elements(adapter: BrowserAdapter) -> None:
tarsier/core.py
Outdated
    ) -> Tuple[bytes, bytes, Dict[int, str]]:
        adapter = adapter_factory(driver)
        initial_screenshot = await self._take_screenshot(adapter)
        tag_to_xpath = (
            await self._tag_page(adapter, tag_text_elements) if not tagless else {}
        )
        screenshot = await self._take_screenshot(adapter)
        await self._hide_non_tag_elem(adapter)
        tagged_screenshot = await self._take_screenshot(adapter)
        await self._revert_visibilities(adapter)
        if not tagless:
            await self._remove_tags(adapter)
        return screenshot, tag_to_xpath if not tagless else {}
        return (
            initial_screenshot,
            tagged_screenshot,
            tag_to_xpath if not tagless else {},
        )
Let's keep this method the same and just move the new logic over to page_to_text. Anyone using page_to_image currently expects a single image with all of the tagged elements (alongside the page text). This way we also avoid changing the function signature.
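A rough sketch of the shape this suggests, not the actual Tarsier code: `FakeAdapter`, `TarsierSketch`, `take_screenshot` and `_fake_ocr` are made-up stand-ins so the example runs without a browser or an OCR service, and whether page_to_text should call page_to_image at all is an assumption of the sketch. The point is only that page_to_image keeps returning one tagged image plus the tag mapping, while the extra raw capture happens in page_to_text.

```python
import asyncio
from typing import Dict, List, Tuple


class FakeAdapter:
    """Hypothetical stand-in for BrowserAdapter; records which screenshots were taken."""

    def __init__(self) -> None:
        self.tagged = False
        self.shots: List[str] = []

    async def take_screenshot(self) -> bytes:
        label = "tagged" if self.tagged else "raw"
        self.shots.append(label)
        return label.encode()


class TarsierSketch:
    async def _tag_page(self, adapter: FakeAdapter) -> Dict[int, str]:
        adapter.tagged = True
        return {0: "//html/body/a"}  # made-up xpath, for illustration only

    async def _remove_tags(self, adapter: FakeAdapter) -> None:
        adapter.tagged = False

    async def page_to_image(
        self, adapter: FakeAdapter, tagless: bool = False
    ) -> Tuple[bytes, Dict[int, str]]:
        # Signature and return shape stay as callers expect:
        # a single (tagged) image plus the tag-to-xpath mapping.
        tag_to_xpath = {} if tagless else await self._tag_page(adapter)
        screenshot = await adapter.take_screenshot()
        if not tagless:
            await self._remove_tags(adapter)
        return screenshot, tag_to_xpath

    @staticmethod
    def _fake_ocr(raw: bytes, tagged: bytes) -> str:
        # Placeholder for running OCR on both images and merging the results.
        return f"{raw.decode()}|{tagged.decode()}"

    async def page_to_text(self, adapter: FakeAdapter) -> Tuple[str, Dict[int, str]]:
        # The raw capture is taken here, before tagging, so page_to_image
        # does not need a new return value.
        raw = await adapter.take_screenshot()
        tagged, tag_to_xpath = await self.page_to_image(adapter)
        return self._fake_ocr(raw, tagged), tag_to_xpath
```

With this split, existing page_to_image callers see no change, and only page_to_text knows that two screenshots exist.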
tarsier/core.py
Outdated
        combined_annotations: ImageAnnotatorResponse = {
            "words": untagged_ocr_annotations["words"] + tagged_ocr_annotations["words"]
        }
        combined_annotations["words"] = list(
            sorted(
                combined_annotations["words"],
                key=lambda x: (
                    x["midpoint_normalized"][1],
                    x["midpoint_normalized"][0],
                ),
            )
        )
We should probably add a combine_annotations() method that handles this re-sorting, so that this function does not have to care whether the words are sorted.
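Something along these lines could work as a starting point. It is only a sketch: the `WordAnnotation` type and its `text` field are assumptions made to keep the example self-contained, and it assumes `midpoint_normalized` is an `(x, y)` pair, which is what the sort key in the diff suggests (index 1 first, then index 0).

```python
from typing import List, Tuple, TypedDict


class WordAnnotation(TypedDict):
    # Hypothetical minimal shape; the real annotation type may carry more fields.
    text: str
    midpoint_normalized: Tuple[float, float]  # assumed to be (x, y)


class ImageAnnotatorResponse(TypedDict):
    words: List[WordAnnotation]


def combine_annotations(*responses: ImageAnnotatorResponse) -> ImageAnnotatorResponse:
    """Merge OCR responses and order words top to bottom, then left to right."""
    # Build a new list so the input responses are left untouched.
    merged = [word for response in responses for word in response["words"]]
    merged.sort(
        key=lambda word: (
            word["midpoint_normalized"][1],  # y first
            word["midpoint_normalized"][0],  # then x
        )
    )
    return {"words": merged}
```

The caller would then reduce to a single line such as `combined_annotations = combine_annotations(untagged_ocr_annotations, tagged_ocr_annotations)`, and the ordering rule lives in one place.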
…ase tagging from page_to_image() to page_to_text(), added method combine_annotations() to decouple the sorting logic from page_to_text().
No description provided.