-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Region cropping problem #27
Conversation
ocrd_anybaseocr/ocrd-tool.json
Outdated
@@ -76,7 +76,7 @@ | |||
"output_file_grp": ["OCR-D-IMG-DEWARP"], | |||
"parameters": { | |||
"imgresize": { "type": "string", "default": "resize_and_crop", "description": "run on original size image"}, | |||
"pix2pixHD": { "type": "string", "default":"/home/ahmed/project/pix2pixHD", "description": "Path to pix2pixHD library"}, | |||
"pix2pixHD": { "type": "string", "default":"/home/jenckel/pix2pixHD/pix2pixHD", "description": "Path to pix2pixHD library"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a good default. Please provide models via setup package_data
or subrepo and a makefile rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really good practice, to link multiple 100 MB models in the setup?
I'd prefer to provide a link to the sources in the README and output a warning/error if they are missing at runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really good practice, to link multiple 100 MB models in the setup?
No, of course not. But a subrepo for that (which you can link to with relative paths in your README and tool json default etc) is good practise.
I'd prefer to provide a link to the sources in the README and output a warning/error if they are missing at runtime
Yes, that's fine. (This comment was about the absolute pathname which appeared in the original version)
@@ -136,10 +143,6 @@ def _process_segment(self,page_image, page, page_xywh, page_id, input_file, n): | |||
file_grp=self.image_grp | |||
) | |||
page.add_AlternativeImage(AlternativeImageType(filename=file_path, comments=page_xywh['features'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this processor do? Provide a derived image of the page with colors/frames for segments? That's not what segmentation in a PAGE-XML annotation workflow looks like. You need to add regions (and remove existing regions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have opened #31 to address this
points = region.Coords.get_points() | ||
points = points.split(" ") | ||
|
||
x_min = min(int(points[0].split(",")[0]), int(points[1].split(",")[0]), int(points[2].split(",")[0]), int(points[3].split(",")[0])) | ||
x_max = max(int(points[0].split(",")[0]), int(points[1].split(",")[0]), int(points[2].split(",")[0]), int(points[3].split(",")[0])) | ||
y_min = min(int(points[0].split(",")[1]), int(points[1].split(",")[1]), int(points[2].split(",")[1]), int(points[3].split(",")[1])) | ||
y_max = max(int(points[0].split(",")[1]), int(points[1].split(",")[1]), int(points[2].split(",")[1]), int(points[3].split(",")[1])) | ||
|
||
if x_max>page_image.size[0]: | ||
x_max = page_image.size[0]-1 | ||
if y_max>page_image.size[1]: | ||
y_max = page_image.size[1]-1 | ||
|
||
img__ = page_image.crop((x_min,y_min,x_max,y_max)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is totally inadequate. What if the region already has alternative images? What if it has @orientation
(from deskewing)?
The region_xywh
you are passing to _process_segment
along with img__
does not match. It will give wrong coordinates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What made you bypass image_from_segment
's region_image
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem was exactly that it already had an alternative image
we tried to "merge" the following two pipelines:
raw image --> binarization --> deskew --> cropping --> tiseg ---> textline
with
raw image --> block segmentation --(region+alternative image)--> textline
per default block segmentation adds an alternative image for each extracted region which means textline will take the alternative image from the raw image over applying the extracted regions to the binarized/deskewed etc. image (both of which textline needs)
We are aware that the pipeline was not supposed to have multiple branches, but with the dependencies of our methods there is no way for a single straight forward pipeline, so we tried to use the API as best as possible.
we ended up with three choices:
either uncouple the pipelines and perform: raw image --> block segmentation --> binarization --> deskew --> textline
independently, apply the regions manually (which we tried to do) or not add an alternative image during block segmentation (which may be better?)
What do you think would be the best option? (We tried to explain the problem before, and I hope its understandable now, but communicating the problem turned out to be difficult.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per default block segmentation adds an alternative image for each extracted region which means textline will take the alternative image from the raw image over applying the extracted regions to the binarized/deskewed etc. image (both of which textline needs)
This is not a valid sentence in English, and I don't understand it.
We are aware that the pipeline was not supposed to have multiple branches, but with the dependencies of our methods there is no way for a single straight forward pipeline, so we tried to use the API as best as possible.
What the API can give you are filters and selector on operations already performed earlier in the workflow. It cannot provide the workflow itself for a processor! (That's up to the user to decide. But of course, you can document what a good setup is for your tools.)
However, the processor can constrain its input images, so it does not pick up the wrong annotations when there is a freedom of choice, and stops execution when put in the wrong (place of a) pipeline.
So I don't see a problem with the above workflow. Just that your textline segmentation needs 2 input file groups then (to merge their regions I guess).
You could even be very restrictive to prevent other workflows:
- binarization: filter
binarized,deskewed,cropped
page images - deskewing: select
binarized
, filterdeskewed,cropped
page images - cropping: select
binarized,deskewed
, filtercropped
page images - tiseg: select
binarized,deskewed,cropped
page images - block segmentation: filter
binarized,deskewed,cropped
page images - textline: select
binarized,deskewed,cropped
region images on input file group 1, filterbinarized,deskewed,cropped
region images on input file group 2
But you don't need to do this (and I would recommend against it, except where an algorithm just cannot handle the input otherwise).
We tried to explain the problem before, and I hope its understandable now, but communicating the problem turned out to be difficult.
You can always turn to the Gitter Lobby for general questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
region_xywh
you are passing to_process_segment
along withimg__
does not match. It will give wrong coordinates.
What made you bypass
image_from_segment
'sregion_image
?
This has been resolved by 13d855e (although the wrong method still remains as commented code).
But the rest of the discussion was about filters/selectors. You took my advice to the extreme in 0ac8567 – hence my comment below
Co-Authored-By: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Co-Authored-By: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Co-Authored-By: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Co-Authored-By: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Co-Authored-By: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Regarding the CI failures, I believe you should try to update/renew your CircleCI build cache. Also adding some
|
…/ocrd_anybaseocr into region_cropping_problem reviewed changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the problem when no border
is annotated, I still recommend against making your processors that strict (maximally filtering/selecting image features). This should mostly be up to workflow configuration (for which you could give hints and recommendations in the README and tool json descriptions).
Is it save to merge a PR for which the continuous integration tests failed? Is it save for users to update their working installation? |
problem was related to circleci not finding tensorflow 2.0 (as already mentioned by @bertsky) |
There are so many unresolved comments above – it's not clear what additional commits address which problems. But I believe quite a few errors remain. @khurramHashmi You make it extremely hard to collaborate. |
This branch is ready to merge with master branch. Please test first and then merge.