-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements for working with tiled segmentations #248
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CPBridge
changed the title
TILED_FULL and auto-tiliing
Improvements for working with tiled segmentations
Sep 11, 2023
[like] Schacherer, Daniela reacted to your message:
…________________________________
From: Chris Bridge ***@***.***>
Sent: Monday, September 11, 2023 10:46:21 PM
To: ImagingDataCommons/highdicom ***@***.***>
Cc: Schacherer, Daniela ***@***.***>; Mention ***@***.***>
Subject: [ImagingDataCommons/highdicom] Improvements for working with tiled segmentations (PR #248)
A major set of improvements for working with segmentations of TILED images:
* Add support for passing a total pixel matrix of a segmentation to the constructor. Previously, the user had to handle the process of creating the tiles from a total pixel matrix themselves as well as creating the plane positions that go along with this. This process is tedious, error-prone, and computationally inefficient especially with very large arrays. This PR add a tile_input_array boolean parameter. If it is True, the user passes a single frame to the constructor representing the full total pixel matrix, and highdicom performs the tiling automatically. The user may specify a tile_size, otherwise it will be copied from the source image. This process avoids unnecessarily reshaping the input array and should therefore also be considerably more efficient than reshaping to a stack of tiled frames before compressing/storing.
* Add support for the TILED_FULL dimension organization type upon construction. The user can specify the dimension_organization_type as TILED_FULL and, if the requirements are met for this to be possible, the segmentation will be stored in this way. Note that I anticipate this primarily being used with the aforementioned tile_input_array option, though that is not a requirement. The previous behaviour was to use TILED_SPARSE (although this was not actually stored in the dataset) and store the frames in an order that was different to that required for TILED_FULL. Given comments in @hackermd<https://github.com/hackermd> 's existing code, I believe this was a mistake and that the intention was that frames should always have been stored in an order that matches TILED_FULL. Therefore I chose to change the order of frames for all segmentations to match that required by TILED_FULL regardless of whether the segmentation is actually stored as TILED_FULL. I suppose you could consider this a backwards incompatible change, although hopefully no-one was relying on the frame ordering in a non TILED_FULL. Also note that a clarification of the standard regarding where segment number should live in the TILED_FULL frame order has been prepared by @dclunie<https://github.com/dclunie> and I have followed this convention (segment number varies more slowly than row and column position).
* Add a new method get_total_pixel_matrix to the segmentation object. This returns the segmentation array of a segmentation object as the total pixel matrix rather than a stack of frames. It works for both TILED_FULL and TILED_SPARSE (or unspecified dimension organization type). Furthermore, the user can specify a rectangular sub-region of the total pixel matrix rather than the full array. Otherwise, behaviour is similar to the existing methods for accessing the segmentation mask (the user may specify segments of interest and use the combine_segments option to have a labelmap style array returned to them). I implemented this by generalizing the underlying efficient segmentation "reconstruction" routine so most of the code is shared with the implementation of the existing methods.
* In order to achieve the above there were several changes to the functions in the utils module. Previously the compute_plane_position_slide_per_frame function returned a list of PlanePositionObjects but for the tasks above it is inefficient to do this as we just want the underlying values stored in the objects. Therefore that method has been refactored to call another public function (iter_tiled_full_frame_data) that just yields the underlying data (plane positions and x/y/z offsets).
* Documentation and tests for all of the above, including new sections in the Segmentation page of the user guide.
Note that currently this PR contains all of the contents of #245<#245> as I branched from there.
@fedorov<https://github.com/fedorov> @dclunie<https://github.com/dclunie> @DanielaSchacherer<https://github.com/DanielaSchacherer> @curtislisle<https://github.com/curtislisle>
________________________________
You can view, comment on, or merge this pull request online at:
#248
Commit Summary
* 378c740<378c740> Add multithreading option to segmentation creation
* be6963b<be6963b> Remove array copy, seems to be unnecessary
* 808af48<808af48> Add test
* 9934170<9934170> Add section to docs on multiprocessing
* 08a82ba<08a82ba> Retrieve one segment array at a time: simpler and slightly faster
* 47286dd<47286dd> Fix docstring for _get_segment_array
* 22e9024<22e9024> add ability for user to pass their own process pool
* 2aee263<2aee263> Add basic tiled_full option
* 29e45bf<29e45bf> Fixes
* 74d5e83<74d5e83> Switch ordering of tiles to actually match TILED_FULL
* cde407d<cde407d> Refactor utils code into iter_tiled_full_frame_data
* 8f71961<8f71961> Fix types
* 7bbdc26<7bbdc26> Fix tests relating to order of dimension indices
* b44ff78<b44ff78> Basic implementation of autotiling. Need to add omitting frame logic
* 33f635e<33f635e> Add frame omission logic and tests
* f15aa1a<f15aa1a> Work in progress implementation of constructing tiled outputs
* 6109eed<6109eed> Working implementation of tiled outputs, need tests
* c50614f<c50614f> Add tests, fix bug with padded tiles
* 402371a<402371a> Add doc sections on total pixel matrices for seg creation and parsing
* f1a284b<f1a284b> Add tiled full test image, links in docs
File Changes
(10 files<https://github.com/ImagingDataCommons/highdicom/pull/248/files>)
* M data/test_files/seg_image_sm_control.dcm<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-9bbb3fa19250ffc5249ae4a52711457ccf3a185b20cb274b03f5e4399ef08304> (0)
* A data/test_files/seg_image_sm_dots_tiled_full.dcm<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-0c837ade48010958f211304a36370bb966dd676e9263a42c513d87415f6dd641> (0)
* M docs/seg.rst<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-666b1f941dd7c1ef6f0ca73477d5beae5b4db4bb60b2522168d3f76e47180c6d> (235)
* M src/highdicom/content.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-0fecc892b84534f041a43d6edf3d85b0fe9521d6dd8d2369c26f67177e2cc4ec> (39)
* M src/highdicom/seg/content.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-bcc4271467fbf497ab022cf6ef97362c53369c31d4740a666d4d8b2c5db424be> (28)
* M src/highdicom/seg/sop.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-0c88a539cb61a359ba6bc714ff7222138ffb09ba6cbb3474f732dc01299eee73> (1608)
* M src/highdicom/utils.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-3f0d84cb9cd39915f25355d2093ec6f90691872a3fcb3f228147e8581ad65c2c> (382)
* M tests/test_ann.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-eb9fe7ab9a93c7126ea75284b0ee4bd18f9877eaa6839cb87b7bd0c58ae710ca> (5)
* M tests/test_seg.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-90f5f84d2267e72c000d904b0da496442ab565e8a6d0d02574dde73d59598782> (275)
* M tests/test_utils.py<https://github.com/ImagingDataCommons/highdicom/pull/248/files#diff-33c13e0b177bacd2f02e29bcb8aea5b49e7ce34901fd8f41fefb65defba1bd33> (66)
Patch Links:
* https://github.com/ImagingDataCommons/highdicom/pull/248.patch
* https://github.com/ImagingDataCommons/highdicom/pull/248.diff
—
Reply to this email directly, view it on GitHub<#248>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ACVDZQORSA7CTDIBZGXU22LXZ6ID3ANCNFSM6AAAAAA4EU3RUY>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
A major set of improvements for working with segmentations of TILED images:
tile_input_array
boolean parameter. If it is True, the user passes a single frame to the constructor representing the full total pixel matrix, and highdicom performs the tiling automatically. The user may specify atile_size
, otherwise it will be copied from the source image. This process avoids unnecessarily reshaping the input array and should therefore also be considerably more efficient than reshaping to a stack of tiled frames before compressing/storing.TILED_FULL
dimension organization type upon construction. The user can specify thedimension_organization_type
asTILED_FULL
and, if the requirements are met for this to be possible, the segmentation will be stored in this way. Note that I anticipate this primarily being used with the aforementionedtile_input_array
option, though that is not a requirement. The previous behaviour was to useTILED_SPARSE
(although this was not actually stored in the dataset) and store the frames in an order that was different to that required forTILED_FULL
. Given comments in @hackermd 's existing code, I believe this was a mistake and that the intention was that frames should always have been stored in an order that matchesTILED_FULL
. Therefore I chose to change the order of frames for all segmentations to match that required byTILED_FULL
regardless of whether the segmentation is actually stored asTILED_FULL
. I suppose you could consider this a backwards incompatible change, although hopefully no-one was relying on the frame ordering in a nonTILED_FULL
. Also note that a clarification of the standard regarding where segment number should live in theTILED_FULL
frame order has been prepared by @dclunie and I have followed this convention (segment number varies more slowly than row and column position).get_total_pixel_matrix
to the segmentation object. This returns the segmentation array of a segmentation object as the total pixel matrix rather than a stack of frames. It works for both TILED_FULL and TILED_SPARSE (or unspecified dimension organization type). Furthermore, the user can specify a rectangular sub-region of the total pixel matrix rather than the full array. Otherwise, behaviour is similar to the existing methods for accessing the segmentation mask (the user may specify segments of interest and use thecombine_segments
option to have a labelmap style array returned to them). I implemented this by generalizing the underlying efficient segmentation "reconstruction" routine so most of the code is shared with the implementation of the existing methods.utils
module. Previously thecompute_plane_position_slide_per_frame
function returned a list ofPlanePositionObjects
but for the tasks above it is inefficient to do this as we just want the underlying values stored in the objects. Therefore that method has been refactored to call another public function (iter_tiled_full_frame_data
) that just yields the underlying data (plane positions and x/y/z offsets).Note that currently this PR contains all of the contents of #245 as I branched from there.
@fedorov @dclunie @DanielaSchacherer @curtislisle