Simplify blocked reprojection implementation by using dask and improve efficiency of parallel reprojection #314
Merged
+148
−217
22 commits
7e2faee  Started refactoring reproject_blocked to use dask (astrofrog)
37ca966  Tweaks (astrofrog)
ad1ee0d  Added dask[array] as a reproject dependency (astrofrog)
1465af4  Increase small block size to speed up test, and remove test which is … (astrofrog)
bc7dda7  Fix tests (astrofrog)
6cb31fd  Remove code to determine chunk size, instead leave this up to dask (astrofrog)
a7b2067  Fix more issues (astrofrog)
bdb7d0b  Work around issue with da.store() for now (astrofrog)
60fbe20  Added note (astrofrog)
f5a884d  Simplify logic (astrofrog)
07d9806  Bump minimum required version of numpy to follow NEP 29, bump minimum… (astrofrog)
7d3492b  Added cloudpickle as a dependency (astrofrog)
1f5efba  Fixed syntax in setup.cfg (astrofrog)
de94742  Fix blocked reprojection with extra broadcast dimensions (astrofrog)
9513323  Expand test suite and fix all tests (astrofrog)
d40768c  Recognize number of workers (astrofrog)
669741e  Pass input array to processes via memmap (astrofrog)
76f4d7d  Clean up code (astrofrog)
8704d77  Add zarr and fsspec to dependencies (astrofrog)
13cc3d2  Make reproject_blocked private (astrofrog)
8615e48  Fix codestyle (astrofrog)
ad5c932  Fix typo and remove unused fixture (astrofrog)
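For readers unfamiliar with the technique named in the commit "Pass input array to processes via memmap", here is a rough sketch of the general idea. It assumes nothing about how this PR actually implements it: `block_mean` and the file name are hypothetical, and the calls are made serially here rather than through a process pool.

```python
import os
import tempfile

import numpy as np


def block_mean(filename, dtype, shape, rows):
    """Hypothetical worker: map the shared file read-only and reduce some rows."""
    # The worker opens the file itself, so only the filename, dtype and shape
    # have to be sent to it, not the array data.
    data = np.memmap(filename, dtype=dtype, mode="r", shape=shape)
    return float(data[rows].mean())


with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "input.np")
    array = np.arange(12, dtype=float).reshape(4, 3)

    # Write the input array to disk once
    shared = np.memmap(filename, dtype=array.dtype, mode="w+", shape=array.shape)
    shared[:] = array
    shared.flush()
    del shared

    # Called serially here; the same calls could be submitted to worker processes
    means = [
        block_mean(filename, array.dtype, array.shape, slice(start, start + 2))
        for start in (0, 2)
    ]

print(means)
```

The point of the pattern is that each worker receives a small description of the data (path, dtype, shape) instead of a pickled copy of the whole array.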
@@ -718,9 +718,11 @@ def test_blocked_broadcast_reprojection(input_extra_dims, output_shape, parallel


 @pytest.mark.parametrize("parallel", [True, 2, False])
-@pytest.mark.parametrize("block_size", [[40, 40], [500, 500], [500, 100], None])
+@pytest.mark.parametrize("block_size", [[500, 500], [500, 100], None])
+@pytest.mark.parametrize("return_footprint", [False, True])
+@pytest.mark.parametrize("existing_outputs", [False, True])
 @pytest.mark.remote_data
-def test_blocked_against_single(parallel, block_size):
+def test_blocked_against_single(parallel, block_size, return_footprint, existing_outputs):
     # Ensure when we break a reprojection down into multiple discrete blocks
     # it has the same result as if all pixels where reprejcted at once

@@ -729,6 +731,19 @@ def test_blocked_against_single(parallel, block_size):
     array_test = None
     footprint_test = None

+    shape_out = (720, 721)
+
+    if existing_outputs:
+        output_array_test = np.zeros(shape_out)
+        output_footprint_test = np.zeros(shape_out)
+        output_array_reference = np.zeros(shape_out)
+        output_footprint_reference = np.zeros(shape_out)
+    else:
+        output_array_test = None
+        output_footprint_test = None
+        output_array_reference = None
+        output_footprint_reference = None
+
     # the warning import and ignore is needed to keep pytest happy when running with
     # older versions of astropy which don't have this fix:
     # https://github.com/astropy/astropy/pull/12844
@@ -738,72 +753,40 @@ def test_blocked_against_single(parallel, block_size):

     with warnings.catch_warnings():
         warnings.simplefilter("ignore", category=FITSFixedWarning)
-
-        # this one is needed to avoid the following warning from when the np.as_strided() is
-        # called in wcs_utils.unbroadcast(), only shows up with py3.8, numpy1.17, astropy 4.0.*:
-        # DeprecationWarning: Numpy has detected that you (may be) writing to an array with
-        # overlapping memory from np.broadcast_arrays. If this is intentional
-        # set the WRITEABLE flag True or make a copy immediately before writing.
-        # We do call as_strided with writeable=True as it recommends and only shows up with the 10px
-        # testcase so assuming a numpy bug in the detection code which was fixed in later version.
-        # The pixel values all still match in the end, only shows up due to pytest clearing
-        # the standard python warning filters by default and failing as the warnings are now
-        # treated as the exceptions they're implemented on
-        if block_size == [10, 10]:
-            warnings.simplefilter("ignore", category=DeprecationWarning)
-
-        array_test, footprint_test = reproject_interp(
-            hdu2, hdu1.header, parallel=parallel, block_size=block_size
+        result_test = reproject_interp(
+            hdu2,
+            hdu1.header,
+            parallel=parallel,
+            block_size=block_size,
+            return_footprint=return_footprint,
+            output_array=output_array_test,
+            output_footprint=output_footprint_test,
         )

-        array_reference, footprint_reference = reproject_interp(
-            hdu2, hdu1.header, parallel=False, block_size=None
+        result_reference = reproject_interp(
+            hdu2,
+            hdu1.header,
+            parallel=False,
+            block_size=None,
+            return_footprint=return_footprint,
+            output_array=output_array_reference,
+            output_footprint=output_footprint_reference,
         )

-    np.testing.assert_allclose(array_test, array_reference, equal_nan=True)
-    np.testing.assert_allclose(footprint_test, footprint_reference, equal_nan=True)
-
-
-@pytest.mark.remote_data
-def test_blocked_corner_cases():
[Review comment on the line above] This test is no longer relevant if we don't try and set the chunk size ourselves.
-    """
-    When doing blocked there are a few checks designed to sanity clamp/preserve
-    values. Even though the blocking process only tiles in a 2d manner 3d information
-    about the image needs to be preserved and transformed correctly. Additonally
-    when automatically determining block size based on CPU cores zeros can appear on
-    machines where num_cores > x or y dim of output image. So make sure it correctly
-    functions when 0 block size goes in
-    """
-
-    # Read in the input cube
-    hdu_in = fits.open(get_pkg_data_filename("data/equatorial_3d.fits", package="reproject.tests"))[
-        0
-    ]
-
-    # Define the output header - this should be the same for all versions of
-    # this test to make sure we can use a single reference file.
-    header_out = hdu_in.header.copy()
-    header_out["NAXIS1"] = 10
-    header_out["NAXIS2"] = 9
-    header_out["CTYPE1"] = "GLON-SIN"
-    header_out["CTYPE2"] = "GLAT-SIN"
-    header_out["CRVAL1"] = 163.16724
-    header_out["CRVAL2"] = -15.777405
-    header_out["CRPIX1"] = 6
-    header_out["CRPIX2"] = 5
-
-    array_reference = reproject_interp(hdu_in, header_out, return_footprint=False)
-
-    array_test = None
-
-    # same reason as test above for FITSFixedWarning
-    import warnings
-
-    with warnings.catch_warnings():
-        warnings.simplefilter("ignore", category=FITSFixedWarning)
+    if return_footprint:
+        array_test, footprint_test = result_test
+        array_reference, footprint_reference = result_reference
+    else:
+        array_test = result_test
+        array_reference = result_reference

-        array_test = reproject_interp(
-            hdu_in, header_out, parallel=True, block_size=[0, 4], return_footprint=False
-        )
+    if existing_outputs:
+        assert array_test is output_array_test
+        assert array_reference is output_array_reference
+        if return_footprint:
+            assert footprint_test is output_footprint_test
+            assert footprint_reference is output_footprint_reference

-    np.testing.assert_allclose(array_test, array_reference, equal_nan=True, verbose=True)
+    np.testing.assert_allclose(array_test, array_reference, equal_nan=True)
+    if return_footprint:
+        np.testing.assert_allclose(footprint_test, footprint_reference, equal_nan=True)
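To make the intent of the updated test easier to follow, here is a small self-contained sketch of the same three checks: blocked and single-pass results agree, NaNs compare equal, and a pre-allocated output is returned as the same object. `toy_reproject` is a hypothetical per-pixel stand-in and not `reproject_interp`; only the keyword names are borrowed from the test.

```python
import numpy as np


def toy_reproject(array, block_size=None, return_footprint=True,
                  output_array=None, output_footprint=None):
    """Toy per-pixel operation standing in for a reprojection (hypothetical)."""
    if output_array is None:
        output_array = np.empty(array.shape)
    if output_footprint is None:
        output_footprint = np.empty(array.shape)
    ny, nx = array.shape
    by, bx = (ny, nx) if block_size is None else block_size
    # Fill the outputs one block at a time; edge blocks may be smaller
    for y0 in range(0, ny, by):
        for x0 in range(0, nx, bx):
            block = array[y0:y0 + by, x0:x0 + bx]
            output_array[y0:y0 + by, x0:x0 + bx] = np.sqrt(block)
            output_footprint[y0:y0 + by, x0:x0 + bx] = np.isfinite(block)
    if return_footprint:
        return output_array, output_footprint
    return output_array


rng = np.random.default_rng(0)
data = rng.random((72, 73))
data[5, 7] = np.nan  # a blank pixel, to exercise equal_nan=True

out_test = np.zeros(data.shape)
fp_test = np.zeros(data.shape)
array_test, footprint_test = toy_reproject(
    data, block_size=(50, 10), output_array=out_test, output_footprint=fp_test
)
array_reference, footprint_reference = toy_reproject(data)

# Existing outputs must be filled in place and handed back unchanged in identity
assert array_test is out_test
assert footprint_test is fp_test
np.testing.assert_allclose(array_test, array_reference, equal_nan=True)
np.testing.assert_allclose(footprint_test, footprint_reference, equal_nan=True)
```

The identity assertions matter because a blocked implementation could easily return a fresh array with the right values while leaving the caller's buffer untouched.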
I've removed this to instead let dask decide how to chunk the array, though we might want to provide a keyword argument that specifies the typical number of elements in a chunk.
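As a rough illustration of what such a keyword argument might do, the sketch below turns a target number of elements per chunk into a block shape. `block_shape_for` and `count_blocks` are hypothetical names, and the near-square heuristic is an assumption; this is neither dask's chunking algorithm nor code from this PR.

```python
import math


def block_shape_for(shape, target_elements):
    """Hypothetical helper: block shape with roughly `target_elements` elements."""
    if target_elements < 1:
        raise ValueError("target_elements must be at least 1")
    ndim = len(shape)
    # Start from a hyper-cube holding about the requested number of elements
    side = max(1, round(target_elements ** (1 / ndim)))
    # Clamp each side to the array extent, and never let a dimension reach zero
    return tuple(max(1, min(side, n)) for n in shape)


def count_blocks(shape, block_shape):
    """Number of blocks needed to tile `shape`, counting partial edge blocks."""
    return math.prod(math.ceil(n / b) for n, b in zip(shape, block_shape))


block_shape = block_shape_for((720, 721), 250_000)
print(block_shape, count_blocks((720, 721), block_shape))
```

Clamping to a minimum of 1 sidesteps the zero-sized blocks that the removed corner-case test was guarding against.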