Add support for blocked and parallel reprojection in reproject_interp
#214
Conversation
Thanks for working on this! I've been taking a look at it and thinking about what would make sense from the point of view of the user API - in particular, I think we could simplify things a little if we don't expect this function to be called by the user directly: if the user instead sets e.g. the relevant keyword on the main reproject function, we would check for it there and dispatch to the blocked implementation internally.
Scaffold seamless wrapper insertion in reproj_interpolate as per pull request astropy#214
I'm happy to look at it; just making sure I understand correctly, you mean doing something like this:
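Roughly, a seamless wrapper along these lines (a sketch only - the helper names _reproject_full and _reproject_blocked are hypothetical, not the actual internals):

def _reproject_full(input_data, output_projection, shape_out=None, **kwargs):
    ...  # existing single-shot reprojection

def _reproject_blocked(input_data, output_projection, shape_out=None,
                       block_size=None, parallel=False, **kwargs):
    ...  # blocked/parallel reprojection

def reproject_interp(input_data, output_projection, shape_out=None,
                     block_size=None, parallel=False, **kwargs):
    # The public function checks the new keywords and dispatches internally,
    # so the user never calls the blocked helper directly.
    if block_size is not None or parallel is not False:
        return _reproject_blocked(input_data, output_projection,
                                  shape_out=shape_out, block_size=block_size,
                                  parallel=parallel, **kwargs)
    return _reproject_full(input_data, output_projection,
                           shape_out=shape_out, **kwargs)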
Yes exactly!
Okay, that let me clean up the blocked_reproject function, but there are still some issues where the interfaces for the core functions don't quite match up - for instance, both adaptive and interp take arguments that the other methods don't. I've just done it for reproject_interp so far, and it could be done in the same way for adaptive and exact, but then there would still be the issue of the functions not quite matching up. You're probably better equipped than I am to suggest a solution, but I'm happy to implement it.
@astrofrog any thoughts? This has cleaned up the issue a bit, but not entirely (if the functions took **kwargs, maybe it could be done more nicely?). A sketch of that idea follows.
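For what it's worth, a generic blocked driver that forwards **kwargs might look something like this sketch (the function name, signatures, and the WCS slicing are all assumptions for illustration, not the actual reproject internals):

import numpy as np

def _reproject_by_blocks(core_func, array_in, wcs_in, wcs_out,
                         shape_out, block_size, **kwargs):
    # Iterate over blocks of the output space, forwarding method-specific
    # options (e.g. order= for interp/adaptive) via **kwargs so this driver
    # doesn't need to know each core function's exact signature.
    output = np.zeros(shape_out)
    footprint = np.zeros(shape_out)
    for imin in range(0, shape_out[0], block_size[0]):
        for jmin in range(0, shape_out[1], block_size[1]):
            imax = min(imin + block_size[0], shape_out[0])
            jmax = min(jmin + block_size[1], shape_out[1])
            # Slice the output WCS down to just this block; numpy-style
            # slicing of an astropy WCS yields a WCS for the sub-array.
            wcs_block = wcs_out[imin:imax, jmin:jmax]
            arr, fp = core_func(array_in, wcs_in, wcs_block,
                                shape_out=(imax - imin, jmax - jmin),
                                **kwargs)
            output[imin:imax, jmin:jmax] = arr
            footprint[imin:imax, jmin:jmax] = fp
    return output, footprint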
@AlistairSymonds - I'll look at this shortly, I'm very sorry for the delay!
Codecov Report
@@ Coverage Diff @@
## main #214 +/- ##
==========================================
+ Coverage 94.34% 94.69% +0.35%
==========================================
Files 23 23
Lines 725 792 +67
==========================================
+ Hits 684 750 +66
- Misses 41 42 +1
Hey @astrofrog, I've found some free time up my sleeve and a need to reproject a large number of images, so I've gone through and cleaned up some of the issues left dangling, and all the default tests are passing. There are two things remaining, I think:
I've got a few other ideas in the back of my mind that should really help speed up reprojecting large areas of the sky at high resolutions, like culling empty tiles, but it's the kind of thing that can probably wait until after this PR :)
Apologies - it seems I was misunderstanding the Azure Pipelines reports, and they did indeed run and pass once I properly updated my local tree to merge in the latest from this repo. I also only just found the astropy top-level contributing guide; it's highly likely it has plenty of suggested changes too. (Might be worth having a contributing file in this repo that just points there, for dummies like me who aren't perfectly across astropy?)
The CI is a little busted right now. It is unfortunate that this PR has been open for so long. We apologize for any inconvenience caused. Hopefully @astrofrog will come back to reviewing this soon...
I have two really minor comments but this looks great to me. I'm testing it out on some cubes now.
# if parallel is set but block size isn't, we'll choose
# block size so each thread gets one block each
if parallel is not False and block_size is None:
    block_size = [dim // os.cpu_count() for dim in shape_out]
This code block doesn't match the comment above, right? If we have 4 CPUs, say, and a 2D image, block_size will be 1/16th of the image size, so each CPU will get 4 blocks. This isn't necessarily a bad thing, but it's also not obviously the right default.
Ah yes, you're dead right - and it's probably worth doing big long strips that match the FITS memory layout anyway.
proc_pool = None
blocks_futures = []

if parallel or type(parallel) is int:
I think we want an "and" here? Otherwise parallel=0 will try to work with 0 workers.
This might need tightening up. I was imagining parallel=True as meaning "let reproject auto-determine the threading", but as you say, zero would lead to issues.
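Something along these lines might tighten it up (a sketch only - the helper name and exact policy are assumptions, not what the PR implements):

import os

def _workers_from_parallel(parallel):
    # parallel=True -> use every CPU; parallel=N -> exactly N workers;
    # parallel=False -> run serially (signalled here by returning None).
    # Checking `is True` / `is False` first matters, because bools are ints.
    if parallel is True:
        return os.cpu_count()
    if parallel is False:
        return None
    if isinstance(parallel, int):
        if parallel <= 0:
            raise ValueError("parallel must be a positive int, True, or False")
        return parallel
    raise TypeError("parallel must be a bool or an int")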
One more aside - it would be nice to add some sort of progress bar support, but that can be a different issue.
Also, just quickly while there are more people looking at this: I'm gonna go over @keflavich's comments and make some updates if no one says otherwise. Generally, if anyone has feedback on more idiomatic, higher-level Python/astropy ways of doing things, I'll take it - I'm someone who's written a fair bit of Python at various points, but my main wheelhouse is Verilog and HW design, so I've probably missed something obvious.
@AlistairSymonds - this is looking good so far! Please do go ahead and rebase (to hopefully fix the CI) and implement @keflavich's comments.
The only comment I have for now is a general one, which is that I think it would be good to change the terminology from 'block' to 'chunk' to match dask terminology - as I think we might want to add support for dask anyway in reproject in future (for input/output arrays) and it would be confusing to have two different terms.
I have more time available to work now so I will be more responsive here now and would like to make sure we can include this in the next release! 😄
Actually, thinking about this a bit more, maybe keep it as 'block' for now and I can try to add full dask support as a follow-up PR and see if it actually makes sense to use one term over the other? (Dask itself uses both terms.)
Yeah, I've got no strong attachment to any term; there are lots of options that are all equally overloaded - tiles, chunks, blocks, bins. And frick, I've completely beans'd up the rebase: I still had a master branch lying about in my fork instead of main and it's all gone awry. Gonna fix that up, then address the comments for realsies this time instead of just talking about it.
Force-pushed from eb97956 to fa384e4 (Scaffold seamless wrapper insertion in reproj_interpolate as per pull request astropy#214).
Okay, as nice as it would be to have the original 3-year-old pull request be accepted, the git history is turning into a nightmare - when I have time on the weekend I'm going to make a new dev branch and then manually port across. (Is it obvious I've only used Perforce professionally? :P) Quick edit: I think I've fixed most of the mess - the commit log mess should all still be able to be squashed away before merging too, right?
Force-pushed from e7f9e4a to 37b8742 with the following squashed commits:
Quick and dirty implementation of blocked reproject_interp - issue astropy#37
Revert to original reproject source
Blocked wrapper that works with non-Healpix reprojection functions
Broken out reprojection from block generation in prep for multiprocessing
Completed functionality of block reproject wrapper for non-healpix methods
Formatting changes to match original source
Added memory usage instrumentation
Fix memory leak from storing futures in multiprocessing option
Fixed process pool args parsing and switched to dicts
Removed test code from blocked helper func
Scaffold seamless wrapper insertion in reproj_interpolate as per pull request astropy#214
Remove erroneously added testing script
Integrated blocked reprojection in interpolate function
Removed profiling imports from utils
Formatting fixes
Formatting fixes, PEP8 tox edition
Fixes for the blocked to match non-blocked behaviour
Fixes for wcsapi input testcases
Fix WCS slicing axis incorrectly swapped
Add naive tests for blocked and parallel reproj
Codestyle fixes for blocked test
Fix issues blocked reprojection with footprint
Style fixes for return fp blocked fixes
Update blocked corner case test to be more useful
Revert "Squashed commit of the following:" (reverts commit f554ce9)
Revert "Revert "Squashed commit of the following:"" (reverts commit fa384e4)
Manually re-add blocked code to reproj interp
Manually fix up blocked tests
Fix blocked tests to use get_pkg_data function
Fix codestyle issues
Address core comments made by @keflavich
@astrofrog I'd say it's 99% clean; the only issue left seems to come from the test data being used when I run with the oldestdeps config. It seems to be something to do with (un)pickling the WCS: a FITSFixedWarning is raised during unpickling when returning from the multiprocessing call of _block(), which causes pytest to fall apart. The only other thing I've found is the following two pull requests, which came from it being identified as a bug here: astropy/astropy#12834. So indeed the lines causing the warning have been fixed in later astropy versions, and when running the test code outside of pytest we just get warnings and the np assert doesn't fire.
Is this something worth working around, or just leave as-is since a newer version of astropy has fixed it?
Since it has been fixed in astropy already, you could use ...
# https://github.com/astropy/astropy/pull/12844
# All the warning code should be removed when old astropy versions are no longer supported
import warnings
from astropy.wcs import FITSFixedWarning

warnings.simplefilter('ignore', category=FITSFixedWarning)
Does this ignore it for the whole test session? Would a context manager be safer?
I did a quick test by just adding the following function to the bottom of the test file, and it still failed correctly - so no, it doesn't seem to affect the whole pytest session.
def test_fitswarning():
    raise FITSFixedWarning
I assume each test gets its own new invocation of the Python interpreter so it works, but I couldn't find any concrete reference to that implementation detail or whether it should be relied upon.
So I do agree it would be better if a with warnings.catch_warnings() context manager was used, since then it could be used to just wrap the function call with the new changes - I've done this locally and am just running some checks.
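A minimal sketch of that context-manager approach (the helper name is hypothetical; FITSFixedWarning comes from astropy.wcs):

import warnings

from astropy.wcs import FITSFixedWarning

def call_ignoring_fitsfixed(func, *args, **kwargs):
    # Scope the filter to a single call rather than installing a
    # session-wide simplefilter; only FITSFixedWarning is silenced,
    # and the previous filters are restored when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter('ignore', category=FITSFixedWarning)
        return func(*args, **kwargs)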
I think that's everything now? Once this is in, I'm also happy to take a look at doing a similar thing for the adaptive and exact methods too - just tackling one problem at a time :D
Using "\" line continuation is not recommended, I don't think.
If not None, a blocked projection will be performed, where the output space is
reprojected one block at a time; this is useful for memory-limited scenarios
such as dealing with very large arrays or high-resolution output spaces.
parallel : bool or int
This input needs some explanation. Only via code diving do I know what setting it to an int really means.
I'll go over this another time and re-comment - I was trying to emulate the parallel argument already used in the reproject_exact() function, but it's definitely been a while since I originally wrote this and looked at it closely.
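For example, the parameter description could spell the behaviour out roughly like this (suggested wording only, based on the behaviour visible in the diff):

parallel : bool or int, optional
    If True, the output space is divided into blocks and reprojected in
    parallel using all available CPU cores; if an integer, that many worker
    processes are used instead. If False (the default), the reprojection
    runs serially. If block_size is not set, a default block size is
    chosen automatically.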
if parallel is not False and block_size is None:
    block_size = shape_out.copy()
    # each thread gets an equal sized strip of output area to process
    block_size[0] = shape_out[0] // os.cpu_count()
Shouldn't block size be calculated by the number of actual requested cores?
And parallel=1 should equal parallel=False? I don't see that being handled here.
Block size is controllable separately to allow for memory usage tuning - e.g. if you're reprojecting a field tens of degrees wide to a 0.5"/px scale, that just blows up memory usage. This was my original use case for the blocked functionality: doing everything possible to avoid loading the entire input and output arrays into memory at once.
Additionally, there's also some existing code in the mosaicking reproject_and_coadd() function that does some culling at full-input-field granularity. I'm not sure if it's actually robust with all distortions/projections, but ideally a similar sort of empty-block culling could be used here in future too, allowing for more tuning.
One thing I've noticed I missed stating is that the block size is in terms of pixels in the output space; I will add that.
Happy to change the parallel=1 case too.
Part of me thinks parallel=1 should actually use the parallel infrastructure with one worker, rather than being the same as False, as it could be useful for debugging?
That was my original thought too - or for when someone wants the maths to happen in another process so they can do GIL shenanigans in the main thread or something? Not super attached either way, and it's not a big change to do; I've got tomorrow off work so I can address it along with the other stuff @pllim mentioned if needed.
Ok - I would suggest leaving the current behavior as-is.
@AlistairSymonds - if you can update the docstrings as @pllim mentioned and avoid the use of the continuation character, I'll go ahead and merge this. Thanks!
Force-pushed from 07d1d9d to 85c9ffb (same squashed commit list as above).
I've rebased this and will merge if the CI passes - thanks!
@AlistairSymonds - thanks for your work on this and sorry it took so long to merge!
Cheers! Glad it happened in the end - I'll take a look at doing the same for the adaptive and exact methods now that a base has been put in.
This performs reprojection of any non-HEALPix type by breaking the output space up into blocks and iterating over each block, reprojecting into that specific block of area. This means a large output space can be used while only having to load one block into memory at a time.
Additionally, these output blocks can be processed in parallel to speed up any of the reprojection functions.
As mentioned, there is still some debug code in there which needs to be cleaned up (a commented-out print statement, imports used for profiling), but the functionality is there.
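To make the above concrete, here is a hedged usage sketch of the new keywords (the WCS values and shapes are made up for illustration; block_size is in output pixels, and parallel=True uses all CPUs, as discussed above):

import numpy as np
from astropy.wcs import WCS
from reproject import reproject_interp

# A small random image with a simple TAN WCS (values are illustrative only).
wcs_in = WCS(naxis=2)
wcs_in.wcs.ctype = ['RA---TAN', 'DEC--TAN']
wcs_in.wcs.crpix = [128.5, 128.5]
wcs_in.wcs.cdelt = [-0.001, 0.001]
data = np.random.random((256, 256))

# Output projection: the same WCS shifted by a few pixels.
wcs_out = wcs_in.deepcopy()
wcs_out.wcs.crpix = [120.5, 120.5]

if __name__ == '__main__':
    # The guard is needed because parallel=True uses multiprocessing.
    # Each 64x64 block of the output space is reprojected independently.
    result, footprint = reproject_interp((data, wcs_in), wcs_out,
                                         shape_out=(256, 256),
                                         block_size=[64, 64],
                                         parallel=True)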