-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable opening datasets with gcps even if no valid crs is present #182
base: main
Are you sure you want to change the base?
Conversation
Some datasets like Sentinel-1 GRD files don't have a coordinate reference system (crs), but they may have ground control points (gcps) and can still be opened by rasterio.
Probably should write a regression test. @scottyhq, any good sample Sentinel-1 GRD files with a stable link that you know of? |
Need to reproject from a proper source coordinate reference system (src_crs) instead of arbitrarily.
GRDs are really big. Or rioxarray just constructs a synthetic dataset with GCPs for a test: https://github.com/corteva/rioxarray/blob/bfd616594bb82886133b5f4b9356ed16d39014fc/test/integration/test_integration_rioxarray.py#L1053 |
Not very well written, but it's a start I guess. Synthetic GeoTIFF generation code modified from https://github.com/corteva/rioxarray/blob/bfd616594bb82886133b5f4b9356ed16d39014fc/test/integration/test_integration_rioxarray.py#L1053.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a bunch for working on this @weiji14, and thanks for the test too!
Put the src_crs in the vrt_params dict to avoid an elif statement.
np.testing.assert_allclose( | ||
actual=array, desired=np.array([[0.0, 0.0], [0.0, 0.0]], dtype=np.float32) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a bunch for working on this @weiji14, and thanks for the test too!
Don't thank me yet, this is a really bad test 😅 It only checks that things can run (and that a GeoTIFF with gcps works without an error message), but I don't like this assert statement... Any pointers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, maybe we could read it back using rasterio with (what we think are) the right parameters, and check that the results match?
Only other thing I can think of would be to write some data into the file (probably like np.linspace(...).reshape(...)
). Then we could make a more meaningful assertion about the values we get out.
@weiji14 did you get any further with this? Is this actually working as is now? |
Technically, I think the implementation is sound. But the unit test could be much improved. Ok if you want to merge this first and improve the unit test at a later date. |
It's not a good idea in dask to mutate inputs. This object could be shared with other tasks. Also, I think we just need to check `ds.crs is None`—if `ds.crs` is a CRS, it will always have a `to_epsg` method. (Though that could fail, so I've added handling for that case too.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.testing.assert_allclose( | ||
actual=array, desired=np.array([[0.0, 0.0], [0.0, 0.0]], dtype=np.float32) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, maybe we could read it back using rasterio with (what we think are) the right parameters, and check that the results match?
Only other thing I can think of would be to write some data into the file (probably like np.linspace(...).reshape(...)
). Then we could make a more meaningful assertion about the values we get out.
The test is still failing for me though.
I'm not seeing any new commits in this PR. Could you try again?
If I recall correctly, there's actually 2 things needed to fix the purple/green/yellow output.
I'm having trouble with the second part (resampling), which would require someone a bit smarter on the SAR data structure to get right (e.g. @scottyhq). I wish I had a proper end-to-end validated example where someone has taken a Sentinel-1 GRD image (from Microsoft Planetary Computer), read it properly with |
this also gets the test passing
Oops maybe I forgot to push. I see them now in https://github.com/gjoseph92/stackstac/pull/182/files/1b3c47c6bd0d3ec9dd17b25dc8cb1c8496a131a7..d7b08203ab754295bf5644fa0a661df9fa94e8b9. I'd be a little surprised to find that it's just a resampling issue. As long as the data types are right and we're not truncating floats to ints or something, I don't quite see why a different resampling method would change meaningful data into something completely meaningless-looking. Furthermore, if you look at the example scene from #181 in PC explorer, you can see that what we're seeing from I guess where I'm leaning right now is that if we can't get this to seem to work properly and return meaningful data, I'd rather raise a |
if self.spec.vrt_params != { | ||
"crs": ds.crs.to_epsg(), | ||
if ds_epsg is None or self.spec.vrt_params != { | ||
"crs": ds_epsg, | ||
"transform": ds.transform, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MIght need to set src_transform
in addition to transform
as mentioned by @vincentsarago in #181 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So looking at the code it might be a bit more complex.
What I do in rio-tiler, is automatically create a WarpedVRT when we have gcps (which in stackstac case should replace the ds
) and then I create another WarpedVRT on top of it if I need warping
Some datasets like Sentinel-1 GRD files don't have a coordinate reference system (crs), but they may have ground control points (gcps) and can still be opened by rasterio.
Test this branch using
pip install git+https://github.com/weiji14/stackstac.git@open_vrt_with_gcps
Fixes #181.