Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor adjustments and bug fixing of resampling_in_space #1100

Merged

Conversation

konstntokas
Copy link
Contributor

@konstntokas konstntokas commented Jan 3, 2025

This PR is related to the previous PR #1098. Further I enhanced the testing of resample_in_space to cover the case of a reprojection, where a subsetting in rectify_dataset is performed.

Checklist:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/source/*
  • Changes documented in CHANGES.md
  • GitHub CI passes
  • AppVeyor CI passes
  • Test coverage remains or increases (target 100%)

@konstntokas konstntokas changed the title Konstntokas xxx minor adjustments resampling in space Minor adjustments and bug fixing of resampling_in_space Jan 3, 2025
@konstntokas konstntokas force-pushed the konstntokas-xxx-minor_adjustments_resampling_in_space branch from 541df45 to a97ab84 Compare January 13, 2025 08:44
@konstntokas konstntokas requested a review from TonioF January 13, 2025 08:56
Copy link

codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 93.75000% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.59%. Comparing base (25fdf9f) to head (2855d91).
Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
xcube/core/resampling/spatial.py 93.75% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1100      +/-   ##
==========================================
+ Coverage   89.58%   89.59%   +0.01%     
==========================================
  Files         277      277              
  Lines       21836    21852      +16     
==========================================
+ Hits        19561    19579      +18     
+ Misses       2275     2273       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 141 to 152
# Handles the case where 2D coordinates are already present in the dataset,
# which typically occurs after a reprojection performed by the
# `resample_in_space` function.
if "transformed_x" and "transformed_y" in source_ds_subset:
source_gm = GridMapping.from_coords(
source_ds_subset.transformed_x,
source_ds_subset.transformed_y,
crs=source_gm.crs,
tile_size=source_gm.tile_size,
)
else:
source_gm = GridMapping.from_dataset(source_ds_subset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but I have the feeling this is a workaround. As the coordinates are included in source_ds, it should be possible to call

GridMapping.from_dataset(source_ds_subset, prefer_crs=source_gm.crs)

and get the correct result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the setup up the source_ds in resample_in_space. I took Sentinel-3 data as inspiration (see notebook https://github.com/xcube-dev/xcube/blob/main/examples/notebooks/resampling/rectify_dataset.ipynb).

@konstntokas konstntokas requested review from TonioF and forman January 13, 2025 15:47
Copy link
Contributor

@TonioF TonioF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new solution would cause problems with data variables that have a gridmapping attribute that points to a crs variable which might not exist anymore after these changes. I still prefer the solution of a variable "transformed_spatial_ref". However, if you prefer your way, please make sure the attributes are not wrong.

@konstntokas
Copy link
Contributor Author

konstntokas commented Jan 14, 2025

The new solution would cause problems with data variables that have a gridmapping attribute that points to a crs variable which might not exist anymore after these changes. I still prefer the solution of a variable "transformed_spatial_ref". However, if you prefer your way, please make sure the attributes are not wrong.

Thank you that you are looking into it so carefully!

I need to remove the original coordinates, because if the source ds has x and y, then either x and y or transformed_x and transformed_y will be added to the projected_coords in the function xcube.core.gridmapping.cfconv.get_dataset_grid_mapping_proxis, depending if x or transformed_x (same for y) comes first in the list potential_coord_vars. I therefore prefer to setup a clean dataset in resample_in_space, where only one set of coordinates can be found.

I thus would like to keep my suggestion. Further, I delete all grid_mappings from the data variables in the source_ds. I also added a grid_mapping attribute to the unittest.

@konstntokas konstntokas requested a review from TonioF January 14, 2025 16:29
@TonioF
Copy link
Contributor

TonioF commented Jan 14, 2025

I need to remove the original coordinates, because if the source ds has x and y, then either x and y or transformed_x and transformed_y will be added to the projected_coords in the function xcube.core.gridmapping.cfconv.get_dataset_grid_mapping_proxis, depending if x or transformed_x (same for y) comes first in the list potential_coord_vars. I therefore prefer to setup a clean dataset in resample_in_space, where only one set of coordinates can be found.

Do you really have to? For testing, I kept existing variables crs and spatial_ref and added a new one transformed_spatial_ref. Now, even though the source_gm retrieved from GridMapping.from_dataset() was the UTM one, the tests eventually succeeded.

@konstntokas
Copy link
Contributor Author

I need to remove the original coordinates, because if the source ds has x and y, then either x and y or transformed_x and transformed_y will be added to the projected_coords in the function xcube.core.gridmapping.cfconv.get_dataset_grid_mapping_proxis, depending if x or transformed_x (same for y) comes first in the list potential_coord_vars. I therefore prefer to setup a clean dataset in resample_in_space, where only one set of coordinates can be found.

Do you really have to? For testing, I kept existing variables crs and spatial_ref and added a new one transformed_spatial_ref. Now, even though the source_gm retrieved from GridMapping.from_dataset() was the UTM one, the tests eventually succeeded.

I changed it to transformed_spatial_ref without deleting the previous crs or spatail_ref variable/coordinate. The the CI fails in the newly added test, if actual subsetting is performed, see second test in test_spatial.test_reproject_utm().

@TonioF
Copy link
Contributor

TonioF commented Jan 15, 2025

You're right, when I executed this, I had left in the line where x and y were removed from the dataset. I now would accept your solution from before the last two commits. It's still not perfect, but I wouldn't be able to come up with a better solution.

@konstntokas konstntokas force-pushed the konstntokas-xxx-minor_adjustments_resampling_in_space branch from 0ab6d78 to 61fbcda Compare January 15, 2025 16:08
@konstntokas
Copy link
Contributor Author

@forman can you have a look onto our discussion and my suggested changes. I think a second opinion is useful here. Thank you!

CHANGES.md Outdated Show resolved Hide resolved
Comment on lines 212 to 215
if "crs" in source_ds:
source_ds = source_ds.drop_vars("crs")
if "spatial_ref" in source_ds:
source_ds = source_ds.drop_vars("spatial_ref")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be right in cases. You should just drop the grid mapping variable, no matter how it is named. crs and spatial_ref are common names, but their names are not a sufficient condition to identify the grid mapping variable. There can even be more than one such variables if multiple CRSes are used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now go through the grid_mapping in the data variable attributes, and not only delete the attribute with key key grid_mapping, but also drop the grid mapping variable. Afterwards I still check for the grid mapping variable named with crs and spatial_ref. For Sentinel 2 jp2 files e.g. the dataset has the coordinate spatial_ref, but the data variable has no attribute grid_mapping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats good.

@@ -72,7 +72,6 @@ def get_dataset_grid_mapping_proxies(
grid_mapping_proxies: dict[Union[Hashable, None], GridMappingProxy] = dict()

# Find any grid mapping variables by CF 'grid_mapping' attribute
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these spacers so the comment is not glued to the next line, but still connected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference here. I changed it back. :)

konstntokas and others added 3 commits January 16, 2025 15:52
Co-authored-by: Norman Fomferra <norman.fomferra@brockmann-consult.de>
Copy link
Member

@forman forman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TonioF @konstntokas

I approve unless I overlooked the place where you added transformed_spatial_ref. I'd be against it. Instead replace the existing source grid mapping from. If you need a history, place a record into the global history attribute.

@konstntokas konstntokas merged commit f1654b7 into main Jan 17, 2025
5 checks passed
@konstntokas konstntokas deleted the konstntokas-xxx-minor_adjustments_resampling_in_space branch January 17, 2025 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants