Xee does not provide correct data for resampled image #145

Closed
deepgabani8 opened this issue Feb 26, 2024 · 6 comments
@deepgabani8

When I fetch data with ee.data.computePixels, the result differs between resample methods, as it should. When I fetch the same image using Xee as the backend, the result is identical for every resample method.

Using ee.data.computePixels, which gives different data for the resample methods 'bilinear' and 'bicubic':

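import io

import ee
import numpy as np

# image_id, crs, and crs_transform are assumed to be defined earlier (not shown here).
image = ee.Image(image_id)
# Set the resampling method, then force the output grid with reproject().
image = image.resample('bilinear').reproject(crs=crs, crsTransform=crs_transform)
image = image.clip(
    ee.Geometry.Rectangle(
        [[-180, -90], [-90, -45]],
        None,
        geodesic=False,
    )
)

# Request the pixels as an NPY payload and load them into a NumPy array.
data = np.load(io.BytesIO(ee.data.computePixels({
    'expression': image,
    'fileFormat': 'NPY',
})))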

Using Xee as the backend, which gives the same data for the resample methods 'bilinear' and 'bicubic':

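import ee
import xarray as xr
import xee

# image_id, crs, and crs_transform are assumed to be defined earlier (not shown here).
image = ee.Image(image_id)
image = image.resample('bilinear').reproject(crs=crs, crsTransform=crs_transform)
geom = ee.Geometry.Rectangle(
    [[-180, -90], [-90, -45]],
    None,
    geodesic=False,
)

# Open the single-image collection through the Xee backend.
ds = xr.open_dataset(
    ee.ImageCollection([image]),
    projection=image.projection(),
    geometry=geom,
    engine=xee.EarthEngineBackendEntrypoint,
)

# Take the first data variable at the first time step and transpose the two spatial axes.
var = list(ds.data_vars)[0]
data = ds[var].data[0, :, :].T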
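
To make the "same data" observation concrete, here is a minimal sketch of how two fetched arrays could be compared. The helper name `arrays_differ` is hypothetical and not part of Earth Engine or Xee; it assumes both arrays are already loaded as NumPy arrays, one per resample method.

```python
import numpy as np


def arrays_differ(a, b):
    """Return True if the two arrays differ in shape or in any element.

    Hypothetical helper for illustration; not part of ee or xee.
    """
    # np.array_equal returns False on a shape mismatch or any unequal element.
    return not np.array_equal(np.asarray(a), np.asarray(b))


# Stand-in arrays; in practice these would be the 'bilinear' and 'bicubic' results.
same = arrays_differ(np.zeros((2, 3)), np.zeros((2, 3)))
different = arrays_differ(np.zeros((2, 3)), np.ones((2, 3)))
```

With the computePixels path the check would be expected to report a difference between the two methods, while the Xee path described above would report none.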
@mahrsee1997
Collaborator

Thanks for raising the issue, @deepgabani8.

For your information, Xee works with asset_ids internally, so manipulations applied to images like these do not affect the data fetched from EE.

@mahrsee1997
Collaborator

mahrsee1997 commented Feb 29, 2024

Actually, we have a bug/issue in the codebase: https://github.com/google/Xee/blob/main/xee/ext.py#L789C1-L797C80.

We use the 'asset_ids' at processing time whenever we can obtain them. We implemented this to avoid an expensive toList() operation, but it seems to cause problems in cases like yours.
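
As a rough illustration of why loading by ID discards operations such as resample(), here is a toy model in plain Python. It is not Xee's code: `ToyImage`, `slice_by_id`, and `slice_by_object` are hypothetical names, and the model only mimics the idea that an image rebuilt from its asset ID knows nothing about operations chained onto the original object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyImage:
    """Hypothetical stand-in for an image: an asset ID plus chained operations."""
    asset_id: str
    ops: tuple = ()

    def resample(self, mode):
        # Return a new image that records the requested resampling.
        return ToyImage(self.asset_id, self.ops + (f"resample:{mode}",))


def slice_by_id(images):
    # Fast path: rebuild each image from its ID only, so chained ops are lost.
    return [ToyImage(img.asset_id) for img in images]


def slice_by_object(images):
    # Slow path: keep the computed images, so chained ops survive.
    return list(images)


bilinear = ToyImage("toy/asset").resample("bilinear")
bicubic = ToyImage("toy/asset").resample("bicubic")

# By ID, both requests collapse to the same unmodified image.
same_by_id = slice_by_id([bilinear]) == slice_by_id([bicubic])
# By object, the two resampling choices stay distinct.
same_by_object = slice_by_object([bilinear]) == slice_by_object([bicubic])
```

In this toy model the ID-based path returns identical results for both resample methods, which matches the symptom reported in this issue.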

@naschmitz added the P1 label Mar 5, 2024
copybara-service bot pushed a commit that referenced this issue Apr 11, 2024
Add a new `fast_time_slicing` parameter. If True, Xee performs an optimization that makes slicing an ImageCollection across time faster. This optimization loads EE images in a slice by ID, so any modifications to images in a computed ImageCollection will not be reflected.

For those familiar with the code before, the else flow in `_slice_collection` was only entered when images in the collection didn't have IDs. Clearing the image IDs triggered the else block.

Also adds several new warnings:

- if a user enables `fast_time_slicing` but there are no image IDs, and
- if a user is indexing into a very large ImageCollection.

Fixes #88 and #145.
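
A minimal sketch of how the new parameter might be used when image modifications need to be preserved. It assumes `fast_time_slicing` is passed as a keyword argument through `xr.open_dataset` to the Xee backend and that the backend is registered under the engine name 'ee'. The wrapper function `open_computed_collection` is hypothetical, and calling it requires an authenticated Earth Engine session with xee installed.

```python
def open_computed_collection(collection, projection, geometry,
                             fast_time_slicing=False):
    """Open a computed ee.ImageCollection through Xee (hypothetical wrapper).

    With fast_time_slicing=False, images are not reloaded by ID, so
    per-image modifications such as resample() should be reflected.
    """
    # Imported lazily so the sketch can be defined without Earth Engine set up.
    import xarray as xr

    return xr.open_dataset(
        collection,
        engine="ee",  # assumes the xee backend is installed and registered
        projection=projection,
        geometry=geometry,
        fast_time_slicing=fast_time_slicing,  # assumption: forwarded to the backend
    )
```

Passing `fast_time_slicing=True` would opt back in to the ID-based optimization, at the cost described in the commit message: modifications to images in a computed ImageCollection are not reflected.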

PiperOrigin-RevId: 623280839
copybara-service bot pushed a commit that referenced this issue Apr 11, 2024
Add a new `fast_time_slicing` parameter. If True, Xee performs an optimization that makes slicing an ImageCollection across time faster. This optimization loads EE images in a slice by ID, so any modifications to images in a computed ImageCollection will not be reflected.

For those familiar with the code before, the else flow in `_slice_collection` was only entered when images in the collection didn't have IDs. Clearing the image IDs triggered the else block.

Also adds several new warnings:

- if a user enables `fast_time_slicing` but there are no image IDs, and
- if a user is indexing into a very large ImageCollection.

Fixes #88 and #145.

PiperOrigin-RevId: 623815209
@alxmrs
Collaborator

alxmrs commented Jun 15, 2024

This discussion on the Pangeo discourse seems really relevant.

https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140

@schwehr @simon Any thoughts on how this might be related? Could the discrepancy eventually cause an issue?

@alxmrs
Collaborator

alxmrs commented Oct 12, 2024

I think this PR in Xarray may provide a long-term fix for this issue.

pydata/xarray#9543

@tylere self-assigned this Nov 14, 2024
@jdbcode
Member

jdbcode commented Nov 14, 2024

#156 may have fixed this – needs verification.

@tylere
Collaborator

tylere commented Dec 11, 2024

This Colab notebook shows that Xee now produces different results for different resampling methods ('bilinear' and 'bicubic'):
https://colab.research.google.com/gist/tylere/3d0548e28d8458b71142007157f8c539/template-xee-issue.ipynb
Given this, I'm closing this issue.

@tylere closed this as completed Dec 11, 2024