Identify corrupt data source in odc.stac.load #97

SandroGroth · 2022-11-24T10:35:44Z

Hi all, first of all thanks for this great tool!

I'm currently trying to aggregate raster values of an xarray.Dataset created with odc-stac from several hundred STAC items, similar to this reproducible example:

import pystac_client
import planetary_computer
from odc.stac import load

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1", 
    modifier=planetary_computer.sign_inplace
) 

time_range = "2020-01-01/2020-12-31"
bbox = [-122.2751, 47.5469, -121.9613, 47.7458]

search = catalog.search(collections=["landsat-c2-l2"], bbox=bbox, datetime=time_range)
items = search.get_all_items()

data = load(items, bands=["green"], chunks={"x": 256, "y": 256})
res = data.sum(dim="time").compute()

If, however, one of the many items is corrupt, it is very hard to identify the faulty data source just from the RasterioIOError that gets raised:

# let's break the href of a Landsat item
items[0].assets['green'].href = "https://faulty_url"

Executing the code above raises:

Exception has occurred: RasterioIOError
HTTP response code: 404

During handling of the above exception, another exception occurred:

  File "/path/to/file", line 16, in <module>
    res = data.sum(dim="time").compute()

Is there an option to extend the logging in odc.stac.load in order to identify which Item/Asset rasterio wasn't able to open?


Kirill888 · 2022-11-24T11:02:53Z

If this is loading data from Planetary Computer, you need to sign URLs for it to work: use patch_url=planetary_computer.sign, see https://odc-stac.readthedocs.io/en/latest/notebooks/stac-load-S2-ms.html#Lazy-load-all-the-bands
To understand what is failing during load, it is probably best to enable logging for the rasterio library, since that is the library we use for loading data.

SandroGroth · 2022-11-24T13:41:27Z

Thanks for the quick response!

I'm currently working with an internally hosted and maintained STAC catalog, where hrefs of assets are sometimes broken. That's why it would be handy if, when a RasterioIOError is raised, the error message included the href of the file that could not be opened.

As suggested, I tried to get this information from rasterio by catching the RasterioIOError when it comes up and printing the filename property of the error:

import rasterio

try:
    data = load(items, bands=["green"], chunks={"x": 256, "y": 256})
    res = data.sum(dim="time").compute()
except rasterio.errors.RasterioIOError as e:
    # print the attribute itself, not the literal text "e.filename"
    print(e.filename)

... which unfortunately is None. I looked into the GDAL configuration options, but did not find any logging option that would log the href without producing a ton of messages when opening a bigger list of items.

I guess it would be cool if odc.stac.load caught the error as well and additionally logged the href it attempted to open. In theory, something along the lines of:

except rasterio.errors.RasterioIOError as e:
    logger.error(f"Unable to open {asset.href}:\n{e}")
    raise

Kirill888 · 2022-11-24T14:25:15Z

It is None, but the exception message should contain the file being loaded:

import rasterio
import logging

logging.basicConfig(level="INFO")
logging.getLogger("rasterio").setLevel(logging.DEBUG)

try:
    x = rasterio.open("bad_tif.tif")
except rasterio.errors.RasterioIOError as e:
    print(f"{e}, {e.filename}")

odc-stac does not capture any rasterio errors, they all bubble up, but we should probably have a "continue loading even when some files failed to load" mode with proper error reporting. I also recommend testing things like that without using Dask.

Also, 256px chunks are way too tiny. I recommend starting with 2048 and only going down from that in special situations.
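
To put the chunk size advice above in numbers, here is a rough sketch that counts chunks per band and time slice. The 8000 x 8000 pixel extent is an assumed illustration, not a figure from this thread, and chunk_count is a hypothetical helper.

```python
import math


def chunk_count(height: int, width: int, chunk: int) -> int:
    """Number of square chunks needed to tile one height x width slice."""
    return math.ceil(height / chunk) * math.ceil(width / chunk)


# Assumed 8000 x 8000 pixel slice, for illustration only
small = chunk_count(8000, 8000, 256)   # 32 * 32 chunks
large = chunk_count(8000, 8000, 2048)  # 4 * 4 chunks
print(small, large)  # 1024 16
```

Each chunk becomes at least one task per time slice once Dask is involved, so the smaller chunk size multiplies scheduling overhead accordingly.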

Kirill888 · 2022-11-24T14:34:52Z

STAC item and asset information is all gone by the time loading happens inside odc-stac. Pixel reading might happen on a remote instance (Dask), and STAC items can contain a lot of extra metadata, so we distill it down to the essential info only. As a result, there is no way to link a failure back to a specific STAC item and asset at the moment.
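
Since the item and asset information is not available at load time, one possible user-side workaround is to check the hrefs before calling load. The following is only a sketch: find_broken_assets is a hypothetical helper, not part of odc-stac, and it takes the opener as a parameter so that rasterio.open (or anything else) can be plugged in.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def find_broken_assets(
    assets: Iterable[Tuple[str, str, str]],
    opener: Callable[[str], object],
) -> List[Dict[str, str]]:
    """Try to open every (item_id, asset_key, href) and report the failures."""
    broken = []
    for item_id, asset_key, href in assets:
        try:
            handle = opener(href)
        except Exception as e:  # with rasterio.open this would be RasterioIOError
            broken.append(
                {"item": item_id, "asset": asset_key, "href": href, "error": str(e)}
            )
            continue
        # Release the handle if the opener returned something closable
        close = getattr(handle, "close", None)
        if callable(close):
            close()
    return broken


# Possible usage with pystac items and rasterio (assumed, not run here):
#   import rasterio
#   triples = [(it.id, "green", it.assets["green"].href) for it in items]
#   for failure in find_broken_assets(triples, rasterio.open):
#       print(failure)
```

Note that this opens every asset once up front, which costs at least one request per file, so it is probably best reserved for debugging a load that has already failed.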

SandroGroth · 2022-11-24T16:37:50Z

Got it! I will activate more detailed rasterio logging if odc.stac.load encounters an exception.

Thanks again for the detailed explanation and keep up the great work!

idantene · 2023-07-27T20:34:40Z

@Kirill888 Bringing this up again (also in the context of #54).
stackstac offers a solution to this with errors_as_nodata.

Any chance of implementing something similar in odc-stac?

Kirill888 · 2023-07-28T05:01:11Z

@idantene a similar option is available in the current release of odc-stac:

fail_on_error=False,

Failed locations are logged with the Python warnings system, see #100 and #99.
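
If the failures are indeed emitted through Python's warnings module, as described above, they can be collected with the standard library. This is a minimal sketch: call_collecting_warnings is a hypothetical helper, and the exact warning text odc-stac produces is not shown in this thread.

```python
import warnings
from typing import Any, Callable, List, Tuple


def call_collecting_warnings(fn: Callable[[], Any]) -> Tuple[Any, List[str]]:
    """Run fn() and return its result plus the text of any warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # keep repeated warnings from being dropped
        result = fn()
    return result, [str(w.message) for w in caught]


# Possible usage (assumed, not run here), loading eagerly without chunks
# so that reading happens in the current process:
#   data, messages = call_collecting_warnings(
#       lambda: load(items, bands=["green"], fail_on_error=False)
#   )
#   for msg in messages:
#       print(msg)
```

With Dask, reads happen at compute time and possibly on remote workers, so warnings raised there would likely not be captured this way.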

idantene · 2023-07-28T06:35:46Z

Thanks! Don't know how I missed that. One thing that's missing is an explanation of what it means to not fail on error. Will the band be there? Will it have some nan values? Or something else?


Kirill888 · 2023-07-28T08:33:03Z

odc-stac decides on the structure of the output array at the very start, so the storage for all time slices and all bands is "allocated" up front (not really when using Dask, but same idea). A missing or broken file will result in nodata or nan pixels being filled in there. More precisely, the image begins with empty pixels only, then each item contributes its valid pixels (an image might be readable but have no valid data at all, only nodata/nan pixels). In case of overlapping data, "first valid pixel sticks".

So if your problem is due to a broken network connection, for example, you will get back an array full of nodata/nan pixels.

Kirill888 · 2023-07-28T08:45:22Z

There is no way to distinguish between pixels that were missing from the original data and pixels that failed to read: both types end up with the nodata marker. There is no mask of "observed pixels" being computed either, so we can't distinguish between the following types of missing data:

- No source image overlaps this pixel
- Some source images overlap this pixel but have no valid pixel at that location
- Some source images overlap this pixel, but we don't know if any of them had any valid data here because we failed to read them
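
The fill rule and the three indistinguishable cases above can be illustrated with a toy version of "first valid pixel sticks". This is a pure-Python sketch on 1-D lists, not odc-stac's implementation. The nodata value of 0 and the use of None to stand for a failed read are assumptions made for the example.

```python
from typing import List, Optional

NODATA = 0  # assumed nodata marker for this sketch


def fuse_first_valid(sources: List[Optional[List[int]]], size: int) -> List[int]:
    """Start from an empty image and keep the first valid pixel at each position.

    A source of None stands for a file that failed to read.
    """
    out = [NODATA] * size
    for src in sources:
        if src is None:  # failed read contributes nothing
            continue
        for i, value in enumerate(src):
            if out[i] == NODATA and value != NODATA:
                out[i] = value
    return out


# Overlapping data: the first valid pixel wins at position 0
print(fuse_first_valid([[5, 0, 0], [7, 8, 0]], 3))  # [5, 8, 0]

# Three different causes, one identical all-nodata result
print(fuse_first_valid([], 3))           # no source at all
print(fuse_first_valid([[0, 0, 0]], 3))  # source readable, but no valid pixels
print(fuse_first_valid([None], 3))       # source failed to read
```

The last three calls all print [0, 0, 0], which is why the output alone cannot tell a failed read from genuinely missing data.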

idantene · 2023-07-28T09:24:46Z

Thanks @Kirill888, all of that makes a lot of sense! I wish it was more explicitly mentioned in the documentation, and I still hope to put in a PR for documentation in the future :)

SandroGroth closed this as completed Nov 24, 2022

Kirill888 mentioned this issue Dec 8, 2022

Feature: report IO failures in a pogrammatic fashion #101

Open
