This hasn't been looked into too deeply. It may or may not work as you suggest. However, if it were to work, I imagine all code has to be executed inside of the |
-
Thanks Alan. I think it's quite a common pattern with rasterio to do:

    env = rasterio.Env(
        GDAL_HTTP_MERGE_CONSECUTIVE_RANGES="YES",
        GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR",
        CPL_VSIL_CURL_USE_HEAD=False,
        CPL_VSIL_CURL_ALLOWED_EXTENSIONS="TIF",
    )
    with env:
        with rasterio.open(url) as src:
            .................

It would be nice if this pattern worked consistently in rioxarray. My third-from-last example is the most counterintuitive in the way it currently functions. I think there is also an argument that the second-from-last should work differently.

I had a dig around in the code to see what it would take to make it work. if Then it could wrap the I guess a similar approach would also be appropriate in RasterioWriter.
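To make the "wrap the read" idea concrete, here is a minimal sketch of the general pattern: remember the configuration options at open time and re-enter them whenever the deferred read actually runs. This is not how rioxarray is implemented. `wrap_with_env` and its `env_factory` parameter are hypothetical names invented for this illustration, and the options dict is assumed to have been captured by the caller (for example while the `rasterio.Env` block was still active).

```python
import functools


def wrap_with_env(func, options, env_factory=None):
    """Return a version of ``func`` that runs inside ``env_factory(**options)``.

    ``env_factory`` defaults to ``rasterio.Env``. It is a parameter only so the
    sketch can be exercised without GDAL installed (an assumption of this
    example, not something rioxarray provides).
    """
    if env_factory is None:
        import rasterio  # imported lazily so the helper has no hard dependency

        env_factory = rasterio.Env

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Re-enter the captured configuration at call time, which may be much
        # later than open time and in a different thread or process.
        with env_factory(**options):
            return func(*args, **kwargs)

    return wrapper
```

With something like this, the function that performs the lazy read could be wrapped once when the dataset is opened, so the same options are active again when a worker computes the chunk. Deciding which options are safe to carry over is left open here.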
-
In I think it is the "least surprising" way of handling it, but it does require quite a bit of code, especially for handling AWS credentials https://github.com/opendatacube/odc-stac/blob/develop/odc/stac/_rio.py and there are some gotchas: like you probably don't want to copy

For things like AWS credentials, sometimes you want to get credentials in the main process and then copy them over to the cluster (static credentials/assume role); other times you can ONLY obtain credentials when running on the cluster (machine credentials). And I'm sure there are plenty of tuning settings you might want to configure at the "fleet" level rather than worry about them in code. The configuration space can get pretty complex quickly.
-
Thanks Kirill and Alan. I sunk a bit of time into it and got something sort of working, but I've given up for now. I will stick with setting environment variables on the 'fleet' of workers, as it seems to be the only reliable way to get a GDAL configuration to work with rioxarray.
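For anyone wanting to try the environment-variable route, a rough sketch follows. It relies on GDAL falling back to process environment variables for configuration options that have not been set explicitly. `apply_gdal_config` and `to_gdal_string` are hypothetical helper names, and the option values are simply the ones mentioned earlier in this thread.

```python
import os


def to_gdal_string(value):
    """Convert a Python value to the string form GDAL expects in the environment."""
    if isinstance(value, bool):
        return "YES" if value else "NO"
    return str(value)


def apply_gdal_config(options):
    """Export GDAL configuration options as environment variables of this process."""
    applied = {key: to_gdal_string(value) for key, value in options.items()}
    os.environ.update(applied)
    return applied


GDAL_OPTIONS = {
    "GDAL_HTTP_MERGE_CONSECUTIVE_RANGES": "YES",
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "CPL_VSIL_CURL_USE_HEAD": False,
    "CPL_VSIL_CURL_ALLOWED_EXTENSIONS": "TIF",
}

# Applies to the current process only.
apply_gdal_config(GDAL_OPTIONS)

# On a dask.distributed cluster the same function could be pushed to every
# running worker, assuming an existing `client`:
#     client.run(apply_gdal_config, GDAL_OPTIONS)
```

Two caveats with this approach: workers that join the cluster later will not have the variables unless they are also set at startup, and the variables should be in place before GDAL first opens a file in that process.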
-
I'm trying to understand how (or if) rioxarray works with `rasterio.Env`, i.e. GDAL configuration.

For the demo I'm using a VRT that points to a remote TIF file. I then set `CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".vrt"` so that the VRT can be read but the TIF can't. The idea is that open_rasterio can work on the VRT, but when I try to compute, it opens the TIF. If I get an error, then it is clear that the GDAL configuration is working. It seemed like a definitive way to test what environment is seen on the workers, separately from the local one. Hopefully that makes sense.

I was expecting (hoping) that using `rioxarray.open` inside a `rasterio.Env` context would mean that context would magically be used wherever rasterio ended up being called by rioxarray to open that dataset. I'm not sure if I'm doing something wrong or just expecting too much. Any pointers?

Below is a set of examples. They each need to be run separately to avoid GDAL caching.
Thanks