Example of using vsicurl to GUNW products #27

Closed
dbekaert opened this issue Jun 1, 2019 · 20 comments
Labels
enhancement New feature or request
Milestone

Comments

@dbekaert
Collaborator

dbekaert commented Jun 1, 2019

Adding a small example of using /vsicurl/ to access netCDF products hosted by ASF on AWS directly. Notes are provided below. Note that the ASF instructions are outdated.

@piyushrpt does this work for you?

ASF instructions

From ASF website: Users can obtain a temporary AWS access key from ASF using their Earthdata Login credentials. The temporary key is then used to obtain data while the access key is valid. Temporary access keys are obtained by invoking the following URL: https://grfn.asf.alaska.edu/door/credentials

The "Credentials" block of the response includes an Access Key, Secret Key, and Session Token for a temporary S3 session, along with the session's expiration date. The "PolicyDocument" block describes the AWS permissions granted for the session, including the name of the S3 bucket where Sentinel-1 Interferogram (BETA) products are stored.
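A minimal sketch of how the response described above might be checked in Python before use. It takes an already-decoded JSON payload and tests whether the temporary session has expired. The `Expiration` key name, its ISO 8601 format, and the helper name `session_is_valid` are assumptions for illustration; only the three credential keys appear in the example script in this thread.

```python
from datetime import datetime, timezone


def session_is_valid(payload, now=None):
    """Return True if the temporary S3 session in `payload` has not expired.

    `payload` is the decoded JSON response. The 'Expiration' key and its
    ISO 8601 format are assumptions, not confirmed by the ASF documentation.
    """
    credentials = payload["Credentials"]
    # Replace a trailing 'Z' (UTC) that older datetime.fromisoformat rejects.
    expires = datetime.fromisoformat(credentials["Expiration"].replace("Z", "+00:00"))
    if expires.tzinfo is None:
        # Assume UTC when the timestamp carries no offset.
        expires = expires.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now < expires


# Made-up payload for illustration only; real values come from the credentials URL.
example = {
    "Credentials": {
        "AccessKeyId": "EXAMPLEKEY",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-token",
        "Expiration": "2019-06-01T13:00:00Z",
    }
}
print(session_is_valid(example, now=datetime(2019, 6, 1, 12, 0, tzinfo=timezone.utc)))
```

Passing `now` explicitly keeps the check reproducible; in normal use it can be omitted so the current UTC time is used.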

Example of a legacy beta product and vsi:

gdalinfo /vsizip/vsis3/grfn-content-prod/S1-IFG_RM_M1S1_TN158_20180422T130337-20180410T130310_s1-resorb-70ec-v1.2.1-standard.zip/S1-IFG_RM_M1S1_TN158_20180422T130337-20180410T130310_s1-resorb-70ec-v1.2.1-standard/merged/filt_topophase.unw.geo.vrt
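For reference, the chained path in the command above can be assembled from its parts. This is only a sketch of the string layout (bucket, zip object, path inside the zip); the helper name `vsizip_s3_path` is hypothetical, and the code does not check that the legacy product still exists.

```python
def vsizip_s3_path(bucket, zip_key, inner_path):
    """Build a GDAL path to a file inside a zip archive stored on S3."""
    # /vsizip/ is chained in front of /vsis3/ so GDAL reads the archive remotely.
    return "/vsizip/vsis3/{}/{}/{}".format(bucket, zip_key.strip("/"), inner_path.strip("/"))


product = "S1-IFG_RM_M1S1_TN158_20180422T130337-20180410T130310_s1-resorb-70ec-v1.2.1-standard"
path = vsizip_s3_path(
    "grfn-content-prod",
    product + ".zip",
    product + "/merged/filt_topophase.unw.geo.vrt",
)
print(path)
```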

GDAL installation notes

A recent enough version (4.5+) of libnetcdf must be used so that /vsicurl/ can be used by the netCDF driver, i.e. you should see 'NetCDF has netcdf_mem.h: yes' in the GDAL configure output.

Current beta product links

URL of an existing ASF GUNW product:

https://grfn.asf.alaska.edu/door/download/S1-GUNW-A-R-077-tops-20190222_20190210-231605-42666N_40796N-PP-d75b-v2_0_1.nc
@dbekaert dbekaert added the enhancement New feature or request label Jun 1, 2019
@piyushrpt
Collaborator

They seem to have detailed instructions on their page:

  1. Obtain credentials using: https://media.asf.alaska.edu/uploads/InSAR/temporary_security_credentials.py
    • Note the 3 env variables that need to be set at the end of the script.
    • The script only reports the commands; these still need to be executed to set the variables.
  2. Once these variables are set, you should be able to use vsis3 following the instructions here:
    https://www.asf.alaska.edu/sar_datasets/sentinel-1-interferograms-beta/command-line-tools/gdal/
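The two steps above could be sketched as follows: turn a credentials dictionary into the `export` lines that still have to be run in the shell. The variable names are the standard AWS ones and are assumed to match what the ASF script prints; the helper name `export_commands` and the values are placeholders.

```python
import shlex


def export_commands(credentials):
    """Return shell `export` lines for the three AWS environment variables."""
    mapping = {
        "AWS_ACCESS_KEY_ID": credentials["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": credentials["SecretAccessKey"],
        "AWS_SESSION_TOKEN": credentials["SessionToken"],
    }
    # shlex.quote protects values containing characters the shell would interpret.
    return ["export {}={}".format(name, shlex.quote(value)) for name, value in mapping.items()]


# Placeholder values for illustration only.
fake = {"AccessKeyId": "EXAMPLEKEY", "SecretAccessKey": "abc/def+ghi", "SessionToken": "tok en"}
for line in export_commands(fake):
    print(line)
```

Printing the commands does not change the calling shell's environment; they still need to be executed there, as noted in step 1.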

I plan to test this out this week.

@piyushrpt
Collaborator

piyushrpt commented Jun 3, 2019

Once I get the AWS credentials with the above script, both of these commands work for me:

aws s3 ls s3://grfn-content-prod/S1-GUNW-A-R-124-tops-20190226_20190220-043059-20993N_18920N-PP-3bd6-v2_0_1.nc
aws s3 cp s3://grfn-content-prod/S1-GUNW-A-R-124-tops-20190226_20190220-043059-20993N_18920N-PP-3bd6-v2_0_1.nc .

The AWS mechanism appears to be working pretty well.

For awscli reference - see https://www.asf.alaska.edu/sar_datasets/sentinel-1-interferograms-beta/command-line-tools/aws-cli/

@dbekaert
Collaborator Author

dbekaert commented Jun 3, 2019

@piyushrpt
Collaborator

You will also have to set an additional environment variable if you are already an AWS user and have ~/.aws set up with configurations:

export AWS_DEFAULT_REGION=us-east-1

@piyushrpt
Collaborator

piyushrpt commented Aug 6, 2019

Here is a full example from material we are putting together for the UNAVCO short course. It looks like there could be an issue with conda's gdal installation, which recognizes nc files with groups as HDF5 when using vsis3. It might be good to communicate this to ASF. This has to do with detection of netcdf_mem.h when building GDAL.

#This is from https://media.asf.alaska.edu/uploads/InSAR/temporary_security_credentials.py
from json import loads
from requests import get

credential_url = 'https://grfn.asf.alaska.edu/door/credentials'
response = get(credential_url)
response.raise_for_status()

credentials = loads(response.text)['Credentials']

## This is how you use ASF s3 credentials with GDAL
from osgeo import gdal
gdal.SetConfigOption('AWS_REGION', 'us-east-1')
gdal.SetConfigOption('AWS_SECRET_ACCESS_KEY', credentials['SecretAccessKey'])
gdal.SetConfigOption('AWS_ACCESS_KEY_ID', credentials['AccessKeyId'])
gdal.SetConfigOption('AWS_SESSION_TOKEN', credentials['SessionToken'])

results = gdal.Info('/vsis3/grfn-content-prod/S1-GUNW-D-R-160-tops-20190710_20190628-162436-20935N_18926N-PP-6a53-v2_0_2.nc')
print(results)
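The `s3://` URLs used with the aws CLI earlier in this thread and the `/vsis3/` path passed to GDAL differ only in their prefix. A small sketch of the conversion, with a hypothetical helper name `s3_url_to_vsis3`:

```python
def s3_url_to_vsis3(url):
    """Convert 's3://bucket/key' into the '/vsis3/bucket/key' form GDAL expects."""
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError("not an s3:// URL: {!r}".format(url))
    return "/vsis3/" + url[len(prefix):]


print(s3_url_to_vsis3(
    "s3://grfn-content-prod/S1-GUNW-A-R-124-tops-20190226_20190220-043059-20993N_18920N-PP-3bd6-v2_0_1.nc"
))
```

The resulting string can be passed to `gdal.Info` once the credentials have been set as config options, as in the example above.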

@dbekaert
Collaborator Author

dbekaert commented Aug 6, 2019

Thanks @piyushrpt. I'm tagging @asjohnston-asf and @fjmeyer and will bring this up at the next GRFN meeting for some advice as well.

> Here is a full example from material we are putting together for UNAVCO short course. Looks like there could be an issue with conda's gdal installation that recognizes nc with groups as HDF5 when using vsis3 - might be good to communicate this to ASF. This has to do with detection of netcdf_mem.h when building GDAL.

@piyushrpt
Collaborator

Turns out this is related to them building conda packages on old kernels:

userfaultfd support: no
https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=55722

This is unlikely to change anytime soon as they plan to support older kernels.

Piyush

@asjohnston-asf

Thanks for the heads up. A few random comments on this thread:

Much of the information on our web site under https://www.asf.alaska.edu/sar_datasets/sentinel-1-interferograms-beta/ is out of date. For example, we retired all of our v1.x products when we upgraded to v2.x products, but all of our examples still refer to files that aren't in our archive anymore. The "long term storage" sections are also obsolete; we no longer move any files to long term storage.

If you prefer, you can also use /vsicurl directly against a file's grfn.asf.alaska.edu URL if you set the right environment variables. That could save you from having to do the s3 credential lookup. I can dig up the details if you're interested. The netcdf issue still applies, though.

After I wrote the example script I discovered requests has built-in json parsing, so you can save yourself an import: credentials = response.json()['Credentials']

@asjohnston-asf

Last month ASF also rolled out our new "ASF Data Search" app at https://search.asf.alaska.edu/ . The old https://vertex.daac.asf.alaska.edu/ URL now redirects there. We've made it much easier to search for GUNW products: just select "S1 InSAR (BETA)" from the Dataset pulldown.

@piyushrpt
Collaborator

@asjohnston-asf Thanks for the info on requests having built-in JSON parsing. Could you point us to an example of using /vsicurl with the right environment variables? Is it just adding AWSKey etc. to the URL itself?

@asjohnston-asf

echo "machine urs.earthdata.nasa.gov login myUsername password myPassword" >> ~/.netrc
chmod 600 ~/.netrc
export GDAL_HTTP_COOKIEFILE=/tmp/cookies.txt
export GDAL_HTTP_COOKIEJAR=/tmp/cookies.txt
export CPL_VSIL_CURL_CHUNK_SIZE=10485760
gdalinfo /vsicurl/https://grfn.asf.alaska.edu/door/download/S1-GUNW-A-R-166-tops-20190717_20190705-014209-37440N_35563N-PP-ef82-v2_0_2.nc

Requires gdal >= v2.4.0. Chunk size isn't strictly necessary, but I've found it improves performance when accessing larger files.
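The same settings could be prepared from Python. This is a sketch that mirrors the shell exports above; the helper names `vsicurl_settings` and `vsicurl_path` are hypothetical, and it relies on GDAL reading its configuration options from environment variables.

```python
import os


def vsicurl_settings(cookie_file, chunk_size_mib=10):
    """Return the GDAL settings from the shell example as a dict of strings."""
    return {
        "GDAL_HTTP_COOKIEFILE": cookie_file,
        "GDAL_HTTP_COOKIEJAR": cookie_file,
        # 10 MiB = 10485760 bytes, the value used in the shell example.
        "CPL_VSIL_CURL_CHUNK_SIZE": str(chunk_size_mib * 1024 * 1024),
    }


def vsicurl_path(url):
    """Prefix an HTTP(S) URL with /vsicurl/ so GDAL streams it."""
    return "/vsicurl/" + url


settings = vsicurl_settings("/tmp/cookies.txt")
os.environ.update(settings)  # set before GDAL opens the remote file
print(settings["CPL_VSIL_CURL_CHUNK_SIZE"])
```

The ~/.netrc entry for Earthdata Login is still required; this only covers the cookie and chunk-size settings.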

@asjohnston-asf

Same approach should work for any data hosted by ASF or EOSDIS, just need the netrc and cookie setup to deal with Earthdata Login.
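A rough Python equivalent of the netrc part of that setup, for anyone scripting it. The helper name `write_netrc_entry` is hypothetical, and the demonstration writes to a temporary file instead of the real ~/.netrc; the username and password are the placeholders from the shell example.

```python
import netrc
import os
import tempfile


def write_netrc_entry(path, machine, login, password):
    """Append a one-line entry to the netrc file at `path` and restrict it to the owner."""
    with open(path, "a") as handle:
        handle.write("machine {} login {} password {}\n".format(machine, login, password))
    os.chmod(path, 0o600)  # same effect as `chmod 600`


# Demonstrate on a temporary file rather than the real ~/.netrc.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    write_netrc_entry(demo_path, "urs.earthdata.nasa.gov", "myUsername", "myPassword")
    # Parse the file back with the standard library to confirm it is well formed.
    entry = netrc.netrc(demo_path).authenticators("urs.earthdata.nasa.gov")
    print(entry[0])
```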

@piyushrpt
Collaborator

Thanks @asjohnston-asf . This also works ... the userfaultfd issue persists. I have opened a ticket here: conda-forge/gdal-feedstock#323

from osgeo import gdal
gdal.SetConfigOption('GDAL_HTTP_COOKIEFILE','cookies.txt')
gdal.SetConfigOption('GDAL_HTTP_COOKIEJAR', 'cookies.txt')
results = gdal.Info("/vsicurl/https://grfn.asf.alaska.edu/door/download/S1-GUNW-D-R-160-tops-20190710_20190628-162436-20935N_18926N-PP-6a53-v2_0_2.nc")
print(results)

@dbekaert
Collaborator Author

Thanks for fixing this @piyushrpt.
Looks like this is now close: conda-forge/gdal-feedstock#323 (comment)

Will work on capturing the changes needed to leverage this in aria-tools here: #138

@dbekaert
Collaborator Author

dbekaert commented Apr 16, 2020

@piyushrpt I'm also getting the HDF5 reader. Do I need to increase the requirements in the file to make sure it is captured as netCDF? I run into the same issue on both Mac and Linux.

@dbekaert dbekaert added this to the S3 support milestone Apr 16, 2020
@dbekaert
Collaborator Author

@piyushrpt @asjohnston-asf I am not able to get the GUNW products recognized as netCDF products on Mac or Linux using conda-installed gdal.

I have opened an issue ticket on the gdal-feedstock channel: conda-forge/gdal-feedstock#376

@dbekaert
Collaborator Author

This has now been fixed in GDAL and should be returning the netcdf reader. See full discussion here: conda-forge/gdal-feedstock#376 (comment)

The next RC1 is planned for the end of April, after which testing will happen. Once a new GDAL version is released, the conda-forge GDAL package will pick it up and we should be able to leverage it directly.

@bbuzz31 do you have bandwidth for a test to confirm the reader works now?

  1. Linux kernel >=4.3 and libnetcdf >=4.5.
  2. GDAL needs to be built from source.
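A quick way to check the version requirements listed above could look like this. It is a sketch only: the helper name `meets_minimum` and the sample version strings are made up for illustration, and only the leading major.minor numbers are compared.

```python
import platform
import re


def meets_minimum(version, minimum):
    """Compare the leading 'major.minor' of a version string against a (major, minor) tuple."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("cannot parse version: {!r}".format(version))
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Components are compared numerically, so 4.15 ranks above 4.3.
print(meets_minimum("4.15.0-96-generic", (4, 3)))
print(meets_minimum("4.4.1", (4, 5)))

# On Linux, platform.release() returns the running kernel release string.
if platform.system() == "Linux":
    print(meets_minimum(platform.release(), (4, 3)))
```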

@bbuzz31
Collaborator

bbuzz31 commented Apr 19, 2020

@dbekaert unfortunately the highest version of Linux I have access to is 4.15

@dbekaert
Collaborator Author

@bbuzz31 I tested this on the opensarlabs server. I needed to create a .netrc file, and the netCDF reader is then used successfully:

In [2]: from osgeo import gdal
   ...: gdal.SetConfigOption('GDAL_HTTP_COOKIEFILE','cookies.txt')
   ...: gdal.SetConfigOption('GDAL_HTTP_COOKIEJAR', 'cookies.txt')
   ...: results = gdal.Info("/vsicurl/https://grfn.asf.alaska.edu/door/download/S1-GUNW-D-R-160-tops-20190710_20190628-162436-20935N_18926N-PP-6a53-v2_0_2.nc")
   ...: print(results)
Driver: netCDF/Network Common Data Format
Files: /vsicurl/https://grfn.asf.alaska.edu/door/download/S1-GUNW-D-R-160-tops-20190710_20190628-162436-20935N_18926N-PP-6a53-v2_0_2.nc
Size is 512, 512
Metadata:

@dbekaert
Collaborator Author

Closing this issue ticket now that S3 support has been integrated.
Thank you all for the discussion and contributions!
