Support virtual file access #138

Closed
dbekaert opened this issue Apr 12, 2020 · 6 comments
Labels: enhancement (New feature or request)


@dbekaert
Collaborator

dbekaert commented Apr 12, 2020

This issue tracks the discussion and the changes needed to support virtual data access directly from the ASF S3 bucket.

ariaDownload.py - complete

Add an option in ariaDownload.py to output a list of product paths with respect to ASF S3.
The output should be parseable by the other aria*.py functions.
- Complete

ARIAproduct.py - complete

Remove the dependency on the local netCDF reader. The following locations should be updated:

  1. product variable mapping:
    We could hard-code the data mapping for the different product versions and remove the need to search the netCDF file for the mapping. This should also speed up the process considerably, at the cost of maintaining hard-coded mappings.

    def __mappingData__(self, file, rmdkeys, sdskeys):

  2. Pair-name retrieval:
    The pair name is currently retrieved from the netCDF using the original SLC granules for reference and secondary.
    I suggest extracting it directly from the product name instead.

    read_file=self.netCDF4.Dataset(file, keepweakref=True).groups['science'].groups['radarMetaData'].groups['inputSLC']

  3. Product without group support
    I believe this can be removed:

    # If netcdf with nogroups
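
As a rough illustration of item 2, the pair name could be parsed from the product file name with a regular expression. This is only a sketch: `pair_name_from_product` and `PAIR_PATTERN` are hypothetical names, not part of ARIA-tools, and it assumes the file name always contains a single `YYYYMMDD_YYYYMMDD` field delimited by hyphens, as in the GUNW names quoted in this thread. The date order is kept exactly as it appears in the file name.

```python
import re
from pathlib import PurePosixPath

# Two 8-digit dates joined by an underscore and delimited by hyphens,
# e.g. "-20190512_20190406-" (assumed naming convention).
PAIR_PATTERN = re.compile(r"-(\d{8})_(\d{8})-")


def pair_name_from_product(path_or_url):
    """Return the 'YYYYMMDD_YYYYMMDD' pair name found in a product file name.

    Hypothetical helper: works on a bare file name, a local path, or a URL,
    because only the last path component is inspected.
    """
    fname = PurePosixPath(path_or_url).name
    match = PAIR_PATTERN.search(fname)
    if match is None:
        raise ValueError(f"no date pair found in {fname!r}")
    return f"{match.group(1)}_{match.group(2)}"
```

For the product quoted later in this thread, `pair_name_from_product("S1-GUNW-A-R-004-tops-20190512_20190406-230629-37655N_35779N-PP-3ea0-v2_0_2.nc")` gives `20190512_20190406` without opening the file.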

Testing of the extract modules - in progress

The call to GDAL should work directly as long as the path handling is done correctly.
We could leverage S3 or vsi access; see the examples in #27.

Once the two items above are implemented, this will need some testing.
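
A minimal sketch of the path handling mentioned above, assuming products arrive either as local paths or as http(s) URLs. `to_gdal_path` is a hypothetical helper name, not an ARIA-tools function. It prefixes URLs with `/vsicurl/` and optionally builds a netCDF subdataset string of the form `NETCDF:"<file>":<layer>`; on GDAL builds that detect these files as HDF5 the subdataset syntax differs and this helper would need adjusting.

```python
def to_gdal_path(product, layer=None, driver="NETCDF"):
    """Build a path string that GDAL can open for a local or remote product.

    Hypothetical helper. `layer` is a variable path inside the file,
    for example "/science/grids/data/amplitude".
    """
    path = product.strip()
    # Remote products are read through GDAL's curl-based virtual file system.
    if path.startswith(("http://", "https://")):
        path = "/vsicurl/" + path
    if layer is None:
        return path
    # Subdataset syntax used by the netCDF driver: DRIVER:"file":variable
    return f'{driver}:"{path}":{layer}'
```

The returned string can then be passed to `gdal.Open` or `gdal.Translate` unchanged, so the rest of the extraction code does not need to know whether a product is local or virtual.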

@bbuzz31
Collaborator

bbuzz31 commented Apr 15, 2020

#140 added an option in ariaDownload.py for returning urls to GUNW products at ASF that can be directly accessed via vsicurl

let me know if this isn't as expected

@dbekaert dbekaert added this to the S3 support milestone Apr 16, 2020
@dbekaert
Collaborator Author

@bbuzz31 ariaDownload.py with url option works as expected.

We need to resolve #27, because the conda build of GDAL recognizes netCDF files as h5 files on macOS (and also on Linux in my case), although Linux should be working.
However, a test that assumes h5 does work for translating the amplitude out of the file:

from osgeo import gdal
gdal.SetConfigOption('GDAL_HTTP_COOKIEFILE', 'cookies.txt')
gdal.Translate("test.png", 'HDF5:/vsicurl/https://grfn.asf.alaska.edu/door/download/S1-GUNW-A-R-004-tops-20190512_20190406-230629-37655N_35779N-PP-3ea0-v2_0_2.nc://science/grids/data/amplitude', options=gdal.TranslateOptions(format="PNG"))

test

@dbekaert
Collaborator Author

Adding support in ARIAProducts:

           fname   = '/vsicurl/{}'.format(fname)
> from osgeo import gdal
fname='S1-GUNW-A-R-014-tops-20190718_20190706-152707-15346N_13427N-PP-1bc9-v2_0_2.nc'
test=gdal.Open(fname)
test.GetDriver().GetDescription()
'netCDF'

This means that we need to move part of the code (L110-116) here.
It also avoids setting the configuration multiple times.

  • Remove the URL test and the configuration setting from the readproduct function;
    keep only line 119 from lines 110-123.

@dbekaert
Collaborator Author

@sssangha and I are testing the virtual aspects manually in OpenSARlab. With the above changes we should be able to do more testing @rzinke @ehavazli @bbuzz31

@dbekaert
Collaborator Author

dbekaert commented Apr 23, 2020

Successful validation of the VSI operations

Setup for opensarlab testing of virtual data access.

  1. Clone the branch from @sssangha
git clone https://github.com/sssangha/ARIA-tools.git
cd ARIA-tools
git checkout sss_urlupdate
git branch   # confirm that sss_urlupdate is the active branch

Note that this has now all been merged and integrated into the dev branch.

  2. Edit ~/.bashrc and prepend the cloned directory to PYTHONPATH
vim ~/.bashrc
export PYTHONPATH="/home/jovyan/ARIA-tools/tools":$PYTHONPATH
  3. Source the new .bashrc and verify that the path is correct
source ~/.bashrc

Testing the virtual data access code

  1. You need a .netrc file with your Earthdata login information.

  2. Download URL for Hawaii:

ariaDownload.py -t 124 -w products --ifg 20180408_20180502 -o url
https://api.daac.asf.alaska.edu/services/search/param?asfplatform=Sentinel-1%20Interferogram%20(BETA)&processingLevel=GUNW_STD&output=JSON&relativeOrbit=124
Wrote -- 2 -- product urls to: /home/jovyan/products/download_products_1240.txt
  3. Run the TS setup and extraction of all layers:
ariaTSsetup.py  -f "products/download_products_1240.txt" -v
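
Before running the steps above, the .netrc requirement from step 1 can be checked with Python's standard `netrc` module. This is a sketch under assumptions: `has_earthdata_login` is a hypothetical helper, and the host name `urs.earthdata.nasa.gov` is assumed to be the Earthdata login machine entry; neither is stated in this thread.

```python
import netrc
from pathlib import Path

# Assumption: the Earthdata login entry is stored under this machine name.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_login(netrc_path=None, host=EARTHDATA_HOST):
    """Return True if the .netrc file holds credentials for `host`.

    Hypothetical helper; defaults to ~/.netrc when no path is given.
    """
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.is_file():
        return False
    try:
        auth = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        # A malformed file is treated the same as missing credentials.
        return False
    return auth is not None
```

Calling `has_earthdata_login()` before starting a long extraction gives an early, readable failure instead of an HTTP authentication error from inside GDAL.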

Output of the command:

ariaTSsetup.py  -f "products/download_products_1240.txt" -v
***Time-series Preparation Function:***
Multi-core version
All (2) GUNW products meet spatial bbox criteria.
Group GUNW products into spatiotemporally continuous interferograms.
All (1) interferograms are spatially continuous.
Thread count specified for gdal multiprocessing = 2
Download/cropping DEM
Downloaded 3 arc-sec SRTM DEM here: ./DEM/SRTM_3arcsec.dem

Extracting unwrapped phase, coherence, and connected components for each interferogram pair
Generating: unwrappedPhase - [==================================================] 20180502_20180408
Generating: coherence - [==================================================] 20180502_20180408

Extracting single incidence angle, look angle and azimuth angle files valid over common interferometric grid
Generating: incidenceAngle - [==================================================] 20180502_20180408
Generating: lookAngle - [==================================================] 20180502_20180408
Generating: azimuthAngle - [==================================================] 20180502_20180408
Creating directory: ./stack
Number of coherence discovered:  1
[==================================================] 20180502_20180408
cohStack : stack generated
Directory ./stack already exists.
Number of connectedComponents discovered:  1
[==================================================] 20180502_20180408
connCompStack : stack generated
Directory ./stack already exists.
Number of unwrapped interferograms discovered:  1
[==================================================] 20180502_20180408
unwrapStack : stack generated

Overview of the generated files:

jovyan@jupyter-dbekaert:~/demo2$ ls
azimuthAngle  connectedComponents  DEM             lookAngle           products  unwrappedPhase
coherence     cookies.txt          incidenceAngle  productBoundingBox  stack
jovyan@jupyter-dbekaert:~/demo2$ ls *
cookies.txt

azimuthAngle:
20180502_20180408  20180502_20180408.aux.xml  20180502_20180408.hdr  20180502_20180408.vrt

coherence:
20180502_20180408_uncropped.vrt  20180502_20180408.vrt

connectedComponents:
20180502_20180408  20180502_20180408.aux.xml  20180502_20180408.hdr  20180502_20180408.vrt

DEM:
SRTM_3arcsec.dem  SRTM_3arcsec.dem.aux.xml  SRTM_3arcsec.dem.vrt  SRTM_3arcsec.hdr  SRTM_3arcsec_uncropped.dem.vrt

incidenceAngle:
20180502_20180408  20180502_20180408.aux.xml  20180502_20180408.hdr  20180502_20180408.vrt

lookAngle:
20180502_20180408  20180502_20180408.aux.xml  20180502_20180408.hdr  20180502_20180408.vrt

productBoundingBox:
20180502_20180408.shp  productBoundingBox.shp

products:
download_products_1240.txt

stack:
cohStack.vrt  connCompStack.vrt  unwrapStack.vrt

unwrappedPhase:
20180502_20180408          20180502_20180408.hdr  20180502_20180408.png.aux.xml
20180502_20180408.aux.xml  20180502_20180408.png  20180502_20180408.vrt

Example of the unwrapped file:
Screen Shot 2020-04-22 at 5 14 44 PM

Example of a vrt gdal_translate :

gdal_translate -of png 20180502_20180408_uncropped.vrt 20180502_20180408_uncropped.png
jovyan@jupyter-dbekaert:~/demo2$ cd coherence/
gdal_translate -of png -scale 20180502_20180408_uncropped.vrt 20180502_20180408_uncropped.png
Input file size is 3686, 4286
Warning 6: PNG driver doesn't support data type Float32. Only eight bit (Byte) and sixteen bit (UInt16) bands supported. Defaulting to Byte

0...10...20...30...40...50...60...70...80...90...100 - done.

Example of the coherence file:
Screen Shot 2020-04-22 at 5 20 35 PM

Items to consider:

Setting the following environment variables ensures that the vrt and multi-core options do not fail:

export GDAL_HTTP_COOKIEFILE=/tmp/cookies.txt
export GDAL_HTTP_COOKIEJAR=/tmp/cookies.txt
export VSI_CACHE=YES
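  1. Accessing vsi multiple times in parallel can fail; see "vsicurl cache/concurrency issue when using multiple threads" (OSGeo/gdal#1244).
  2. If these are not set as environment variables, a translate run outside Python fails, because options set inside Python (e.g. with gdal.SetConfigOption) are only visible within the Python process.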
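
One way to address item 2 from within Python is to write the settings into `os.environ`, so that any `gdal_translate` launched as a child process inherits them. The sketch below uses the three variables listed above; `export_vsi_env` and `run_gdal_translate` are hypothetical helper names, and the `/tmp/cookies.txt` location is simply the value from this comment.

```python
import os
import subprocess

# Values taken from the export statements above.
VSI_ENV = {
    "GDAL_HTTP_COOKIEFILE": "/tmp/cookies.txt",
    "GDAL_HTTP_COOKIEJAR": "/tmp/cookies.txt",
    "VSI_CACHE": "YES",
}


def export_vsi_env(overrides=None):
    """Write the VSI settings into os.environ and return what was applied.

    Unlike options set only inside the Python process, real environment
    variables are inherited by child processes.
    """
    settings = {**VSI_ENV, **(overrides or {})}
    os.environ.update(settings)
    return settings


def run_gdal_translate(src, dst, fmt="PNG"):
    """Run gdal_translate as a child process with the VSI settings exported."""
    export_vsi_env()
    return subprocess.run(["gdal_translate", "-of", fmt, src, dst], check=True)
```

This does not replace documenting the variables in the installation guide; it only keeps a Python-driven workflow and its external GDAL calls consistent.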

@bbuzz31 @sssangha
ACTION: We should document the global variables and .netrc in the ARIA-tools installation guide for leveraging VSI, plus the minimum kernel and libgdal requirements.
ACTION: Can we set these environment variables as part of the install process (@piyushrpt)?

@dbekaert
Collaborator Author

README file has been updated with S3 support information.
Support has been included in the dev branch for S3 virtual data access.
Thank you all for the discussion and contributions.
This has been a big milestone!
