[BUG] Sentinel-1 ESA Orbit Availability and Rate Limiting #610

Closed
cmarshak opened this issue Nov 7, 2023 · 13 comments · Fixed by #609

@cmarshak
Collaborator

cmarshak commented Nov 7, 2023

I am writing this issue ticket entirely second hand (based on a conversation with @jhkennedy and @asjohnston-asf). Moreover, this is an issue for cloud computing, where we make numerous parallel requests for orbits; it is not a bug for processing run locally.

Here is the issue:

  1. We are using https://github.com/scottstanie/sentineleof, which uses the new ESA interface described here: https://github.com/scottstanie/sentineleof#update-2023-10-31-changes-to-sentinel-1-orbit-files-source. Here is the line where we call it:
    path_orb = eof.download.download_eofs([dt], [sat], save_dir=orbit_dir)
  2. From my understanding, the new ESA interface is severely rate limited: 4 requests / minute (for a given user).

Here are the questions that need to be answered before the next processing campaign:

  1. Will https://github.com/scottstanie/sentineleof error out if there are too many parallel requests to ESA, even if it checks both ESA and ASF? @scottstanie will know.
  2. Should we just use the force-asf option if 1 is true? https://github.com/scottstanie/sentineleof/blob/master/eof/download.py#L48
  3. Are there any downsides to using force-asf?
  4. Or we could use https://github.com/ASFHyP3/hyp3-lib/blob/develop/hyp3lib/get_orb.py#L146, but this is a much heavier lift, not because of the code itself but because of mocking the tests.

This is not urgent, but will need to be resolved before the next processing campaign.

@cmarshak cmarshak added the bug Something isn't working label Nov 7, 2023
@scottstanie
Contributor

I think it's 4 concurrent connections, rather than 4 requests per minute:
https://documentation.dataspace.copernicus.eu/Quotas.html#copernicus-general-users

  • I haven't yet stress tested the ESA limit (in fact I see I didn't even make it download orbits in parallel yet)
  • AFAIK, the only downside I see to the ASF version is the large initial file download of the whole list (which is big for RESORBs, not too bad for POEORBs)
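
If the limit really is on concurrent connections, one way to stay under it inside a single process is to cap the worker pool. This is only a minimal sketch: `fetch_all`, `fetch_one`, and `MAX_CONNECTIONS` are hypothetical names rather than anything in sentineleof, and the value 4 is taken from the quota discussed above.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: the per-user concurrent-connection quota discussed above.
MAX_CONNECTIONS = 4


def fetch_all(items, fetch_one, max_workers=MAX_CONNECTIONS):
    """Run fetch_one over items with at most max_workers calls in flight.

    fetch_one is a hypothetical callable that downloads a single orbit file.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fetch_one, items))
```

Note that this only bounds connections opened by one process; separate cloud jobs running under the same user would still add up.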

@cmarshak
Collaborator Author

cmarshak commented Nov 7, 2023

@scottstanie - just to be clear - if the ESA request throws an error (say because there are > 4 requests), would sentinelEOF check ASF?

@jhkennedy
Collaborator

@scottstanie ASF has been hitting/playing around with getting orbits from the ESA interface, and right now, that limit is enforced like "4 downloads / minute / user". Even if we request them serially, with a brand new connection each time, we hit the rate limit if we don't limit ourselves to 4 downloads / minute.
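
For reference, self-limiting to "4 downloads / minute" on the client side could look roughly like the sliding-window limiter below. It is a sketch under assumptions: the class name and defaults are made up for illustration, it is not thread-safe, and it only helps when all requests go through one process.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls acquisitions in any period-second window."""

    def __init__(self, max_calls=4, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()  # start times of recent calls, oldest first

    def acquire(self):
        """Block until a slot is free, then record the call."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
```

Calling `limiter.acquire()` before each download would then space requests so that no 60-second window contains more than four of them.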

scottstanie added a commit to scottstanie/sentineleof that referenced this issue Nov 7, 2023
also download in paralel

may work to help dbekaert/RAiDER#610
@scottstanie
Contributor

> @scottstanie - just to be clear - if the ESA request throws an error (say because there are > 4 requests), would sentinelEOF check ASF?

Apparently it would just fail, haha. But it looks like scottstanie/sentineleof#54 is working.

> @scottstanie ASF has been hitting/playing around with getting orbits from the ESA interface, and right now, that limit is enforced like "4 downloads / minute / user". Even if we request them serially, with a brand new connection each time, we hit the rate limit if we don't limit ourselves to 4 downloads / minute.

That's interesting, I don't think that's happening to me right now... I seem to be able to download the same 5 orbits in my testing, even when I do it a few times in a row:

```
(mapping) staniewi:sentineleof$ eof --max-workers 3 && rm -f *EOF && eof --max-workers 3
[11/07 17:05:25] [INFO download.py] Downloading precise orbits for S1A on 2016-09-26
[11/07 17:05:25] [INFO download.py] Downloading precise orbits for S1A on 2023-10-13
[11/07 17:05:25] [INFO download.py] Downloading precise orbits for S1A on 2016-10-20
[11/07 17:05:25] [INFO download.py] Downloading precise orbits for S1A on 2015-09-01
[11/07 17:05:25] [INFO download.py] Downloading precise orbits for S1A on 2014-10-03
[11/07 17:05:25] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:25] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:26] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:27] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:28] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:29] [INFO download.py] Attempting download from SciHub
[11/07 17:05:33] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF
[11/07 17:05:33] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF
[11/07 17:05:36] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF
[11/07 17:05:37] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF
[11/07 17:05:38] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF
[11/07 17:05:38] [INFO download.py] Downloading precise orbits for S1A on 2023-10-13
[11/07 17:05:38] [INFO download.py] Downloading precise orbits for S1A on 2016-10-20
[11/07 17:05:38] [INFO download.py] Downloading precise orbits for S1A on 2016-09-26
[11/07 17:05:38] [INFO download.py] Downloading precise orbits for S1A on 2014-10-03
[11/07 17:05:38] [INFO download.py] Downloading precise orbits for S1A on 2015-09-01
[11/07 17:05:38] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:39] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:40] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:41] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:42] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 17:05:43] [INFO download.py] Attempting download from SciHub
[11/07 17:05:46] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF
[11/07 17:05:47] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF
[11/07 17:05:49] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF
[11/07 17:05:50] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF
[11/07 17:05:51] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF
```
Maybe it's because I'm trying the same files instead of all different ones?

But I do see 429s when I try 5-6 parallel downloads; the Dataspace request fails:

```
(mapping) staniewi:sentineleof$ eof --max-workers 6
[11/07 16:58:52] [INFO download.py] Downloading precise orbits for S1A on 2016-10-20
[11/07 16:58:52] [INFO download.py] Downloading precise orbits for S1A on 2023-10-13
[11/07 16:58:52] [INFO download.py] Downloading precise orbits for S1A on 2015-09-01
[11/07 16:58:52] [INFO download.py] Downloading precise orbits for S1A on 2014-10-03
[11/07 16:58:52] [INFO download.py] Downloading precise orbits for S1A on 2016-09-26
[11/07 16:58:52] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 16:58:53] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 16:58:54] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 16:58:55] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 16:58:56] [INFO dataspace_client.py] Querying for AUX_POEORB orbit files from endpoint https://catalogue.dataspace.copernicus.eu/odata/v1/Products
[11/07 16:58:57] [INFO download.py] Attempting download from SciHub
[11/07 16:59:01] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF
[11/07 16:59:01] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF
[11/07 16:59:02] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF
[11/07 16:59:06] [INFO dataspace_client.py] Orbit file downloaded to S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF
[11/07 16:59:06] [WARNING download.py] Failed due to too many requests: ('429 Client Error: Too Many Requests for url: https://zipper.dataspace.copernicus.eu/odata/v1/Products(a385d95a-b194-4c65-a154-4e8895b4f5f6)/$value',)
[11/07 16:59:06] [WARNING download.py] Dataspace failed, trying ASF
[11/07 16:59:10] [INFO asf_client.py] Using cached EOF list
[11/07 16:59:10] [INFO asf_client.py] https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF already exists, skipping download.
[11/07 16:59:10] [INFO asf_client.py] https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF already exists, skipping download.
[11/07 16:59:10] [INFO asf_client.py] https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF already exists, skipping download.
[11/07 16:59:10] [INFO asf_client.py] https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF already exists, skipping download.
[11/07 16:59:10] [INFO asf_client.py] Downloading https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF
[11/07 16:59:10] [INFO download.py] Finished https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF, saved to S1A_OPER_AUX_POEORB_OPOD_20210314T074113_V20161019T225943_20161021T005943.EOF
[11/07 16:59:10] [INFO download.py] Finished https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF, saved to S1A_OPER_AUX_POEORB_OPOD_20231102T080652_V20231012T225942_20231014T005942.EOF
[11/07 16:59:10] [INFO download.py] Finished https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF, saved to S1A_OPER_AUX_POEORB_OPOD_20210308T132525_V20150831T225943_20150902T005943.EOF
[11/07 16:59:10] [INFO download.py] Finished https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF, saved to S1A_OPER_AUX_POEORB_OPOD_20210304T032019_V20141002T225944_20141004T005944.EOF
[11/07 16:59:12] [INFO asf_client.py] Saving to S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF
[11/07 16:59:12] [INFO download.py] Finished https://s1qc.asf.alaska.edu/aux_poeorb/S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF, saved to S1A_OPER_AUX_POEORB_OPOD_20210313T234922_V20160925T225943_20160927T005943.EOF
```

scottstanie added a commit to scottstanie/sentineleof that referenced this issue Nov 8, 2023
* add catch for 429 error from CDSE

also download in paralel

may work to help dbekaert/RAiDER#610

* raise for other http errors

* bump version
@cmarshak
Collaborator Author

To summarize the conversation here for clarity: it appears (based on limited testing by ASF) that ESA's portal limits each user to 4 open connections. Because each connection is open for such a short time, this might not be an issue, but it's unclear how this will scale in a large processing campaign.

The newest release of sentineleof checks ESA's new portal first and, if too many connections are open, falls back to ASF.
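
The fallback behaviour described here can be pictured with the generic pattern below. This is not sentineleof's actual code: `TooManyRequests`, `download_with_fallback`, `primary`, and `fallback` are hypothetical names standing in for the 429 error and the two data sources.

```python
class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the primary source."""


def download_with_fallback(request, primary, fallback):
    """Try primary; if it is rate limited, retry the same request on fallback.

    Returns (result, source_name). Only the rate-limit error triggers the
    fallback; any other exception propagates unchanged to the caller.
    """
    try:
        return primary(request), "primary"
    except TooManyRequests:
        return fallback(request), "fallback"
```

Restricting the `except` clause to the rate-limit error keeps genuine failures (bad credentials, missing files) visible instead of silently masking them with the second source.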

@jhkennedy
Collaborator

jhkennedy commented Nov 14, 2023

>> @scottstanie - just to be clear - if the ESA request throws an error (say because there are > 4 requests), would sentinelEOF check ASF?

> apparently it would just fail haha. but it looks like scottstanie/sentineleof#54 is working

@scottstanie nice! That'd definitely be helpful.

>> @scottstanie ASF has been hitting/playing around with getting orbits from the ESA interface, and right now, that limit is enforced like "4 downloads / minute / user". Even if we request them serially, with a brand new connection each time, we hit the rate limit if we don't limit ourselves to 4 downloads / minute.

> that's interesting, I don't think that's happening to me right now... I seem to be able to download the same 5 orbits in my testing, even when I do it a few times in a row:

@scottstanie yes, this seems to have changed sometime around Nov. 3, and we're not seeing that issue in any of our pipelines anymore. We do see the limit of 4 open connections, as described by ESA and as you posted earlier.

For a HyP3-like application, that 4-connection limit could be a problem if we're firing 1000s of jobs at the same time, but our limited testing with 100s of jobs hasn't hit the issue, I think because connections are open for such a short time. @cmarshak I think we can close this one and re-open it if we do see issues in production.
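
If thousands of independent jobs did start hitting the limit, one common mitigation that needs no coordination between jobs is retrying with exponential backoff and jitter. The sketch below is illustrative only: `RateLimited` and `retry_with_backoff` are hypothetical names, and the delay values are assumptions, not tuned numbers.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, rng=random.random):
    """Call func(); on RateLimited, wait and retry with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Cap the exponential delay, then scale it by a random factor in
            # [0, 1) so jobs started together do not all retry at once.
            delay = min(max_delay, base_delay * 2 ** attempt) * rng()
            sleep(delay)
```

The random factor matters more than the exact delays here: without it, jobs launched in the same second would retry in the same second too.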

@cmarshak
Collaborator Author

Isn't the fix just putting a lower bound on the sentineleof version?

@jhkennedy
Collaborator

jhkennedy commented Nov 14, 2023

@cmarshak yes, I'll set the minimum bound of sentineleof to 0.9.5 or greater in #609
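
Besides pinning the dependency, a runtime guard against an older install could be sketched as follows. The helper names are hypothetical, and the parser is deliberately simplistic: it assumes plain dotted numeric versions such as `0.9.5` and would raise `ValueError` on pre-release suffixes.

```python
def parse_version(text):
    """Turn a dotted release string such as '0.9.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum="0.9.5"):
    """Return True if the installed version is at least the minimum.

    Tuples compare element by element, so (0, 10, 0) >= (0, 9, 5) is True
    even though the strings '0.10.0' and '0.9.5' would compare the other way.
    """
    return parse_version(installed) >= parse_version(minimum)
```

The installed version string itself could come from the standard library, e.g. `importlib.metadata.version("sentineleof")`.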

@jhkennedy
Collaborator

jhkennedy commented Nov 15, 2023

@cmarshak and we'll need to update hyp3lib to v2.0.2 or greater... I'll do the same in #609.

We also should pick one way to get orbits; hyp3lib or sentineleof, but I may punt that to a follow-on PR as #609 is getting a bit large.

@jhkennedy
Collaborator

Alright, in #609, I've dropped hyp3lib in favor of sentineleof as hyp3lib requires numpy <1.24, which prevents creating a Python 3.12 RAiDER environment.

@cmarshak
Collaborator Author

cmarshak commented Nov 16, 2023

Uggg... it's going to be a decent lift then because all the S1 interpolation is so much easier querying for orbits based on S1 ids... maybe make hyp3-orbit-lib?

Let me know how I can help.

To be clear, all the interpolation using s1 orbits is built as "scaffolding" (outside of main function calls) because I don't want to actually touch the core raider code.

@jhkennedy
Collaborator

@cmarshak
Collaborator Author

cmarshak commented Nov 16, 2023

> @cmarshak I think this function solves that concern:
> https://github.com/dbekaert/RAiDER/pull/609/files#diff-0f23d66e0bff31ea8a9e6c6573d0c33605996a5e230004d08fd4ac5d501fbf2fR49-R68

Much easier than I thought - or you just did it so quickly and elegantly.

As a reminder to @asjohnston-asf of the discussion we had previously: I remember there are annoying layers here:

(time stamp, geospatial area) ---> (slc_ids) ---> (orbits)

The first ---> is asf_search and the second ---> is now sentineleof.

So, it would be nice to have a single arrow: one search for orbits that returns the files along with geospatial metadata describing the spatial extent.
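
To make the second arrow concrete: an orbit lookup needs a mission and a datetime (as in the `download_eofs([dt], [sat], ...)` call in the original post), and both can be read off an SLC ID. The sketch below assumes the standard Sentinel-1 product naming layout; the function name and the example ID are made up for illustration and are not the function in #609.

```python
import re
from datetime import datetime

# Assumed layout: MMM_BB_SLC__LFPP_<start>_<stop>_<orbit>_<datatake>_<id>
_SLC_ID = re.compile(
    r"^(?P<mission>S1[A-Z])_[A-Z0-9]{2}_SLC__[A-Z0-9]{4}_"
    r"(?P<start>\d{8}T\d{6})_(?P<stop>\d{8}T\d{6})_"
)


def slc_id_to_orbit_query(slc_id):
    """Return (mission, acquisition start) parsed from a Sentinel-1 SLC ID."""
    match = _SLC_ID.match(slc_id)
    if match is None:
        raise ValueError(f"not a recognised Sentinel-1 SLC ID: {slc_id!r}")
    start = datetime.strptime(match["start"], "%Y%m%dT%H%M%S")
    return match["mission"], start


# Illustrative (invented) ID, not a real product:
example = "S1A_IW_SLC__1SDV_20231013T140000_20231013T140027_050743_061D4A_1A2B"
mission, start = slc_id_to_orbit_query(example)
```

A single-arrow search would still need the first step (time stamp and area to SLC IDs) to supply the spatial extent; this only covers the ID-to-orbit-query half.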
