add timeout to requests towards ETL API #690
Comments
Suggestion by @soxofaan: retry ETL API requests. https://github.com/eu-cdse/openeo-cdse-infra/issues/41 made it possible to retry ETL API requests without the risk of charging the user multiple times. The underlying problem there was a large process graph that couldn't fit in the job's ZNode; this prevented the job from being marked as completed, so it was picked up again in subsequent JobTracker runs and the user was charged again. The suggestion here is about retries within a single JobTracker run rather than across JobTracker runs, so it still makes sense.
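For illustration, a retry within a single JobTracker run could look roughly like the sketch below. This is not the code in openeo-geopyspark-driver: `call_with_retries`, its parameters and the exponential backoff are assumptions made up for this example, and `call` stands in for whatever function actually sends the ETL API request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: run `call` up to `attempts` times in a row.

    Waits delay_s, 2 * delay_s, 4 * delay_s, ... between failed attempts and
    re-raises the last exception once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                # Out of attempts: let the caller see the failure.
                raise
            sleep(delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Because all attempts happen inside one call, a failure that survives the retries is still visible to the caller, which can then decide whether to leave the job for the next JobTracker run.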
ETL API requests should already be retried in sync requests and in batch jobs, because of openeo-geopyspark-driver/openeogeotrellis/backend.py (lines 1453 to 1468 in c390aaf) and openeo-geopyspark-driver/openeogeotrellis/job_costs_calculator.py (lines 105 to 116 in c390aaf), respectively.
Could use a test.
JobTracker was hanging on Terrascope and CDSE and had to be killed. Last line in the logs was:
Adding a timeout to the requests towards the ETL API should unblock JobTracker.
Note: this does not solve the underlying problem; when the timeout is reached, the batch job succeeds but the user might not be charged.
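A minimal sketch of the idea, using only the Python standard library: the request gets an explicit timeout, and a timeout is logged and reported as `None` so the caller can carry on instead of hanging. The name `post_with_timeout`, the 60 second default, the JSON content type and the injectable `opener` parameter are assumptions for this example; the actual ETL API client in the driver may look quite different.

```python
import logging
import socket
import urllib.error
import urllib.request
from typing import Callable, Optional

_log = logging.getLogger(__name__)


def post_with_timeout(
    url: str,
    payload: bytes,
    timeout_s: float = 60.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[bytes]:
    """Hypothetical helper: POST `payload` to `url`, giving up after `timeout_s`.

    Returns the response body, or None if the request timed out.
    """
    request = urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    try:
        with opener(request, timeout=timeout_s) as response:
            return response.read()
    except (socket.timeout, TimeoutError):
        # Timeout while waiting for or reading the response.
        _log.warning("Request to %s timed out after %s s", url, timeout_s)
        return None
    except urllib.error.URLError as e:
        # urllib wraps connection-phase timeouts in URLError.
        if isinstance(e.reason, (socket.timeout, TimeoutError)):
            _log.warning("Request to %s timed out after %s s", url, timeout_s)
            return None
        raise
```

Returning `None` on timeout matches the caveat in the note above: the caller is unblocked, but nothing in this sketch guarantees that the user ends up being charged.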