Integration of openEO process #2645

Open
santilland opened this issue Jul 25, 2024 · 30 comments
@santilland
Collaborator

santilland commented Jul 25, 2024

It is expected that (at least) 3 openEO processes will become available which should be integrated into GTIF.
The integration concept would be similar to the one done for ship detections in the RACE instance.

The first openEO process to become available is expected to be PV farm detections.
It still needs to be understood where and by whom the process will run "as a service".

If EOX should take care of integrating the openEO process, one possibility for on-boarding would be the Bring Your Own Algorithm (BYOA) approach.

@jamesemwheeler will coordinate the activity with EOX as to what will become available and potential integration steps that will need to be done in the future.

*edit: potential connection to #2396

@santilland santilland added the enhancement New feature or request label Jul 25, 2024
@santilland santilland added this to GTIF Jul 25, 2024
@santilland santilland moved this to Low Priority in GTIF Jul 25, 2024
@Patrick1G

Should be called as an openEO UDP as below:

inference_result = conn.datacube_from_process(
    process_id=UDP_ID,
    spatial_extent={
        "east": 16.414,
        "north": 48.008,
        "south": 47.962,
        "west": 16.342
    },
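
Since the snippet above is cut off, here is a minimal sketch of how a complete call might look with the openeo Python client. The helper name `run_udp` and the `temporal_extent` argument are assumptions (the parameter name follows the process graph posted later in this thread), and `conn` is expected to be an authenticated `openeo.Connection`.

```python
def run_udp(conn, udp_id, spatial_extent, temporal_extent, namespace=None):
    """Build a data cube from a user-defined process (UDP).

    `conn` is assumed to be an authenticated openeo.Connection.
    `temporal_extent` is an assumed parameter: the snippet in the thread
    is truncated before any further arguments.
    """
    return conn.datacube_from_process(
        process_id=udp_id,
        namespace=namespace,
        spatial_extent=spatial_extent,
        temporal_extent=temporal_extent,
    )


# Bounding box from the comment above (degrees; WGS84 is assumed).
SPATIAL_EXTENT = {"east": 16.414, "north": 48.008, "south": 47.962, "west": 16.342}

# Usage (needs `pip install openeo` and a back-end account):
#   import openeo
#   conn = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()
#   cube = run_udp(conn, UDP_ID, SPATIAL_EXTENT, ["2023-05-01", "2023-09-30"])
```

Taking the connection as an argument keeps the helper independent of how authentication is eventually handled.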

@Patrick1G

@santilland @jamesemwheeler in terms of user account and authentication,
we might want to create a GTIF openEO user account, so that future UDPs (wind mill detection) can also be run with it.
This would have to be requested via the NOR.

@santilland
Collaborator Author

@Patrick1G after discussion with our team, as you mentioned it would make most sense to create a dedicated eodashboard / GTIF project openEO account. We will look into making a NOR request for this.

@santilland
Collaborator Author

Hello @Patrick1G, we tried to use CDSE authentication (as we discussed) for openEO and to run the notebooks from the repository.

We used the UDP notebook (openeo_pv_farms_inference_udp.ipynb) to load the process into our openEO (CDSE) account. This seems to have worked, but calling the process (second part of the notebook) produces an error.
[image: screenshot of the error]
We also experimented with reducing the date range (keeping the same bounds as the original notebook), as the call was taking over 10 minutes.

@jamesemwheeler is there a way you can integrate the UDP into an account in a way that we can directly test it?

We also tried the UDF notebook (openeo_pv_farms_inference_udf.ipynb), as the readme suggests. It also produces errors when executed without changing the example (using the same variables).
[images: two screenshots of the errors]

@Patrick1G

Patrick1G commented Sep 25, 2024

Hi @santilland,
maybe we should post this issue on the user forum. VITO colleagues are very active there, and Michele, who developed the algorithm, also sees the posts: https://forum.openeo.cloud/

One possible cause is that you selected quite a short time period, while the UDP actually computes temporal metrics from the S2 data (min, max, median, etc.). So instead of defining a 5-day time period, try it with 4-8 weeks.
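
A small sketch of how a temporal extent of a given number of weeks could be derived from a start date. The helper name `temporal_extent` and the example start date are assumptions for illustration only.

```python
from datetime import date, timedelta


def temporal_extent(start, weeks):
    """Return [start, end] as ISO date strings, `weeks` weeks apart."""
    if weeks < 1:
        raise ValueError("weeks must be >= 1")
    first = date.fromisoformat(start)
    last = first + timedelta(weeks=weeks)
    return [first.isoformat(), last.isoformat()]


# Four and eight weeks starting on an example date.
print(temporal_extent("2023-07-01", 4))
print(temporal_extent("2023-07-01", 8))
```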

@santilland
Collaborator Author

@Patrick1G
we tried the example request, which spans multiple weeks ["2023-05-01", "2023-09-30"]; it runs for over 20 minutes. This is in any case an issue of its own, as we would need to implement handling of asynchronous jobs.
But we are still getting:

Trying to construct a datacube with a bounds Extent(600075.7422801706, 5312933.126721961, 605569.3834671596, 5318171.680470274) that is not entirely inside the global bounds: Extent(600090.7400086118, 5312948.124328609, 605554.3859890398, 5318156.68286709).

To me this sounds like a potential projection issue?
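
To put numbers on the message, the sketch below compares the two extents from the error. It assumes the values are ordered (xmin, ymin, xmax, ymax) and expressed in metres in a projected CRS; neither is stated in the error itself.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overshoot(requested, allowed):
    """Distance by which `requested` sticks out of `allowed` per side (0 if inside)."""
    return {
        "west": max(0.0, allowed.xmin - requested.xmin),
        "south": max(0.0, allowed.ymin - requested.ymin),
        "east": max(0.0, requested.xmax - allowed.xmax),
        "north": max(0.0, requested.ymax - allowed.ymax),
    }


# Values copied from the error message above.
requested = Extent(600075.7422801706, 5312933.126721961,
                   605569.3834671596, 5318171.680470274)
global_bounds = Extent(600090.7400086118, 5312948.124328609,
                       605554.3859890398, 5318156.68286709)

for side, distance in overshoot(requested, global_bounds).items():
    print(f"{side}: {distance:.2f}")
```

Under these assumptions the requested extent exceeds the global bounds by about 15 units on every side. A uniform margin like this might point to a buffer or padding step rather than a reprojection mismatch, but that is only a guess from the numbers.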

@jamesemwheeler did you run this successfully? Is my understanding correct that you have contact with the developer team?
Can you follow up with them?
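
Regarding the asynchronous job handling mentioned above, a minimal polling sketch could look as follows. It assumes `job` behaves like the openeo Python client's `BatchJob` (with `start()` and `status()`); the helper name, poll interval and timeout are hypothetical.

```python
import time

# Job states after which polling can stop.
TERMINAL_STATES = {"finished", "error", "canceled"}


def wait_for_job(job, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Start a batch job and poll it until it reaches a terminal state.

    `job` is assumed to offer start() and status(), like openeo's BatchJob.
    Returns the final status string or raises TimeoutError.
    """
    job.start()
    waited = 0.0
    while True:
        status = job.status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{status}' after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with the Python client (not executed here):
#   job = cube.create_job(title="PV farm detection")
#   if wait_for_job(job) == "finished":
#       job.get_results().download_files("output/")
```

The Python client also ships its own blocking helper (`start_and_wait()`); an explicit loop like this is mainly useful where a dashboard needs to report progress between polls.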

@clausmichele

Hi everyone, next week I will have a look at this!

@clausmichele

clausmichele commented Oct 1, 2024

So, I've just checked and there was indeed an error in the notebook https://github.com/clausmichele/openEO_photovoltaic/blob/main/udf_inference/openeo_pv_farms_inference_udf.ipynb
It was caused by an additional comma after the spatial_extent dictionary (see screenshot). Removing the comma solves the problem and it runs fine. I updated the notebook in the repository.
[image: screenshot of the notebook cell with the additional comma]
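
For context, a short illustration of why a stray comma can matter in Python. This assumes the comma followed an assignment of the dictionary to a variable; the screenshot is not available here, so the exact form in the notebook is not confirmed.

```python
# Without a trailing comma the name is bound to a dict.
spatial_extent = {"east": 16.414, "north": 48.008, "south": 47.962, "west": 16.342}

# With a stray trailing comma Python builds a one-element tuple instead.
spatial_extent_with_comma = {"east": 16.414, "north": 48.008, "south": 47.962, "west": 16.342},

print(type(spatial_extent).__name__)             # dict
print(type(spatial_extent_with_comma).__name__)  # tuple
```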

@santilland
Collaborator Author

Hello @clausmichele, thank you for your support! Removing the comma seems to let the process run further, but now we get the following error.

[image: screenshot of the error]

Any idea what the issue could be?

@clausmichele

Do you get this error running the notebook as is, cloned from the repo? Using openEO Platform or CDSE?

@santilland
Collaborator Author

The issue occurs on CDSE when running the notebook as is.

@clausmichele

Well, then it's something I can't help with. Probably it's better to ask on the forum: https://forum.dataspace.copernicus.eu/
Anyway, @HansVRP helped on the VITO side for the UDF part.

@Patrick1G

@clausmichele
Another thing to keep in mind: @santilland and colleagues are running the notebook from within EOxHub JupyterLab. This might make a difference in terms of client library versions (and other aspects?).

@clausmichele

Hi @Patrick1G, this really shouldn't matter, as long as the openeo library is up to date. I also tried with the same code connecting to CDSE and it fails for me too.

The code was developed within the openEO Platform project with VITO, please report this to them since they maintain both back-ends.

@santilland if you can, use openEO Platform (openeo.cloud) instead of CDSE as a workaround until they fix the problem.

@HansVRP

HansVRP commented Oct 3, 2024

As part of APEX we will be porting this UDP over in the next 2 weeks, so I will immediately take a look at where the error comes from.

@Patrick1G

@santilland any updates on running this with an openEO platform subscription?

@Patrick1G

> as part of APEX we will be porting this UDP over in the next 2 weeks. So I will immediately take a look at where the error comes from.

@HansVRP any updates on this investigation?

@HansVRP

HansVRP commented Oct 15, 2024

currently being worked on in:

ESA-APEx/apex_algorithms#34

@HansVRP

HansVRP commented Oct 15, 2024

Already managed to get a clean UDF running on the CDSE backend. It would need to be validated. @clausmichele what is the standard spatiotemporal extent you used for validation?

@HansVRP

HansVRP commented Oct 16, 2024

A working UDF and UDP can be found in the following PR (changes might still be made):

ESA-APEx/apex_algorithms#44

@Patrick1G

excellent news, and FYI @santilland @jamesemwheeler

@Patrick1G

Hi @santilland

any progress on testing the on-the-fly inference with openEO?
I did not see any sponsorship request, so I assume you will use the UDP in CDSE?

Let's please prioritize this issue.

@Patrick1G

Patrick1G commented Nov 11, 2024

@santilland @jamesemwheeler
I have a first benchmark with acceptable wall time for on-the-fly inference:
ID: vito-j-241111e5f2b74ea8b32422d85d54984e
Submitted: 11/11/2024, 2:51:17 PM UTC
Updated: 11/11/2024, 2:55:50 PM UTC
Image Dimensions: 132 × 213 (--> should be 2640x4260m --> 11.24km²)

Usage Metrics

  • CPU usage 2,779 cpu-seconds
  • Wall time 215 seconds
  • Input Pixel 4.125 mega-pixel
  • Memory usage 6,578,234 mb-seconds
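
A few derived figures from the numbers above, as a rough sketch. It assumes the commas in the metrics are thousands separators and that both timestamps refer to the same day; the variable names are only illustrative.

```python
from datetime import datetime

# Timestamps as posted above ("UTC" suffix dropped for parsing).
FMT = "%m/%d/%Y, %I:%M:%S %p"
submitted = datetime.strptime("11/11/2024, 2:51:17 PM", FMT)
updated = datetime.strptime("11/11/2024, 2:55:50 PM", FMT)

elapsed_s = (updated - submitted).total_seconds()
cpu_seconds = 2779   # "2,779 cpu-seconds"
wall_seconds = 215   # "215 seconds"
pixels = 132 * 213   # output image dimensions

print(f"submitted -> updated: {elapsed_s:.0f} s")
print(f"cpu-seconds per wall-second: {cpu_seconds / wall_seconds:.1f}")
print(f"output pixels: {pixels}")
```

The gap between the elapsed time and the reported wall time presumably covers queueing and start-up, though the thread does not say so.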

and the process graph:
{
  "process_graph": {
    "eurac_pv_farm_detection": {
      "arguments": {
        "spatial_extent": {
          "east": 12.971551618133702,
          "north": 41.77051819964615,
          "south": 41.759176626549845,
          "west": 12.946484944690127
        },
        "temporal_extent": ["2023-07-01", "2023-07-31"]
      },
      "description": "An openEO process developed by EURAC to detect photovoltaic farms, based on sentinel2 data.",
      "process_id": "eurac_pv_farm_detection",
      "result": true
    }
  }
}
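
For a client that has to produce such a graph per request, a minimal sketch of building it programmatically is shown below. The function name and the basic bounding-box check are assumptions; the structure mirrors the process graph above, minus the optional description.

```python
import json


def pv_farm_process_graph(west, south, east, north, start, end):
    """Return a process graph that calls the `eurac_pv_farm_detection` UDP."""
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")
    return {
        "process_graph": {
            "eurac_pv_farm_detection": {
                "process_id": "eurac_pv_farm_detection",
                "arguments": {
                    "spatial_extent": {
                        "west": west, "south": south, "east": east, "north": north,
                    },
                    "temporal_extent": [start, end],
                },
                "result": True,
            }
        }
    }


graph = pv_farm_process_graph(
    12.946484944690127, 41.759176626549845,
    12.971551618133702, 41.77051819964615,
    "2023-07-01", "2023-07-31",
)
print(json.dumps(graph, indent=2))
```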

@santilland
Collaborator Author

@Patrick1G @jamesemwheeler I am trying to save the new UDP in CDSE, but I get an error about parameters not being defined for the UDP. I see in the discussion of the APEX integration, ESA-APEx/apex_algorithms#44 (comment), that the UDP has some issues with this. Does the UDF need to be used instead?
@jamesemwheeler have you had any luck exploring this?
Can we in theory use the UDP now or not?

@Patrick1G

Hi @santilland
for me this worked without a problem using the openEO editor.

This is the URL that should be used for the UDP:
https://raw.githubusercontent.com/ESA-APEx/apex_algorithms/4003046e3b79ec3ab8dace888a231655db389d66/openeo_udp/eurac_pv_farm_detection/eurac_pv_farm_detection.json

We could connect briefly tomorrow to take a look at this together.

@santilland
Collaborator Author

@Patrick1G thanks for the quick reply and info. Yes, loading it directly in the web editor worked; strangely, I had issues loading it with the Python library. I managed to execute it successfully in CDSE that way, perfect.
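
One possible way to save a UDP from its JSON definition with the Python client is to pass the parameter definitions along explicitly, as sketched below. Whether missing parameters were the actual cause of the error reported earlier is not confirmed in this thread. The helper name is hypothetical, and the JSON is assumed to have the usual top-level keys `id`, `process_graph` and `parameters`.

```python
def udp_save_kwargs(definition):
    """Split a UDP JSON document into keyword arguments for saving it.

    Assumes top-level keys `id` and `process_graph`, and optionally
    `parameters`, `summary` and `description`.
    """
    kwargs = {
        "user_defined_process_id": definition["id"],
        "process_graph": definition["process_graph"],
        # Pass the parameter definitions explicitly so they stay attached to the UDP.
        "parameters": definition.get("parameters", []),
    }
    for key in ("summary", "description"):
        if key in definition:
            kwargs[key] = definition[key]
    return kwargs


# Usage with the Python client (not executed here; UDP_URL is the raw JSON URL):
#   import json, urllib.request
#   with urllib.request.urlopen(UDP_URL) as response:
#       definition = json.load(response)
#   conn.save_user_defined_process(**udp_save_kwargs(definition))
```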

@santilland
Collaborator Author

@Patrick1G @jamesemwheeler trying to summarize the current status and exploration.
As part of integrating the process call in the dashboard client, we explored the possibility of using the openeo js library. We had some initial issues with versions, but the main issue right now is the handling of authentication.
After some documentation/code forensics and trying to create client credentials, there is not really an easy way to execute a process without introducing an OpenID Connect user flow, where the user is directed to log in at the provider (e.g. CDSE). That is probably not trivial either, with handling redirects, etc. If this approach is wanted, we would need to discuss what it means in terms of implementation effort and whether it really provides value as a demonstrator.
If it is not an option, I think the only other way would be to set up a proxy service. This still has some open issues: how an OpenID client can be set up (headless authentication), how to make the UDP available to that client, how to set up the interface, etc. This is also not necessarily trivial.
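
As a rough idea of what the headless part of such a proxy could look like, the sketch below authenticates with OIDC client credentials read from the environment. The environment variable names and helper name are hypothetical, `conn` is assumed to be an `openeo.Connection`, and whether the back-end would issue client credentials for this purpose is exactly the open question.

```python
import os


def authenticate_headless(conn, env=os.environ):
    """Authenticate `conn` with OIDC client credentials taken from `env`.

    The variable names are hypothetical; `conn` is assumed to be an
    openeo.Connection.
    """
    try:
        client_id = env["GTIF_OPENEO_CLIENT_ID"]
        client_secret = env["GTIF_OPENEO_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"missing credential variable: {missing}") from None
    return conn.authenticate_oidc_client_credentials(
        client_id=client_id,
        client_secret=client_secret,
    )


# Usage inside a proxy service (not executed here):
#   import openeo
#   conn = authenticate_headless(openeo.connect("openeo.dataspace.copernicus.eu"))
```

Keeping the secret on the proxy side means the dashboard client never sees it, at the cost of operating the extra service.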

@Patrick1G

@santilland could you maybe request a bronze developer support package for openEO from VITO via the NOR? We need to make this work, and it might require some expertise/work from both sides.

For the on-the-fly execution/inference we have a certain credit volume that could be assigned to a "default GTIF user" that we pre-authenticate. This would of course need some quota constraints.
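
A quota constraint for a shared default user could, for example, cap the area per request and the number of requests. The sketch below is only an illustration: the class name, the limits and the area approximation are all assumptions, not an agreed design.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def bbox_area_km2(west, south, east, north):
    """Approximate area of a small lon/lat box (equirectangular approximation)."""
    mid_lat = math.radians((south + north) / 2.0)
    width_km = (east - west) * KM_PER_DEGREE * math.cos(mid_lat)
    height_km = (north - south) * KM_PER_DEGREE
    return width_km * height_km


class RequestQuota:
    """Allow a fixed number of requests, each limited to a maximum area."""

    def __init__(self, max_area_km2, max_requests):
        self.max_area_km2 = max_area_km2
        self.max_requests = max_requests
        self.used = 0

    def check(self, west, south, east, north):
        """Return the request area, or raise if a limit would be exceeded."""
        area = bbox_area_km2(west, south, east, north)
        if area > self.max_area_km2:
            raise ValueError(f"area {area:.1f} km2 exceeds {self.max_area_km2} km2")
        if self.used >= self.max_requests:
            raise RuntimeError("request quota exhausted")
        self.used += 1
        return area


quota = RequestQuota(max_area_km2=25.0, max_requests=100)  # hypothetical limits
quota.check(west=12.9465, south=41.7592, east=12.9716, north=41.7705)
```

A real deployment would more likely track consumed credits as reported by the back-end; counting requests is just the simplest stand-in.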

openEO Platform uses EGI Check-in, which would be preferred for now over the CDSE authentication (CDSE will use that layer, but not yet).

Projects
Status: In Progress