Document machine-to-machine auth options in openeo #613

Closed
wants to merge 4 commits into from
33 changes: 9 additions & 24 deletions APIs/openEO/Python_Client/Python.qmd
@@ -136,34 +136,18 @@ just like this:
 connection.authenticate_oidc()
 ```
 
-- By default, the first time you call this `authenticate_oidc()` method,
-a URL will be printed. Something like for example:
+- By default, the first time `authenticate_oidc()` is called,
+instructions to visit a certain URL will be printed, e.g.:
 
 ```
-Visit https://auth.example.com/device?user_code=EAXD-RQXV to authenticate.
+Visit https://auth.example/?user_code=EAXD-RQXV to authenticate.
 ```
 
-Visit this URL (click it or copy-paste it into your web browser)
-and follow the login flow using your Copernicus Data Space Ecosystem credentials.
-
-::: {.callout-tip collapse="true"}
-
-You can visit this URL with any browser you prefer to complete the login procedure
-(e.g. on your laptop or smartphone).
-It does *not* have to be a browser running on the same machine/network as your Python script/application.
-
-:::
-
-Once the authentication is completed, your Python script will receive
-the necessary authentication tokens and print
-
-```
-Authorized successfully.
-```
+Visit this URL and follow the login flow
+using your Copernicus Data Space Ecosystem credentials.
 
 - Other times, when you still have valid (refresh) tokens on your system,
-it will not be necessary to go through the Copernicus Data Space Ecosystem login steps
-and you will immediately see
+the manual login process is skipped and you will immediately see
 
 ```
 Authenticated using refresh token.
@@ -172,8 +156,9 @@ connection.authenticate_oidc()
 In any case, your `connection` is now authenticated and capable to make download/processing requests.
 
 
-A more in-depth discussion of various authentication concepts is available
-in the [openEO Python client documentation](https://open-eo.github.io/openeo-python-client/auth.html){target="_blank"}.
+Find more in-depth information on authentication
+[here](../authentication.qmd)
+or in the [openEO Python client documentation](https://open-eo.github.io/openeo-python-client/auth.html){target="_blank"}.
 
 
 ## Working with Datacube
220 changes: 220 additions & 0 deletions APIs/openEO/authentication.qmd
@@ -0,0 +1,220 @@
---
title: "Authentication in openEO"
---

While basic discovery of openEO collections and processes is possible without authentication,
executing openEO workflows requires user authentication.
This is necessary to manage user quotas, resources, and credit consumption effectively.
User authentication in openEO is handled with the OpenID Connect protocol (often abbreviated as "OIDC").

The openEO endpoint of the Copernicus Data Space Ecosystem is configured to use
the identity provider service of the Copernicus Data Space Ecosystem.
It is therefore recommended to complete the
[Copernicus Data Space Ecosystem registration](../../Registration.qmd)
before attempting to authenticate with openEO.



## Authentication Essentials With The openEO Python Client {#sec-typical-authentication-flow}

A typical Copernicus Data Space Ecosystem openEO workflow,
using the openEO Python client library,
starts with setting up a connection like this:

```python
import openeo

connection = openeo.connect(url="openeo.dataspace.copernicus.eu")
connection.authenticate_oidc()
```

After connecting to the Copernicus Data Space Ecosystem openEO endpoint,
[`Connection.authenticate_oidc()`](https://open-eo.github.io/openeo-python-client/api.html#openeo.rest.connection.Connection.authenticate_oidc){target="_blank"} initiates the OpenID Connect authentication flow:

- By default, the first time `authenticate_oidc()` is called,
instructions to visit a certain URL will be printed, e.g.:

```
Visit https://auth.example/?user_code=EAXD-RQXV to authenticate.
```

Visit this URL (click or copy-paste it into a web browser) and
follow the login flow using Copernicus Data Space Ecosystem credentials.

::: {.callout-tip collapse="false"}

Visit this URL using any preferred browser to complete the login procedure
(e.g., on a laptop or smartphone).
It does *not* need to be a browser on the same machine or network as the Python script/application.

:::

Once authentication is complete,
the Python script will receive the necessary authentication tokens and print

```
Authorized successfully.
```

- Other times, when valid (refresh) tokens are still present on the system,
the openEO Python client library will automatically use these tokens and
bypass the manual login process. The following will be visible immediately:

```
Authenticated using refresh token.
```

In any case, `connection` is now authenticated and ready to make download or processing requests.
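
To check which account a connection ended up authenticated as, the account metadata can be queried.
The following is a minimal sketch, assuming an authenticated `openeo.Connection` is passed in.
The helper name `describe_identity` is hypothetical;
`describe_account()` is the client method that queries the `/me` endpoint of the back-end.

```python
def describe_identity(connection) -> str:
    """Return a short label for the account behind an authenticated connection.

    `connection` is assumed to be an authenticated `openeo.Connection`.
    """
    # Account metadata as returned by the `/me` endpoint of the back-end.
    account = connection.describe_account()
    # `user_id` is a required field in the openEO API, `name` is optional.
    return account.get("name") or account["user_id"]
```

For example, `print(describe_identity(connection))` can be called right after `connection.authenticate_oidc()`.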


## Alternative Authentication Methods

The [openEO Python client library documentation](https://open-eo.github.io/openeo-python-client/auth.html){target="_blank"}
provides more in-depth information on various authentication concepts
and discusses alternative authentication methods.


## Non-interactive And Machine-to-Machine Authentication

@sec-typical-authentication-flow describes the typical authentication flow for interactive use cases,
e.g. when working in a Jupyter Notebook or manually running a script on a local machine.
This section discusses authentication approaches
that are better suited to *non-interactive* and *machine-to-machine* use cases.

::: {.callout-note}
The practical aspects will be based on the openEO Python client library,
but the concepts are also generally applicable to other openEO client libraries.
:::


### Refresh Tokens

Refresh tokens are long-lived tokens (on the order of weeks or months)
that can be used to obtain new access tokens
without requiring the user to re-authenticate.
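
At the protocol level, exchanging a refresh token for a new access token
is a single form-encoded POST request to the token endpoint of the identity provider,
as defined by OAuth 2.0 (RFC 6749, section 6).
The sketch below only builds the request body to illustrate this.
It is not how the openEO Python client is implemented,
the function name is hypothetical,
and passing `client_id` in the body assumes a public client without a secret.

```python
from urllib.parse import urlencode


def build_refresh_request_body(refresh_token: str, client_id: str) -> str:
    """Build the form-encoded body of an OAuth 2.0 refresh token grant.

    Illustration only: assumes a public client, identified by `client_id` in the body.
    """
    return urlencode(
        {
            "grant_type": "refresh_token",  # fixed value for this grant type
            "refresh_token": refresh_token,
            "client_id": client_id,
        }
    )
```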

As mentioned in @sec-typical-authentication-flow,
[`Connection.authenticate_oidc()`](https://open-eo.github.io/openeo-python-client/api.html#openeo.rest.connection.Connection.authenticate_oidc){target="_blank"}
requires the user to go through an interactive login flow in a browser
when no valid refresh token is available.
However, when the openEO Python client library can find a valid refresh token on the system,
authentication becomes a non-interactive operation.
This makes refresh tokens a viable option for non-interactive and machine-to-machine authentication,
provided it is feasible to produce a new refresh token once in a while through an interactive login flow.

To get this working, there are basically two aspects to cover
(both of which have built-in support in the openEO Python client library):

1. Obtain and store a new refresh token.

There are several authentication methods on the `Connection` object
(e.g. the often used `authenticate_oidc()` method),
and most of these have a `store_refresh_token` option
that enables storing the refresh token obtained during the authentication process.
Note that this option is enabled by default in `authenticate_oidc()`,
but not in `authenticate_oidc_device()`.

By default, the refresh token is stored in a private JSON file in a folder within the personal data directory (typically determined by environment variables such as `XDG_DATA_HOME` on Linux or `APPDATA` on Windows). The folder can also be configured directly using the `OPENEO_CONFIG_HOME` environment variable.
The actual location can be verified
with the [`openeo-auth` command line tool](https://open-eo.github.io/openeo-python-client/auth.html#auth-config-files-and-openeo-auth-helper-tool){target="_blank"}.


2. Load and use the refresh token.

When a valid refresh token is stored in a location accessible to the openEO Python client library,
the user can authenticate directly with:

```python
connection.authenticate_oidc_refresh_token()
```

Alternatively:

- If a user wants to keep the logic generic, the following can also be used:

```python
connection.authenticate_oidc()
```

This method will first try to use a refresh token if available,
and fall back on other methods (e.g. device code flow) otherwise.

- If a user manages the storage and loading of the refresh token, it can be explicitly passed:

```python
connection.authenticate_oidc_refresh_token(
    refresh_token=your_refresh_token,
)
```

::: {.callout-tip}
## Advanced refresh token storage

For advanced use cases, it is also possible to override the file-based
refresh token storage with a custom implementation through the
`refresh_token_store` parameter of
[`Connection`](https://open-eo.github.io/openeo-python-client/api.html#openeo.rest.connection.Connection){target="_blank"}.

:::
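
For unattended scripts it can be preferable to fail fast
rather than fall back on an interactive flow that nobody is around to complete.
The following sketch, under that assumption,
separates the occasional interactive step from the non-interactive one.
Both helper names are hypothetical,
and the broad `except` is deliberate because the exact exception raised
for a missing or expired refresh token is not assumed here.

```python
def bootstrap_refresh_token(connection) -> None:
    """Interactive step, to be run by a person once in a while."""
    # Device code flow does not store the refresh token by default,
    # so storing is enabled explicitly here.
    connection.authenticate_oidc_device(store_refresh_token=True)


def authenticate_non_interactive(connection) -> None:
    """Authenticate with a stored refresh token, failing fast instead of prompting."""
    try:
        connection.authenticate_oidc_refresh_token()
    except Exception as exc:  # broad on purpose, see the note above
        raise RuntimeError(
            "No usable refresh token found; run the interactive bootstrap step first."
        ) from exc
```

A scheduled job would then only call `authenticate_non_interactive(connection)`,
while `bootstrap_refresh_token(connection)` is run manually whenever the stored token has expired.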



### Client Credentials Flow


OpenID Connect also supports the so-called "client credentials flow",
which is a non-interactive flow, based on a client id and a secret,
that is suitable for machine-to-machine authentication.

The openEO Python client library has built-in support for this flow,
with the [`authenticate_oidc_client_credentials()` method](https://open-eo.github.io/openeo-python-client/auth.html#oidc-authentication-client-credentials-flow){target="_blank"}:

```python
connection.authenticate_oidc_client_credentials(
    client_id=...,
    client_secret=...,  # Note: Do not hard-code your secret here!
)
```
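
One way to keep the secret out of the source code is to read it from environment variables.
The sketch below assumes the variable names `OPENEO_AUTH_CLIENT_ID` and `OPENEO_AUTH_CLIENT_SECRET`,
which are the names the openEO Python client documentation describes for its environment variable support;
verify them against the installed client version.
The helper name `load_client_credentials` is hypothetical.

```python
import os


def load_client_credentials(env=None) -> dict:
    """Read the client id and secret from environment variables.

    The variable names are an assumption, see the accompanying text.
    """
    env = os.environ if env is None else env
    names = {
        "client_id": "OPENEO_AUTH_CLIENT_ID",
        "client_secret": "OPENEO_AUTH_CLIENT_SECRET",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        # Only variable names are reported, never their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The result can be passed on as `connection.authenticate_oidc_client_credentials(**load_client_credentials())`.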

#### Obtaining Client Credentials

::: {.callout-warning}
The use of client credentials in openEO is an experimental feature.
It is not widely supported across openEO backends,
and the setup procedure is not yet fully standardized or streamlined.
:::

The Sentinel Hub service in the Copernicus Data Space Ecosystem
has a [dashboard web app](https://shapps.dataspace.copernicus.eu/dashboard)
and under the account settings there is a self-service feature to
[register your own OAuth client](../SentinelHub/Overview/Authentication.qmd#registering-oauth-client){target="_blank"}.
The client id and client secret obtained here can also be used
for the client credentials flow with the openEO service of the Copernicus Data Space Ecosystem.

#### Caveats And Considerations

- Treat the client secret securely, like a password.
Take extra care to not leak it accidentally.
For example, the simplicity of the `authenticate_oidc_client_credentials()` example snippet above
might make it tempting to hard-code the client secret in scripts or notebooks,
where it can end up permanently stored in version control repositories.

Instead, read the client secret from a secure location
(e.g. a private file outside the reach of version control repositories),
or leverage environment variables (e.g. as directly [supported by the openEO Python client library](https://open-eo.github.io/openeo-python-client/auth.html#oidc-client-credentials-using-environment-variables){target="_blank"}).
- The client credentials only identify an OAuth client, not a personal user account.

- This means that openEO resources such as batch jobs, their results, user-defined processes (UDPs), etc.
from one identity are not available to the other.
For example, batch jobs created with client credentials cannot be listed when using a personal account.
- Likewise, the balances of processing credits are separate.
However, it is possible to link the balance of the client credentials account to a personal account. To enable this, contact support and provide the client ID and user ID.

- The client credentials flow is not supported on the
[Copernicus Data Space Ecosystem openEO web editor](https://openeo.dataspace.copernicus.eu/).
Because of the separate identities mentioned above,
this means in practice that the progress and
status of batch jobs created with client credentials cannot be tracked in the web editor.
However, it is still possible to approximate the batch job overview of the web editor
with a Jupyter Notebook using the openEO Python client library.
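
As a rough sketch of such an overview,
the helper below formats job metadata into one line per job, newest first.
It assumes each entry is a dictionary with the `id`, `status`, `created` and optional `title` fields
of the openEO job listing, as returned by `connection.list_jobs()`.
The helper name `summarize_jobs` is hypothetical.

```python
def summarize_jobs(jobs) -> list:
    """Format batch job metadata as one text line per job, newest first."""
    # ISO 8601 UTC timestamps in the same format sort chronologically as plain strings.
    ordered = sorted(jobs, key=lambda job: job.get("created", ""), reverse=True)
    return [
        " | ".join(
            [
                job["id"],
                job.get("status", "unknown"),
                job.get("created", ""),
                job.get("title") or "",  # title is optional and may be None
            ]
        )
        for job in ordered
    ]
```

In a notebook, with a connection authenticated through client credentials,
`print("\n".join(summarize_jobs(connection.list_jobs())))` gives a simple status listing.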


2 changes: 2 additions & 0 deletions _quarto.yml
@@ -118,6 +118,8 @@ website:
 text: openEO Processes
 - href: "APIs/openEO/File_formats.qmd"
 text: File Formats
+- href: "APIs/openEO/authentication.qmd"
+text: Authentication
 - href: "https://open-eo.github.io/openeo-python-client"
 text: Python documentation
 target: "_blank"