feat!: set the mode directly in the client #287

Merged
merged 8 commits into from
Sep 21, 2022
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed

- Algo categories are not checked anymore in local mode. Validations based on inputs and outputs are sufficient.
- BREAKING CHANGE: the backend type is now set in the Client, the env variable `DEBUG_SPAWNER` is not used anymore. Default value is deployed (#287)

API before

```sh
export DEBUG_SPAWNER=subprocess
```

```python
client = substra.Client(debug=True)
```

API after

```python
client = substra.Client(backend_type=substra.BackendType.LOCAL_SUBPROCESS)
```
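
For readers migrating existing scripts, the mapping from the old `debug` flag plus `DEBUG_SPAWNER` variable to the new `backend_type` argument can be sketched as below. This is a minimal, standalone illustration and not code from this PR: `legacy_to_backend_type` is a hypothetical helper, and the `BackendType` enum is a stand-in whose `docker` and `subprocess` values are inferred from the old `DEBUG_SPAWNER` settings (only `deployed` appears verbatim in the new signature).

```python
import enum
from typing import Optional


class BackendType(enum.Enum):
    """Stand-in for substra.BackendType; member values are inferred from this PR."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


def legacy_to_backend_type(debug: bool, debug_spawner: Optional[str] = None) -> BackendType:
    """Hypothetical helper: translate the old (debug, DEBUG_SPAWNER) pair."""
    if not debug:
        # debug=False always targeted the deployed platform.
        return BackendType.REMOTE
    # debug=True read DEBUG_SPAWNER and fell back to docker when it was unset.
    return BackendType(debug_spawner or BackendType.LOCAL_DOCKER.value)


# Old API: DEBUG_SPAWNER=subprocess with Client(debug=True)
print(legacy_to_backend_type(debug=True, debug_spawner="subprocess"))
```

The result is the value that would be passed as `backend_type` in the new API, under the assumptions stated above.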

## [0.37.0](https://github.com/Substra/substra/releases/tag/0.37.0) - 2022-09-19

6 changes: 3 additions & 3 deletions Makefile
@@ -1,14 +1,14 @@
.PHONY: pyclean doc test
.PHONY: pyclean doc test doc-cli doc-sdk

pyclean:
find . -type f -name "*.py[co]" -delete
find . -type d -name "__pycache__" -delete
rm -rf build/ dist/ *.egg-info

doc-cli:
doc-cli: pyclean
python bin/generate_cli_documentation.py

doc-sdk:
doc-sdk: pyclean
python bin/generate_sdk_documentation.py
python bin/generate_sdk_schemas_documentation.py
python bin/generate_sdk_schemas_documentation.py --models --output-path='references/sdk_models.md'
32 changes: 21 additions & 11 deletions references/sdk.md
@@ -2,7 +2,7 @@

# Client
```text
Client(url: Optional[str] = None, token: Optional[str] = None, retry_timeout: int = 300, insecure: bool = False, debug: bool = False)
Client(url: Union[str, NoneType] = None, token: Union[str, NoneType] = None, retry_timeout: int = 300, insecure: bool = False, backend_type: substra.sdk.schemas.BackendType = <BackendType.REMOTE: 'deployed'>)
```

Create a client
@@ -21,11 +21,14 @@ Defaults to 5 minutes.
- `insecure (bool, optional)`: If True, the client can call a not-certified backend. This is
for development purposes.
Defaults to False.
- `debug (bool, optional)`: Whether to use the default or debug mode.
In debug mode, new assets are created locally but can access assets from
the deployed Substra platform. The platform is in read-only mode.
Defaults to False.
Additionally, you can set the environment variable `DEBUG_SPAWNER` to `docker` if you want the tasks to
- `backend_type (schemas.BackendType, optional)`: Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks are
executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only mode
and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed locally).
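
The three modes described in the `backend_type` documentation above (deployed, fully local, hybrid) depend only on the backend type and on whether a URL is given. A minimal sketch of that decision, assuming a stand-in `BackendType` enum with inferred `docker` and `subprocess` values and a hypothetical `describe_mode` helper that is not part of the SDK:

```python
import enum
from typing import NamedTuple, Optional


class BackendType(enum.Enum):
    """Stand-in for substra.BackendType; member values are inferred from this PR."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


class ModeSummary(NamedTuple):
    name: str                  # "deployed", "local" or "hybrid"
    assets_created_on: str     # where new assets are registered
    tasks_executed_on: str     # where tasks run
    reads_platform: bool       # True if the platform is accessed read-only


def describe_mode(backend_type: BackendType, url: Optional[str] = None) -> ModeSummary:
    """Hypothetical helper summarising the documented behaviour of each mode."""
    if backend_type is BackendType.REMOTE:
        return ModeSummary("deployed", "platform", "platform", False)
    if url:
        # Local backend type plus a URL: hybrid mode, platform is read-only.
        return ModeSummary("hybrid", "local", "local", True)
    return ModeSummary("local", "local", "local", False)


print(describe_mode(BackendType.LOCAL_DOCKER, url="https://substra.example"))
```

The URL in the last line is a placeholder, not a real endpoint.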
## backend_mode
@@ -384,7 +387,7 @@ algorithm.
- `pathlib.Path`: Path of the downloaded model
## from_config_file
```text
from_config_file(profile_name: str = 'default', config_path: Union[str, pathlib.Path] = '~/.substra', tokens_path: Union[str, pathlib.Path] = '~/.substra-tokens', token: Optional[str] = None, retry_timeout: int = 300, debug: bool = False)
from_config_file(profile_name: str = 'default', config_path: Union[str, pathlib.Path] = '~/.substra', tokens_path: Union[str, pathlib.Path] = '~/.substra-tokens', token: Union[str, NoneType] = None, retry_timeout: int = 300, backend_type: substra.sdk.schemas.BackendType = <BackendType.REMOTE: 'deployed'>)
```

Returns a new Client configured with profile data from configuration files.
@@ -401,10 +404,17 @@ Defaults to '~/.substra-tokens'.
instead of any token found at tokens_path). Defaults to None.
- `retry_timeout (int, optional)`: Number of seconds before attempting a retry call in case
of timeout. Defaults to 5 minutes.
- `debug (bool, required)`: Whether to use the default or debug mode. In debug mode, new assets are
created locally but can get remote assets. The deployed platform is in
read-only mode.
Defaults to False.
- `backend_type (schemas.BackendType, optional)`: Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks
are executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only
mode and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed
locally).

**Returns:**

2 changes: 1 addition & 1 deletion substra/__init__.py
@@ -1,9 +1,9 @@
from substra.__version__ import __version__
from substra.sdk import BackendType
from substra.sdk import Client
from substra.sdk import exceptions
from substra.sdk import models
from substra.sdk import schemas
from substra.sdk.schemas import BackendType

__all__ = [
"__version__",
2 changes: 1 addition & 1 deletion substra/sdk/__init__.py
@@ -1,7 +1,7 @@
from substra.sdk import models
from substra.sdk import schemas
from substra.sdk.client import Client
from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType
from substra.sdk.utils import retry_on_exception

__all__ = [
8 changes: 5 additions & 3 deletions substra/sdk/backends/__init__.py
@@ -1,11 +1,13 @@
from substra.sdk import schemas
from substra.sdk.backends.local.backend import Local
from substra.sdk.backends.remote.backend import Remote

_BACKEND_CHOICES = {
"remote": Remote,
"local": Local,
schemas.BackendType.REMOTE: Remote,
schemas.BackendType.LOCAL_DOCKER: Local,
schemas.BackendType.LOCAL_SUBPROCESS: Local,
}


def get(name, *args, **kwargs):
return _BACKEND_CHOICES[name.lower()](*args, **kwargs)
return _BACKEND_CHOICES[name](*args, **kwargs, backend_type=name)
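
The change above keys the backend registry by enum member instead of by string and forwards the member to the constructor. A self-contained sketch of that dispatch pattern follows; the `Remote` and `Local` classes here are simplified stand-ins with reduced constructor signatures, and the enum values for the local modes are inferred rather than taken from the source.

```python
import enum


class BackendType(enum.Enum):
    """Stand-in for schemas.BackendType; member values are inferred from this PR."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


class Remote:
    """Simplified stand-in for the remote backend."""

    def __init__(self, url, backend_type):
        self.url = url
        self.backend_type = backend_type


class Local:
    """Simplified stand-in for the local backend (docker or subprocess)."""

    def __init__(self, backend, backend_type):
        self.backend = backend
        self.backend_type = backend_type


# Both local modes share one class; the member itself selects the spawner.
_BACKEND_CHOICES = {
    BackendType.REMOTE: Remote,
    BackendType.LOCAL_DOCKER: Local,
    BackendType.LOCAL_SUBPROCESS: Local,
}


def get(name, *args, **kwargs):
    # The key doubles as the backend_type passed to the constructor.
    return _BACKEND_CHOICES[name](*args, **kwargs, backend_type=name)


local = get(BackendType.LOCAL_SUBPROCESS, None)
print(type(local).__name__, local.backend_type.value)
```

One consequence of this design, at least in the sketch, is that a plain string such as `"remote"` is no longer a valid key and raises `KeyError`, since the `.lower()` normalisation of the old version is gone.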
2 changes: 1 addition & 1 deletion substra/sdk/backends/base.py
@@ -1,6 +1,6 @@
import abc

from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType


class BaseBackend(abc.ABC):
18 changes: 5 additions & 13 deletions substra/sdk/backends/local/backend.py
@@ -1,6 +1,5 @@
import copy
import logging
import os
import shutil
import typing
import warnings
@@ -23,7 +22,6 @@
from substra.sdk.backends import base
from substra.sdk.backends.local import compute
from substra.sdk.backends.local import dal
from substra.sdk.config import BackendType

logger = logging.getLogger(__name__)

@@ -35,23 +33,17 @@ class Local(base.BaseBackend):

org_counter = 1

def __init__(self, backend, *args, **kwargs):
def __init__(self, backend, backend_type, *args, **kwargs):
self._local_worker_dir = Path.cwd() / "local-worker"
self._local_worker_dir.mkdir(exist_ok=True)

self._debug_spawner = BackendType(os.getenv("DEBUG_SPAWNER", BackendType.LOCAL_DOCKER))
if self._debug_spawner == BackendType.LOCAL_SUBPROCESS:
logger.info(
"Environment variable DEBUG_SPAWNER is set to subprocess: "
"running Substra tasks with Python subprocess"
)

self._execution_mode = backend_type
# create a store to abstract the db
self._db = dal.DataAccess(backend, local_worker_dir=self._local_worker_dir)
self._worker = compute.Worker(
self._db,
local_worker_dir=self._local_worker_dir,
debug_spawner=self._debug_spawner,
debug_spawner=self._execution_mode,
)

self._org_id = f"MyOrg{Local.org_counter}MSP"
@@ -76,9 +68,9 @@ def temp_directory(self):
return self._db.tmp_dir

@property
def backend_mode(self) -> BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode"""
return self._debug_spawner
return self._execution_mode

def login(self, username, password):
self._db.login(username, password)
2 changes: 1 addition & 1 deletion substra/sdk/backends/local/compute/spawner/__init__.py
@@ -1,7 +1,7 @@
from substra.sdk.backends.local.compute.spawner.base import BaseSpawner
from substra.sdk.backends.local.compute.spawner.docker import Docker
from substra.sdk.backends.local.compute.spawner.subprocess import Subprocess
from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType

__all__ = ["BaseSpawner", "Docker", "Subprocess"]

8 changes: 4 additions & 4 deletions substra/sdk/backends/remote/backend.py
@@ -12,7 +12,6 @@
from substra.sdk import schemas
from substra.sdk.backends import base
from substra.sdk.backends.remote import rest_client
from substra.sdk.config import BackendType

logger = logging.getLogger(__name__)

@@ -30,14 +29,15 @@ def _find_asset_field(data, field):


class Remote(base.BaseBackend):
def __init__(self, url, insecure, token, retry_timeout):
def __init__(self, url, insecure, token, retry_timeout, backend_type):
self._client = rest_client.Client(url, insecure, token)
self._retry_timeout = retry_timeout or DEFAULT_RETRY_TIMEOUT
assert backend_type == self.backend_mode

@property
def backend_mode(self) -> BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode: deployed"""
return BackendType.DEPLOYED
return schemas.BackendType.REMOTE

def login(self, username, password):
return self._client.login(username, password)
77 changes: 48 additions & 29 deletions substra/sdk/client.py
@@ -44,7 +44,7 @@ def wrapper(*args, **kwargs):
return wrapper


class Client(object):
class Client:
"""Create a client

Args:
@@ -61,11 +61,14 @@ class Client(object):
insecure (bool, optional): If True, the client can call a not-certified backend. This is
for development purposes.
Defaults to False.
debug (bool, optional): Whether to use the default or debug mode.
In debug mode, new assets are created locally but can access assets from
the deployed Substra platform. The platform is in read-only mode.
Defaults to False.
Additionally, you can set the environment variable `DEBUG_SPAWNER` to `docker` if you want the tasks to
backend_type (schemas.BackendType, optional): Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks are
executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only mode
and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed locally).
"""
@@ -76,39 +79,48 @@ def __init__(
token: Optional[str] = None,
retry_timeout: int = DEFAULT_RETRY_TIMEOUT,
insecure: bool = False,
debug: bool = False,
backend_type: schemas.BackendType = schemas.BackendType.REMOTE,
):
self._retry_timeout = retry_timeout
self._token = token

self._insecure = insecure
self._url = url

self._backend = self._get_backend(debug)
self._backend = self._get_backend(backend_type)

def _get_backend(self, debug: bool):
def _get_backend(self, backend_type: schemas.BackendType):
# Three possibilities:
# - debug is False: get a remote backend
# - debug is True and no url is defined: fully local backend
# - debug is True and url is defined: local backend that connects to
# a remote backend (read-only)
backend = None
if (debug and self._url) or not debug:
backend = backends.get(
"remote",
# - deployed: get a deployed backend
# - subprocess/docker and no url is defined: fully local backend
# - subprocess/docker and url is defined: local backend that connects to
# a deployed backend (read-only)
if backend_type == schemas.BackendType.REMOTE:
return backends.get(
schemas.BackendType.REMOTE,
url=self._url,
insecure=self._insecure,
token=self._token,
retry_timeout=self._retry_timeout,
)
if debug:
# Hybrid mode: the local backend also connects to
# a remote backend in read-only mode.
if backend_type in [schemas.BackendType.LOCAL_DOCKER, schemas.BackendType.LOCAL_SUBPROCESS]:
Contributor

Suggested change
if backend_type in [schemas.BackendType.LOCAL_DOCKER, schemas.BackendType.LOCAL_SUBPROCESS]:
elif backend_type in [schemas.BackendType.LOCAL_DOCKER, schemas.BackendType.LOCAL_SUBPROCESS]:

Contributor Author

We do not need an `elif` here, since the `if` block above ends with a `return` statement.

backend = None
if self._url:
backend = backends.get(
schemas.BackendType.REMOTE,
url=self._url,
insecure=self._insecure,
token=self._token,
retry_timeout=self._retry_timeout,
)
return backends.get(
"local",
backend_type,
backend,
)
return backend
raise exceptions.SDKException(
f"Unknown value for the execution mode: {backend_type}, "
f"valid values are: {schemas.BackendType.__members__.values()}"
)
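
The control flow settled on in the review thread (early `return` per branch, plain `if` rather than `elif`, and an exception as the fall-through) can be sketched in isolation as follows. This is an illustration under assumptions, not the SDK code: `select_backend` is a hypothetical function that returns a description instead of building backends, `SDKException` is a local stand-in for `substra.sdk.exceptions.SDKException`, and the local enum values are inferred. The error message here joins the member values into a plain string, which is one possible way to list valid values.

```python
import enum
from typing import Optional, Tuple


class BackendType(enum.Enum):
    """Stand-in for schemas.BackendType; member values are inferred from this PR."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


class SDKException(Exception):
    """Local stand-in for substra.sdk.exceptions.SDKException."""


def select_backend(backend_type, url: Optional[str] = None) -> Tuple[str, bool]:
    """Return (backend kind, whether a read-only remote backend is attached)."""
    if backend_type is BackendType.REMOTE:
        return ("remote", False)
    # A plain `if` is enough: the branch above has already returned.
    if backend_type in (BackendType.LOCAL_DOCKER, BackendType.LOCAL_SUBPROCESS):
        # With a URL, the local backend also wraps a remote one (hybrid mode).
        return ("local", bool(url))
    valid = ", ".join(member.value for member in BackendType)
    raise SDKException(
        f"Unknown value for the execution mode: {backend_type!r}, valid values are: {valid}"
    )


print(select_backend(BackendType.LOCAL_DOCKER, url="https://substra.example"))
```

In this sketch, passing a raw string such as `"docker"` instead of an enum member reaches the final `raise`, because a plain `Enum` member does not compare equal to its value.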

@property
def temp_directory(self):
@@ -119,7 +131,7 @@ def temp_directory(self):
return self._backend.temp_directory

@property
def backend_mode(self) -> cfg.BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode: deployed,
local and which type of local mode

@@ -150,7 +162,7 @@ def from_config_file(
tokens_path: Union[str, pathlib.Path] = cfg.DEFAULT_TOKENS_PATH,
token: Optional[str] = None,
retry_timeout: int = DEFAULT_RETRY_TIMEOUT,
debug: bool = False,
backend_type: schemas.BackendType = schemas.BackendType.REMOTE,
):
"""Returns a new Client configured with profile data from configuration files.

@@ -166,10 +178,17 @@ def from_config_file(
instead of any token found at tokens_path). Defaults to None.
retry_timeout (int, optional): Number of seconds before attempting a retry call in case
of timeout. Defaults to 5 minutes.
debug (bool): Whether to use the default or debug mode. In debug mode, new assets are
created locally but can get remote assets. The deployed platform is in
read-only mode.
Defaults to False.
backend_type (schemas.BackendType, optional): Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks
are executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only
mode and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed
locally).

Returns:
Client: The new client.
@@ -188,7 +207,7 @@ def from_config_file(
retry_timeout=retry_timeout,
url=profile["url"],
insecure=profile["insecure"],
debug=debug,
backend_type=backend_type,
)

@logit