feat!: set the mode directly in the client (#287)
Esadruhn authored Sep 21, 2022
1 parent 942747f commit 09aed6e
Showing 15 changed files with 127 additions and 97 deletions.
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed

- Algo categories are not checked anymore in local mode. Validations based on inputs and outputs are sufficient.
- BREAKING CHANGE: the backend type is now set in the Client; the environment variable `DEBUG_SPAWNER` is no longer used. The default value is deployed (#287)

API before

```sh
export DEBUG_SPAWNER=subprocess
```

```python
client = substra.Client(debug=True)
```

API after

```python
client = substra.Client(backend_type=substra.BackendType.LOCAL_SUBPROCESS)
```
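
For callers migrating existing scripts, the mapping from the old `debug` flag and `DEBUG_SPAWNER` variable to the new argument can be sketched as follows. This is an illustration only and not part of the commit: the enum is a local stand-in for `substra.BackendType` so the snippet runs without Substra installed, its member values `docker` and `subprocess` are inferred from the old environment-variable values, and `migrate_legacy_mode` is a hypothetical helper name.

```python
import enum
from typing import Optional


class BackendType(enum.Enum):
    """Local stand-in for substra.BackendType (member values are assumed)."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


def migrate_legacy_mode(debug: bool, debug_spawner: Optional[str] = None) -> BackendType:
    """Hypothetical helper: translate the old settings into a backend type."""
    if not debug:
        # debug=False always targeted the deployed platform.
        return BackendType.REMOTE
    if debug_spawner is None:
        # debug=True with DEBUG_SPAWNER unset fell back to docker.
        return BackendType.LOCAL_DOCKER
    # Raises ValueError for anything other than a known member value.
    return BackendType(debug_spawner)
```

The returned member would then be passed as `backend_type` when building the client, in place of `debug=True`.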

### Removed

6 changes: 3 additions & 3 deletions Makefile
@@ -1,14 +1,14 @@
.PHONY: pyclean doc test
.PHONY: pyclean doc test doc-cli doc-sdk

pyclean:
find . -type f -name "*.py[co]" -delete
find . -type d -name "__pycache__" -delete
rm -rf build/ dist/ *.egg-info

doc-cli:
doc-cli: pyclean
python bin/generate_cli_documentation.py

doc-sdk:
doc-sdk: pyclean
python bin/generate_sdk_documentation.py
python bin/generate_sdk_schemas_documentation.py
python bin/generate_sdk_schemas_documentation.py --models --output-path='references/sdk_models.md'
32 changes: 21 additions & 11 deletions references/sdk.md
@@ -2,7 +2,7 @@

# Client
```text
Client(url: Optional[str] = None, token: Optional[str] = None, retry_timeout: int = 300, insecure: bool = False, debug: bool = False)
Client(url: Union[str, NoneType] = None, token: Union[str, NoneType] = None, retry_timeout: int = 300, insecure: bool = False, backend_type: substra.sdk.schemas.BackendType = <BackendType.REMOTE: 'deployed'>)
```

Create a client
@@ -21,11 +21,14 @@ Defaults to 5 minutes.
- `insecure (bool, optional)`: If True, the client can call a not-certified backend. This is
for development purposes.
Defaults to False.
- `debug (bool, optional)`: Whether to use the default or debug mode.
In debug mode, new assets are created locally but can access assets from
the deployed Substra platform. The platform is in read-only mode.
Defaults to False.
Additionally, you can set the environment variable `DEBUG_SPAWNER` to `docker` if you want the tasks to
- `backend_type (schemas.BackendType, optional)`: Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks are
executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only mode
and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed locally).
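
The behaviour described for `backend_type` depends on both the mode and whether a URL is given. A minimal sketch of that decision, assuming a local stand-in enum (member values assumed) and a hypothetical `describe_mode` helper that is not part of the SDK:

```python
import enum
from typing import Optional


class BackendType(enum.Enum):
    """Local stand-in for substra.sdk.schemas.BackendType (values assumed)."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


def describe_mode(backend_type: BackendType, url: Optional[str] = None) -> str:
    """Hypothetical helper: name the behaviour implied by backend_type and url."""
    if backend_type is BackendType.REMOTE:
        # Assets are registered on the platform, which also runs the tasks.
        return "deployed"
    if url:
        # New assets are local, platform assets are readable, tasks run locally.
        return "hybrid"
    # Everything is created and executed locally.
    return "local"
```

Under these assumptions, the choice between docker and subprocess only affects how local tasks are spawned, not which of the three behaviours applies.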
## backend_mode
@@ -384,7 +387,7 @@ algorithm.
- `pathlib.Path`: Path of the downloaded model
## from_config_file
```text
from_config_file(profile_name: str = 'default', config_path: Union[str, pathlib.Path] = '~/.substra', tokens_path: Union[str, pathlib.Path] = '~/.substra-tokens', token: Optional[str] = None, retry_timeout: int = 300, debug: bool = False)
from_config_file(profile_name: str = 'default', config_path: Union[str, pathlib.Path] = '~/.substra', tokens_path: Union[str, pathlib.Path] = '~/.substra-tokens', token: Union[str, NoneType] = None, retry_timeout: int = 300, backend_type: substra.sdk.schemas.BackendType = <BackendType.REMOTE: 'deployed'>)
```

Returns a new Client configured with profile data from configuration files.
@@ -401,10 +404,17 @@ Defaults to '~/.substra-tokens'.
instead of any token found at tokens_path). Defaults to None.
- `retry_timeout (int, optional)`: Number of seconds before attempting a retry call in case
of timeout. Defaults to 5 minutes.
- `debug (bool, required)`: Whether to use the default or debug mode. In debug mode, new assets are
created locally but can get remote assets. The deployed platform is in
read-only mode.
Defaults to False.
- `backend_type (schemas.BackendType, optional)`: Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks
are executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only
mode and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed
locally).

**Returns:**

2 changes: 1 addition & 1 deletion substra/__init__.py
@@ -1,9 +1,9 @@
from substra.__version__ import __version__
from substra.sdk import BackendType
from substra.sdk import Client
from substra.sdk import exceptions
from substra.sdk import models
from substra.sdk import schemas
from substra.sdk.schemas import BackendType

__all__ = [
"__version__",
2 changes: 1 addition & 1 deletion substra/sdk/__init__.py
@@ -1,7 +1,7 @@
from substra.sdk import models
from substra.sdk import schemas
from substra.sdk.client import Client
from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType
from substra.sdk.utils import retry_on_exception

__all__ = [
8 changes: 5 additions & 3 deletions substra/sdk/backends/__init__.py
@@ -1,11 +1,13 @@
from substra.sdk import schemas
from substra.sdk.backends.local.backend import Local
from substra.sdk.backends.remote.backend import Remote

_BACKEND_CHOICES = {
"remote": Remote,
"local": Local,
schemas.BackendType.REMOTE: Remote,
schemas.BackendType.LOCAL_DOCKER: Local,
schemas.BackendType.LOCAL_SUBPROCESS: Local,
}


def get(name, *args, **kwargs):
return _BACKEND_CHOICES[name.lower()](*args, **kwargs)
return _BACKEND_CHOICES[name](*args, **kwargs, backend_type=name)
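
The registry is now keyed by enum member rather than by string, and `get` forwards the key as `backend_type` so that a single `Local` class can serve both local modes. A self-contained sketch of this pattern, using stub classes and a stand-in enum in place of the real Substra ones; `StubRemote`, `StubLocal`, the enum member values, and the example URL are all illustrative assumptions:

```python
import enum


class BackendType(enum.Enum):
    """Local stand-in for substra.sdk.schemas.BackendType (values assumed)."""

    REMOTE = "deployed"
    LOCAL_DOCKER = "docker"
    LOCAL_SUBPROCESS = "subprocess"


class StubRemote:
    """Stub standing in for the remote backend."""

    def __init__(self, url, backend_type):
        self.url = url
        self.backend_mode = backend_type


class StubLocal:
    """Stub standing in for the local backend; it may wrap a remote one."""

    def __init__(self, backend, backend_type):
        self.remote = backend
        self.backend_mode = backend_type


_BACKEND_CHOICES = {
    BackendType.REMOTE: StubRemote,
    BackendType.LOCAL_DOCKER: StubLocal,
    BackendType.LOCAL_SUBPROCESS: StubLocal,
}


def get(name, *args, **kwargs):
    # The registry key doubles as the backend_type given to the constructor.
    return _BACKEND_CHOICES[name](*args, backend_type=name, **kwargs)


# Hybrid setup: a local backend that wraps a remote one.
remote = get(BackendType.REMOTE, url="https://substra.example.org")
hybrid = get(BackendType.LOCAL_SUBPROCESS, remote)
```

Both local members map to the same class, so the instance has to be told which member selected it; forwarding the key is what lets it report the right mode.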
2 changes: 1 addition & 1 deletion substra/sdk/backends/base.py
@@ -1,6 +1,6 @@
import abc

from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType


class BaseBackend(abc.ABC):
18 changes: 5 additions & 13 deletions substra/sdk/backends/local/backend.py
@@ -1,6 +1,5 @@
import copy
import logging
import os
import shutil
import typing
import warnings
@@ -23,7 +22,6 @@
from substra.sdk.backends import base
from substra.sdk.backends.local import compute
from substra.sdk.backends.local import dal
from substra.sdk.config import BackendType

logger = logging.getLogger(__name__)

@@ -35,23 +33,17 @@ class Local(base.BaseBackend):

org_counter = 1

def __init__(self, backend, *args, **kwargs):
def __init__(self, backend, backend_type, *args, **kwargs):
self._local_worker_dir = Path.cwd() / "local-worker"
self._local_worker_dir.mkdir(exist_ok=True)

self._debug_spawner = BackendType(os.getenv("DEBUG_SPAWNER", BackendType.LOCAL_DOCKER))
if self._debug_spawner == BackendType.LOCAL_SUBPROCESS:
logger.info(
"Environment variable DEBUG_SPAWNER is set to subprocess: "
"running Substra tasks with Python subprocess"
)

self._execution_mode = backend_type
# create a store to abstract the db
self._db = dal.DataAccess(backend, local_worker_dir=self._local_worker_dir)
self._worker = compute.Worker(
self._db,
local_worker_dir=self._local_worker_dir,
debug_spawner=self._debug_spawner,
debug_spawner=self._execution_mode,
)

self._org_id = f"MyOrg{Local.org_counter}MSP"
@@ -76,9 +68,9 @@ def temp_directory(self):
return self._db.tmp_dir

@property
def backend_mode(self) -> BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode"""
return self._debug_spawner
return self._execution_mode

def login(self, username, password):
self._db.login(username, password)
2 changes: 1 addition & 1 deletion substra/sdk/backends/local/compute/spawner/__init__.py
@@ -1,7 +1,7 @@
from substra.sdk.backends.local.compute.spawner.base import BaseSpawner
from substra.sdk.backends.local.compute.spawner.docker import Docker
from substra.sdk.backends.local.compute.spawner.subprocess import Subprocess
from substra.sdk.config import BackendType
from substra.sdk.schemas import BackendType

__all__ = ["BaseSpawner", "Docker", "Subprocess"]

8 changes: 4 additions & 4 deletions substra/sdk/backends/remote/backend.py
@@ -12,7 +12,6 @@
from substra.sdk import schemas
from substra.sdk.backends import base
from substra.sdk.backends.remote import rest_client
from substra.sdk.config import BackendType

logger = logging.getLogger(__name__)

@@ -30,14 +29,15 @@ def _find_asset_field(data, field):


class Remote(base.BaseBackend):
def __init__(self, url, insecure, token, retry_timeout):
def __init__(self, url, insecure, token, retry_timeout, backend_type):
self._client = rest_client.Client(url, insecure, token)
self._retry_timeout = retry_timeout or DEFAULT_RETRY_TIMEOUT
assert backend_type == self.backend_mode

@property
def backend_mode(self) -> BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode: deployed"""
return BackendType.DEPLOYED
return schemas.BackendType.REMOTE

def login(self, username, password):
return self._client.login(username, password)
77 changes: 48 additions & 29 deletions substra/sdk/client.py
@@ -49,7 +49,7 @@ def wrapper(*args, **kwargs):
return wrapper


class Client(object):
class Client:
"""Create a client
Args:
@@ -66,11 +66,14 @@ class Client(object):
insecure (bool, optional): If True, the client can call a not-certified backend. This is
for development purposes.
Defaults to False.
debug (bool, optional): Whether to use the default or debug mode.
In debug mode, new assets are created locally but can access assets from
the deployed Substra platform. The platform is in read-only mode.
Defaults to False.
Additionally, you can set the environment variable `DEBUG_SPAWNER` to `docker` if you want the tasks to
backend_type (schemas.BackendType, optional): Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks are
executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only mode
and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed locally).
"""
@@ -81,39 +84,48 @@ def __init__(
token: Optional[str] = None,
retry_timeout: int = DEFAULT_RETRY_TIMEOUT,
insecure: bool = False,
debug: bool = False,
backend_type: schemas.BackendType = schemas.BackendType.REMOTE,
):
self._retry_timeout = retry_timeout
self._token = token

self._insecure = insecure
self._url = url

self._backend = self._get_backend(debug)
self._backend = self._get_backend(backend_type)

def _get_backend(self, debug: bool):
def _get_backend(self, backend_type: schemas.BackendType):
# Three possibilities:
# - debug is False: get a remote backend
# - debug is True and no url is defined: fully local backend
# - debug is True and url is defined: local backend that connects to
# a remote backend (read-only)
backend = None
if (debug and self._url) or not debug:
backend = backends.get(
"remote",
# - deployed: get a deployed backend
# - subprocess/docker and no url is defined: fully local backend
# - subprocess/docker and url is defined: local backend that connects to
# a deployed backend (read-only)
if backend_type == schemas.BackendType.REMOTE:
return backends.get(
schemas.BackendType.REMOTE,
url=self._url,
insecure=self._insecure,
token=self._token,
retry_timeout=self._retry_timeout,
)
if debug:
# Hybrid mode: the local backend also connects to
# a remote backend in read-only mode.
if backend_type in [schemas.BackendType.LOCAL_DOCKER, schemas.BackendType.LOCAL_SUBPROCESS]:
backend = None
if self._url:
backend = backends.get(
schemas.BackendType.REMOTE,
url=self._url,
insecure=self._insecure,
token=self._token,
retry_timeout=self._retry_timeout,
)
return backends.get(
"local",
backend_type,
backend,
)
return backend
raise exceptions.SDKException(
f"Unknown value for the execution mode: {backend_type}, "
f"valid values are: {schemas.BackendType.__members__.values()}"
)

@property
def temp_directory(self):
@@ -124,7 +136,7 @@ def temp_directory(self):
return self._backend.temp_directory

@property
def backend_mode(self) -> cfg.BackendType:
def backend_mode(self) -> schemas.BackendType:
"""Get the backend mode: deployed,
local and which type of local mode
@@ -155,7 +167,7 @@ def from_config_file(
tokens_path: Union[str, pathlib.Path] = cfg.DEFAULT_TOKENS_PATH,
token: Optional[str] = None,
retry_timeout: int = DEFAULT_RETRY_TIMEOUT,
debug: bool = False,
backend_type: schemas.BackendType = schemas.BackendType.REMOTE,
):
"""Returns a new Client configured with profile data from configuration files.
@@ -171,10 +183,17 @@ def from_config_file(
instead of any token found at tokens_path). Defaults to None.
retry_timeout (int, optional): Number of seconds before attempting a retry call in case
of timeout. Defaults to 5 minutes.
debug (bool): Whether to use the default or debug mode. In debug mode, new assets are
created locally but can get remote assets. The deployed platform is in
read-only mode.
Defaults to False.
backend_type (schemas.BackendType, optional): Which mode to use. Defaults to deployed.
In deployed mode, assets are registered on a deployed platform which also executes the tasks.
In local mode (subprocess or docker), if no URL is given then all assets are created locally and tasks
are executed locally.
In local mode (subprocess or docker), if a URL is given then the mode is a hybrid one: new assets are
created locally but can access assets from the deployed Substra platform. The platform is in read-only
mode and tasks are executed locally.
The local mode is either docker or subprocess mode: `docker` if you want the tasks to
be executed in containers (default) or `subprocess` to execute them in Python subprocesses (faster,
experimental: The `Dockerfile` commands are not executed, requires dependencies to be installed
locally).
Returns:
Client: The new client.
@@ -193,7 +212,7 @@ def from_config_file(
retry_timeout=retry_timeout,
url=profile["url"],
insecure=profile["insecure"],
debug=debug,
backend_type=backend_type,
)

@logit