feat: Client has now a backend_type argument instead of debug + env variable #178

Merged
merged 1 commit into from
Sep 21, 2022
35 changes: 9 additions & 26 deletions docs/source/documentation/debug.rst
@@ -33,54 +33,37 @@ The first step is to make sure your assets are working outside Substra. For inst

.. _local_mode:

Run tasks locally with the local mode
Run tasks locally with the local mode
-------------------------------------

All the tasks that can be run on a deployed network can also be run locally in your Python environment. The only change needed is to set the debug parameter to `True` when instantiating the client:
All the tasks that can be run on a deployed network can also be run locally in your Python environment. The only change needed is to set the backend_type parameter either to `subprocess` or `docker` when instantiating the client:
::

client = substra.Client.from_config_file(profile_name="org-1", debug=True)
client = substra.Client.from_config_file(profile_name="org-1", backend_type="subprocess")
client = substra.Client.from_config_file(profile_name="org-1", backend_type="docker")

Contrary to the default (remote) execution, the execution is done synchronously, so the script waits for the task in progress to end before continuing.

Two debug modes are available:
Two local modes are available:

* **Docker mode**: the execution of the tasks happens in Docker containers that are spawned on the fly and removed once the execution is done.
* **Subprocess mode**: the execution of the tasks happens in subprocesses (terminal commands executed from the Python code).

The default mode is the Docker mode. Set the environment variable `DEBUG_SPAWNER` to change this. This can for instance be done in the terminal:

* .. code-block:: bash

export DEBUG_SPAWNER=subprocess
* .. code-block:: bash

export DEBUG_SPAWNER=docker

Or with Python:

* ::

os.environ["DEBUG_SPAWNER"] = "subprocess"
* ::

os.environ["DEBUG_SPAWNER"] = "docker"

The subprocess mode is much faster than the Docker mode, but does not test that the Dockerfiles of the assets are valid, and may fail if advanced COPY or ADD commands are used in the Dockerfile. It is recommended to run your experiment locally in subprocess mode and when it is ready, test it with the Docker mode.

Local assets are saved in-memory, they have the same lifetime as the Client object (deleted at the end of the script).
Whenever a task fails, an error will be raised and logs of the tasks will be included in the error message. The logs of tasks that did not fail are not accessible.
Whenever a task fails, an error will be raised and logs of the tasks will be included in the error message. The logs of tasks that did not fail are not accessible.

.. _hybrid_mode:

Test remote assets locally with the hybrid mode
-----------------------------------------------

A hybrid step between testing everything locally and launching tasks on a deployed platform is to test remote assets locally. In this setting, the platform is accessed in read-only mode and any asset created is created locally. Experiments can be launched with a mix of remote and local assets. For instance, using an algo from the deployed platform on a local dataset produces a local model.
To do so, instantiate a Client with the parameter `debug=True`:
A hybrid step between testing everything locally and launching tasks on a deployed platform is to test remote assets locally. In this setting, the platform is accessed in ``read-only`` mode and any asset created is created locally. Experiments can be launched with a mix of remote and local assets. For instance, using an algo from the deployed platform on a local dataset produces a local model.
To do so, instantiate a Client with the parameter `backend_type="subprocess"` or `backend_type="docker"`:
::

client = substra.Client.from_config_file(profile_name="org-1", debug=True)
client = substra.Client.from_config_file(profile_name="org-1", backend_type="subprocess")

and use remote assets when creating tasks. Any function to get, describe or download an asset works with assets from the deployed platform as well as with local assets. Functions to list assets list the assets from the platform and the local ones. However, unlike every other assets, models on the platform can not be used in local tasks. Moreover functions that create a new asset will only create local assets.
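
For scripts that still rely on the removed ``debug`` flag and ``DEBUG_SPAWNER`` environment variable, the mapping to the new argument can be sketched as below. This is only an illustration: ``legacy_to_backend_type`` is a hypothetical helper, not part of Substra, and returning ``None`` for the non-debug case is an assumption meaning "do not pass a local backend and keep the client's default (remote) execution". The Docker default mirrors the old documented behaviour.

```python
import os
from typing import Mapping, Optional

# The two local backends named in the documentation above.
LOCAL_BACKENDS = ("docker", "subprocess")


def legacy_to_backend_type(
    debug: bool, environ: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Translate the old (debug, DEBUG_SPAWNER) pair into a backend_type value.

    Hypothetical helper for migration purposes only; it does not import or
    call Substra.
    """
    env = os.environ if environ is None else environ
    if not debug:
        # Assumption: None means "no local backend requested".
        return None
    # The old documentation stated that Docker was the default debug mode.
    spawner = env.get("DEBUG_SPAWNER", "docker")
    if spawner not in LOCAL_BACKENDS:
        raise ValueError(
            f"Unsupported DEBUG_SPAWNER value {spawner!r}; "
            f"expected one of {LOCAL_BACKENDS}"
        )
    return spawner


if __name__ == "__main__":
    # Reads the real environment; the result depends on DEBUG_SPAWNER.
    print(legacy_to_backend_type(debug=True))
```

The returned string, when not ``None``, would then be passed as ``backend_type=...`` when instantiating the client, as in the snippets above.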

5 changes: 2 additions & 3 deletions examples/titanic_example/plot_titanic.py
@@ -66,13 +66,12 @@
#
# The client allows us to interact with the Substra platform. Setting the debug argument to ``True`` allows us to work locally by emulating a platform.
#
# By setting the environment variable ``DEBUG_SPAWNER`` to:
# By setting the argument ``backend_type`` to:
#
# - ``docker`` all tasks will be executed from docker containers (default)
# - ``subprocess`` all tasks will be executed from Python subprocesses (faster)

os.environ["DEBUG_SPAWNER"] = "subprocess"
client = substra.Client(debug=True)
client = substra.Client(backend_type="subprocess")

# %%
#
@@ -63,7 +63,7 @@
N_CLIENTS = 2

# Create the substra clients
clients = [Client(debug=True) for _ in range(N_CLIENTS)]
clients = [Client(backend_type="subprocess") for _ in range(N_CLIENTS)]
clients = {client.organization_info().organization_id: client for client in clients}

# Store their IDs
@@ -72,10 +72,6 @@
# The org id on which your computation tasks are registered
ALGO_ORG_ID = ORGS_ID[1]

# Choose the subprocess mode to locally simulate the FL process
DEBUG_SPAWNER = "subprocess"
os.environ["DEBUG_SPAWNER"] = DEBUG_SPAWNER

# Create the temporary directory for generated data
(pathlib.Path.cwd() / "tmp").mkdir(exist_ok=True)
