Add default env section to envs tutorial #810

Merged 1 commit on May 20, 2024
130 changes: 120 additions & 10 deletions docs/tutorials/api-envs.rst
@@ -61,10 +61,11 @@ variables, secrets, and working directory.
secrets=["aws"],
)

If no environment name is provided, when the environment is sent to a cluster,
the dependencies and variables of the environment will be installed and synced
on top of the cluster's default env. However, without a name, the env resource
itself cannot be accessed and does not live in the cluster's object store.
If no environment name is provided, it defaults to ``"base_env"``, which
corresponds to the base, catch-all environment on the cluster. If
multiple ``"base_env"`` environments are sent to a cluster, their
dependencies and variables are synced on top of the existing base
environment.
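
The merging behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions: plain dictionaries stand in for the cluster's environments, and ``send_env`` and ``cluster_envs`` are hypothetical names for illustration, not part of the Runhouse API.

```python
# Toy model only: a dict stands in for the environments known to a cluster.
# send_env and cluster_envs are hypothetical names, not Runhouse APIs.
DEFAULT_ENV_NAME = "base_env"


def send_env(cluster_envs, reqs, env_vars=None, name=None):
    """Record an env on the toy cluster, merging into any existing entry."""
    key = name or DEFAULT_ENV_NAME  # unnamed envs fall back to "base_env"
    entry = cluster_envs.setdefault(key, {"reqs": [], "env_vars": {}})
    for req in reqs:
        if req not in entry["reqs"]:  # keep each requirement only once
            entry["reqs"].append(req)
    entry["env_vars"].update(env_vars or {})
    return key


cluster_envs = {}
send_env(cluster_envs, reqs=["numpy"])                             # unnamed
send_env(cluster_envs, reqs=["pandas"], env_vars={"MODE": "dev"})  # unnamed
send_env(cluster_envs, reqs=["torch"], name="gpu_env")             # named

print(cluster_envs["base_env"])
print(sorted(cluster_envs))
```

In this model, the two unnamed envs accumulate in a single ``"base_env"`` entry, while the named env is kept separately.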

Conda Envs
~~~~~~~~~~
@@ -96,11 +97,11 @@ Envs on the Cluster
~~~~~~~~~~~~~~~~~~~

Runhouse environments are generic environments, and the object itself is
not associated with a cluster. However, it is a core component of
Runhouse services, like functions and modules, which are associated with
a cluster. As such, it is set up remotely when these services are sent
over to the cluster – packages are installed, working directory and env
vars/secrets synced over, and cached on the cluster.
not associated with a cluster. However, you can set up an environment
on a cluster by calling ``env.to(cluster)``, or by sending a function
or module to the cluster with
``<rh_fn>.to(cluster=cluster, env=env)``. Either call constructs and
caches the environment on the remote cluster.

.. code:: ipython3

@@ -120,10 +121,15 @@ vars/secrets synced over, and cached on the cluster.
.. parsed-literal::
:class: code-output

INFO | 2024-02-28 21:24:52.915177 | Because this function is defined in a notebook, writing it out to /Users/caroline/Documents/runhouse/notebooks/docs/np_sum_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2024-02-28 21:24:52.915177 | Writing out function to /Users/caroline/Documents/runhouse/notebooks/docs/np_sum_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2024-02-28 21:25:03.923658 | SSH tunnel on to server's port 32300 via server's ssh port 22 already created with the cluster.
INFO | 2024-02-28 21:25:04.162828 | Server rh-cluster is up.
INFO | 2024-02-28 21:25:04.166104 | Copying package from file:///Users/caroline/Documents/runhouse/notebooks to: rh-cluster


.. parsed-literal::
:class: code-output

INFO | 2024-02-28 21:25:07.356780 | Calling np_env.install


@@ -144,6 +150,11 @@ vars/secrets synced over, and cached on the cluster.
:class: code-output

INFO | 2024-02-28 21:25:09.601131 | Time to call np_env.install: 2.24 seconds


.. parsed-literal::
:class: code-output

INFO | 2024-02-28 21:25:16.987243 | Sending module np_sum to rh-cluster


@@ -174,3 +185,102 @@ servlet, which handles all the activities within the environment
etc). Each env servlet has its own local object store where objects
persist in Python, and lives in its own process, reducing interprocess
overhead and eliminating launch overhead for calls made in the same env.
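
To illustrate the idea of one object store per env, here is a toy model. It is a sketch only: real env servlets live in separate processes, whereas this example keeps everything in one process, and ``ToyEnvServlet`` and ``servlet_for`` are hypothetical names rather than Runhouse APIs.

```python
# Toy model only: each "servlet" is an object with its own dictionary.
# Real env servlets run in separate processes; these names are hypothetical.
class ToyEnvServlet:
    def __init__(self, env_name):
        self.env_name = env_name
        self.object_store = {}  # objects persist here between calls

    def put(self, key, value):
        self.object_store[key] = value

    def get(self, key, default=None):
        return self.object_store.get(key, default)


servlets = {}


def servlet_for(env_name):
    """Return the servlet for an env, creating it only on first use."""
    if env_name not in servlets:
        servlets[env_name] = ToyEnvServlet(env_name)
    return servlets[env_name]


servlet_for("np_env").put("result", 6)
print(servlet_for("np_env").get("result"))    # 6
print(servlet_for("base_env").get("result"))  # None
```

An object stored through one env's servlet is not visible from another env, and repeated calls to the same env reuse the servlet that already exists.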

Cluster Default Env
^^^^^^^^^^^^^^^^^^^

The cluster also has a base default env, which is the environment in
which the Runhouse server is started. Cluster calls and computations,
such as commands and functions, run in this environment by default if
no other env is specified.

During cluster initialization, you can specify the default env for the
cluster. It is constructed like any other Runhouse env, using
``rh.env()``, and can contain package installations, commands to run,
or env vars to set prior to starting the Runhouse server, or even a
particular conda env to isolate your Runhouse environment. If no default
env is specified, the server runs in the base environment on the cluster
(after sourcing bash).
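
The fallback rule described above can be sketched with a toy model. This is an illustration under stated assumptions, not how Runhouse is implemented: ``ToyCluster`` and ``BASE_ENV`` are hypothetical names, and the returned string only records which env a command would be routed to.

```python
# Toy model only: ToyCluster and BASE_ENV are illustrative, not Runhouse APIs.
BASE_ENV = "base_env"  # assumed name for the cluster's base environment


class ToyCluster:
    def __init__(self, default_env=None):
        # Fall back to the base environment when no default env is given.
        self.default_env = default_env or BASE_ENV

    def run(self, command, env=None):
        # Calls that do not name an env are routed to the cluster default.
        target_env = env or self.default_env
        return f"[{target_env}] {command}"


plain = ToyCluster()
custom = ToyCluster(default_env="cluster_default")

print(plain.run("python --version"))
print(custom.run("python --version"))
print(custom.run("python --version", env="np_env"))
```

An explicitly passed env always wins; the cluster default applies only when the call names no env.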

.. code:: ipython3

import runhouse as rh

.. code:: ipython3

default_env = rh.conda_env(
    name="cluster_default",
    reqs=["skypilot"],  # to enable autostop, which requires skypilot library
    working_dir="./",
    env_vars={"my_token": "TOKEN_VAL"}
)
cluster = rh.ondemand_cluster(
    name="rh-cpu",
    instance_type="CPU:2+",
    provider="aws",
    default_env=default_env,
)
cluster.up_if_not()

Now, as the examples below show, running a command or sending over
a function without specifying an env will default to the conda env
that we specified as the cluster's default.

.. code:: ipython3

cluster.run("conda env list | grep '*'")


.. parsed-literal::
:class: code-output

INFO | 2024-05-20 18:08:42.460946 | Calling cluster_default._run_command


.. parsed-literal::
:class: code-output

Running command in cluster_default: conda run -n cluster_default conda env list | grep '*'
cluster_default * /opt/conda/envs/cluster_default


.. parsed-literal::
:class: code-output

INFO | 2024-05-20 18:08:45.130137 | Time to call cluster_default._run_command: 2.67 seconds


.. parsed-literal::
:class: code-output

[(0, 'cluster_default * /opt/conda/envs/cluster_default\n', '')]
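
The value returned above looks like a list of ``(return code, stdout, stderr)`` tuples, one per command. That layout is inferred from the printed output and is an assumption here. Under that assumption, a minimal sketch for extracting the active conda env name could look like the following, where ``active_conda_env`` is a hypothetical helper and not part of Runhouse:

```python
# Result copied from the output above. The tuple layout
# (return code, stdout, stderr) is an assumption based on that output.
results = [(0, "cluster_default * /opt/conda/envs/cluster_default\n", "")]


def active_conda_env(results):
    """Return the env name on the line that conda marks with '*', if any."""
    for returncode, stdout, stderr in results:
        if returncode != 0:
            raise RuntimeError(
                f"command failed with code {returncode}: {stderr.strip()}"
            )
        for line in stdout.splitlines():
            parts = line.split()
            if "*" in parts:  # conda marks the active env with an asterisk
                return parts[0]
    return None


print(active_conda_env(results))  # cluster_default
```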


.. code:: ipython3

def check_import():
    import sky
    return "import succeeded"

.. code:: ipython3

check_remote_import = rh.function(check_import).to(cluster)

.. code:: ipython3

check_remote_import()


.. parsed-literal::
:class: code-output

INFO | 2024-05-20 18:30:05.128009 | Calling check_import.call
INFO | 2024-05-20 18:30:05.691348 | Time to call check_import.call: 0.56 seconds




.. parsed-literal::
:class: code-output

'import succeeded'