Skip to content

Commit

Permalink
Add default env section to envs tutorial (#810)
Browse files Browse the repository at this point in the history
  • Loading branch information
carolineechen committed May 20, 2024
1 parent 2529b79 commit 51c568d
Showing 1 changed file with 120 additions and 10 deletions.
130 changes: 120 additions & 10 deletions docs/tutorials/api-envs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -61,10 +61,11 @@ variables, secrets, and working directory.
secrets=["aws"],
)
If no environment name is provided, when the environment is sent to a cluster,
the dependencies and variables of the environment will be installed and synced
on top of the cluster's default env. However, Without a name, the env resource
itself can not be accessed and does not live in the cluster's object store.
If no environment name is provided, it defaults to ``"base_env"``, which
corresponds to the base, catch-all environment on the cluster. If
multiple “base_env” environments are sent to a cluster, the dependencies
and variables will continue to be synced on top of the existing base
environment.

Conda Envs
~~~~~~~~~~
Expand Down Expand Up @@ -96,11 +97,11 @@ Envs on the Cluster
~~~~~~~~~~~~~~~~~~~

Runhouse environments are generic environments, and the object itself is
not associated with a cluster. However, it is a core component of
Runhouse services, like functions and modules, which are associated with
a cluster. As such, it is set up remotely when these services are sent
over to the cluster – packags are installed, working directory and env
vars/secrets synced over, and cached on the cluster.
not associated with a cluster. However, it is easy to set up an
environment on the cluster, by simply calling the ``env.to(cluster)``
API, or by sending your module/function to the env with the
``<rh_fn>.to(cluster=cluster, env=env)`` API, which will construct and
cache the environment on the remote cluster.

.. code:: ipython3
Expand All @@ -120,10 +121,15 @@ vars/secrets synced over, and cached on the cluster.
.. parsed-literal::
:class: code-output
INFO | 2024-02-28 21:24:52.915177 | Because this function is defined in a notebook, writing it out to /Users/caroline/Documents/runhouse/notebooks/docs/np_sum_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2024-02-28 21:24:52.915177 | Writing out function to /Users/caroline/Documents/runhouse/notebooks/docs/np_sum_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2024-02-28 21:25:03.923658 | SSH tunnel on to server's port 32300 via server's ssh port 22 already created with the cluster.
INFO | 2024-02-28 21:25:04.162828 | Server rh-cluster is up.
INFO | 2024-02-28 21:25:04.166104 | Copying package from file:///Users/caroline/Documents/runhouse/notebooks to: rh-cluster
.. parsed-literal::
:class: code-output
INFO | 2024-02-28 21:25:07.356780 | Calling np_env.install
Expand All @@ -144,6 +150,11 @@ vars/secrets synced over, and cached on the cluster.
:class: code-output
INFO | 2024-02-28 21:25:09.601131 | Time to call np_env.install: 2.24 seconds
.. parsed-literal::
:class: code-output
INFO | 2024-02-28 21:25:16.987243 | Sending module np_sum to rh-cluster
Expand Down Expand Up @@ -174,3 +185,102 @@ servlet, which handles all the activities within the environment
etc). Each env servlet has its own local object store where objects
persist in Python, and lives in its own process, reducing interprocess
overhead and eliminating launch overhead for calls made in the same env.

Cluster Default Env
^^^^^^^^^^^^^^^^^^^

The cluster also has a concept of a base default env, which is the
environment on which the runhouse server will be started from. It is the
environment in which cluster calls and computations, such as commands
and functions, will default to running on, if no other env is specified.

During cluster initialization, you can specify the default env for the
cluster. It is constructed as with any other runhouse env, using
``rh.env()``, and contains any package installations, commands to run,
or env vars to set prior to starting the Runhouse server, or even a
particular conda env to isolate your Runhouse environment. If no default
env is specified, runs on the base environment on the cluster (after
sourcing bash).

.. code:: ipython3
import runhouse as rh
.. code:: ipython3
default_env = rh.conda_env(
name="cluster_default",
reqs=["skypilot"], # to enable autostop, which requires skypilot library
working_dir="./",
env_vars={"my_token": "TOKEN_VAL"}
)
cluster = rh.ondemand_cluster(
name="rh-cpu",
instance_type="CPU:2+",
provider="aws",
default_env=default_env,
)
cluster.up_if_not()
Now, as we see in the examples below, running a command or sending over
a function without specifying an env will default the default conda env
that we have specified for the cluster.

.. code:: ipython3
cluster.run("conda env list | grep '*'")
.. parsed-literal::
:class: code-output
INFO | 2024-05-20 18:08:42.460946 | Calling cluster_default._run_command
.. parsed-literal::
:class: code-output
Running command in cluster_default: conda run -n cluster_default conda env list | grep '*'
cluster_default * /opt/conda/envs/cluster_default

.. parsed-literal::
:class: code-output
INFO | 2024-05-20 18:08:45.130137 | Time to call cluster_default._run_command: 2.67 seconds
.. parsed-literal::
:class: code-output
[(0, 'cluster_default * /opt/conda/envs/cluster_default\n', '')]
.. code:: ipython3
def check_import():
import sky
return "import succeeded"
.. code:: ipython3
check_remote_import = rh.function(check_import).to(cluster)
.. code:: ipython3
check_remote_import()
.. parsed-literal::
:class: code-output
INFO | 2024-05-20 18:30:05.128009 | Calling check_import.call
INFO | 2024-05-20 18:30:05.691348 | Time to call check_import.call: 0.56 seconds
.. parsed-literal::
:class: code-output
'import succeeded'

0 comments on commit 51c568d

Please sign in to comment.