Skip to content

Latest commit

 

History

History
437 lines (339 loc) · 22.5 KB

README.md

File metadata and controls

437 lines (339 loc) · 22.5 KB

clearml-session
CLI for launching JupyterLab / VSCode / SSH on a remote machine

🔥 NEW in version 0.13 Workspace Syncing 🚀

GitHub license PyPI pyversions PyPI version shields.io PyPI status Slack Channel

🌟 ClearML is open-source - Leave a star to support the project! 🌟

clearml-session is a utility for launching detachable remote interactive sessions (MacOS, Windows, Linux)

tl;dr

CLI to launch a remote session of Jupyter-Lab / VSCode / SSH, inside any docker container on any deployment, Cloud / Kubernetes / Bare-Metal

🔰 What does it do?

Starting a clearml (ob)session from your local machine triggers the following:

  • ClearML allocates a remote instance (GPU) from your dedicated pool
  • On the allocated instance it will spin jupyter-lab + vscode server + SSH access for interactive usage (i.e., development)
  • ClearML will start monitoring machine performance, allowing DevOps to detect stale instances and spin them down
  • NEW 🔥 Kubernetes support, develop directly inside your pods! No kubectl required! Read more about clearml-agent and interactive sessions here
  • NEW 🎉 Automatically store & sync your interactive session workspace. clearml-session will automatically create a snapshot of your entire workspace when shutting it down, and later restore into a new session on a different remote machine

ℹ️ Remote PyCharm: You can also work with PyCharm in a remote session over SSH. Use the PyCharm Plugin to automatically sync local configurations with a remote session.

Use-cases for remote interactive sessions:

  1. Development requires resources not available on the current developer's machines
  2. Team resource sharing (e.g. how to dynamically assign GPUs to developers)
  3. Spin a copy of a previously executed experiment for remote debugging purposes (:open_mouth:!)
  4. Scale-out development to multiple clouds, assign development machines on AWS/GCP/Azure in a seamless way

Prerequisites:

  • An SSH client installed on your machine - To verify, open your terminal and execute ssh. If you did not receive an error, we are good to go.
  • At least one clearml-agent running on a remote host. See installation details.

Supported OS: MacOS, Windows, Linux

🔒 Secure & Stable

clearml-session creates a single, secure, and encrypted connection to the remote machine over SSH. SSH credentials are automatically generated by the CLI and contain fully random 32 bytes password.

All http connections are tunneled over the SSH connection, allowing users to add additional services on the remote machine (!)

Furthermore, all tunneled connections have a special stable network layer allowing you to refresh the underlying SSH connection without breaking any network sockets!

This means that if the network connection is unstable, you can refresh the base SSH network tunnel, without breaking JupyterLab/VSCode-server or your own SSH connection (e.g. debugging over SSH with PyCharm)


⚡ How to use: Interactive Session

  1. Run clearml-session
  2. Select the requested queue (resource)
  3. Wait until a machine is up and ready
  4. Click on the link to the remote JupyterLab/VSCode OR connect with the provided SSH details

Notice! You can also: Select a docker image to execute in, install required python packages, run bash script, pass git credentials, etc. See below for full CLI options.

📖 Tutorials

👉 Getting started

Requirements clearml python package installed and configured (see detailed instructions)

pip install clearml-session
clearml-session --docker nvcr.io/nvidia/pytorch:20.11-py3 --git-credentials

Wait for the machine to spin up. Expected CLI output would look something like:

Creating new session
New session created [id=3d38e738c5ff458a9ec465e77e19da23]
Waiting for remote machine allocation [id=3d38e738c5ff458a9ec465e77e19da23]
.Status [queued]
....Status [in_progress]
Remote machine allocated
Setting remote environment [Task id=3d38e738c5ff458a9ec465e77e19da23]
Setup process details: https://app.community.clear.ml/projects/64ae77968db24b27abf86a501667c330/experiments/3d38e738c5ff458a9ec465e77e19da23/output/log
Waiting for environment setup to complete [usually about 20-30 seconds]
..............
Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel
Warning: Permanently added '[192.168.0.17]:10022' (ECDSA) to the list of known hosts.
root@192.168.0.17's password: f7bae03235ff2a62b6bfbc6ab9479f9e28640a068b1208b63f60cb097b3a1784


Interactive session is running:
SSH: ssh root@localhost -p 8022 [password: f7bae03235ff2a62b6bfbc6ab9479f9e28640a068b1208b63f60cb097b3a1784]
Jupyter Lab URL: http://localhost:8878/?token=df52806d36ad30738117937507b213ac14ed638b8c336a7e
VSCode server available at http://localhost:8898/

Connection is up and running
Enter "r" (or "reconnect") to reconnect the session (for example after suspend)
`s` (or "shell") to connect to the SSH session
`Ctrl-C` (or "quit") to abort (remote session remains active)
or "Shutdown" to shut down remote interactive session

Click on the Jupyter Lab link (http://localhost:8878/?token=xyz), or VScode (running inside your remote container) (http://localhost:8898/), or drop into SSH shell by typing shell.

Open your terminal, clone your code & start working :)

ℹ️ TIP: You can additional python package to your remote session setup by adding --packages to the command line, for example to add boto3 add --packages "boto3>1"

ℹ️ TIP: If you need direct SSH into the remote container from your terminal, you can directly drop into a shell by adding --shell to the command line

📞 Leaving a session and reconnecting to it

On the clearml-session CLI terminal, enter 'quit' or press Ctrl-C It will close the CLI but preserve the remote session (i.e. remote session will remain running)

When you want to reconnect to it, execute:

clearml-session

Then press "Y" (or enter) to reconnect to the already running session:

clearml-session - launch interactive session
Checking previous session
Connect to active session id=3d38e738c5ff458a9ec465e77e19da23 [Y]/n?

Shutting down a remote session

On the clearml-session CLI terminal, enter 'shutdown' (case-insensitive). It will shut down the remote session, free the resource and close the CLI:

Enter "r" (or "reconnect") to reconnect the session (for example after suspend)
`s` (or "shell") to connect to the SSH session
`Ctrl-C` (or "quit") to abort (remote session remains active)
or "Shutdown" to shut down remote interactive session

shutdown

Shutting down interactive session
Remote session shutdown
Goodbye

You can also use the CLI to shut down a specific clearml interactive session:

clearml-session shutdown --id <session_id>

🔗 Connecting to a running interactive session from a different machine

Continue working on an interactive session from any machine. In the clearml web UI, go to the DevOps project, and find your interactive session. Click on the ID button next to the Task name, and copy the unique ID.

clearml-session --attach <session_id>

Click on the JupyterLab/VSCode link, or connect directly to the SSH session.

TIP: You can work & debug your colleagues code and workspace by sharing the session id and connect to the same remote container together with --attach

📂 Store and synchronize interactive session workspace

Specify the remote workspace root-folder by adding --store-workspace ~/workspace to the command line. In the remote session container, put all your code / data under the ~/workspace directory. When your session is shut down, the workspace folder will be automatically packaged and stored on the clearml file server. In your next clearml-session execution, specify again --store-workspace ~/workspace and clearml-session will grab the previous workspace snapshot and restore it into the new remote container in ~/workspace.

clearml-session --store-workspace ~/workspace --docker python:3.10-bullseye

To continue the last aborted session and restore the workspace:

clearml-session --store-workspace ~/workspace --docker python:3.10-bullseye
clearml-session - CLI for launching JupyterLab / VSCode / SSH on a remote machine
Verifying credentials
Use previous queue (resource) '1xGPU' [Y]/n? 

Interactive session config:
...
Restore workspace from session id=01bf86f038314434878b2413343ba746 'interactive_session' @ 2024-03-02 20:34:03 [Y]/n? 
Restoring workspace from previous session id=01bf86f038314434878b2413343ba746

To continue a specific session ID and restore its workspace:

clearml-session --continue-session <session_id> --store-workspace ~/workspace --docker python:3.10-bullseye

📤 Upload local files to remote session

If you need to upload files from your local machine into the remote session, specify the file or directory with --upload-files /mnt/data/stuff. The entire content of the directory / file will be copied into your remote clearml-session container under the ~/session-files/ directory.

Can be used in conjunction with --store-workspace to easily move workloads between local development machines and remote machines with 100% persistent workspace synchronization.

clearml-session --upload-files /mnt/data/stuff

🔧 Debug a previously executed experiment

If you have a previously executed experiment (Task) on the clearml platform, you can create an exact copy of the experiment (Task) and debug it on the remote interactive session. clearml-session will replicate the exact remote environment, add JupyterLab/VSCode/SSH and allow you interactively execute and debug the experiment, on the interactive remote container.

In the clearml web UI, find the experiment (Task) you wish to debug. Click on the ID button next to the Task name, and copy the unique ID, then execute:

clearml-session --debugging-session <experiment_id_here>

Click on the JupyterLab/VSCode link, or drop directly into an SSH shell by typing shell.

❓ Frequently Asked Questions

How does it work?

The clearml-session creates a new interactive Task in the system (default project: DevOps).

This Task is responsible for setting the SSH and JupyterLab/VSCode on the host machine.

The local clearml-session awaits for the interactive Task to finish with the initial setup, then it connects via SSH to the host machine (see "safe and stable" above), and tunnels both SSH and JupyterLab over the SSH connection.

The end result is a local link which you can use to access the JupyterLab/VSCode on the remote machine, over a secure and encrypted connection!

Does clearml-session support Kubernetes clusters?

Yes! clearml-session utilizes the clearml-agent kubernetes glue together with routing capabilities in order to allow any clearml-session to spin a container (pod) on the kubernetes cluster and securely connect directly into the pod. This feature does not require any kubernetes access from the users, and simplifies code development on kubernetes clusters as well as job scheduling & launching. Read more on how to deploy clearml on kubernetes here.

How can I use clearml-session to scale up / out development resources?

Clearml has a cloud autoscaler, so you can easily and automatically spin machines for development!

There is also a default docker image to use when initiating a task.

This means that using clearml-sessions with the autoscaler enabled, allows for turn-key secure development environment inside a docker of your choosing.

Learn more about it here & here.

Does clearml-session fit Work-From-Home setup?

YES. Install clearml-agent on target machines inside the organization, connect over your company VPN and use clearml-session to gain access to a dedicated on-prem machine with the docker of your choosing (with out-of-the-box support for any internal docker artifactory).

Learn more about how to utilize your office workstations and on-prem machines here.

⌨️ CLI options

clearml-session --help
clearml-session - CLI for launching JupyterLab / VSCode / SSH on a remote machine
usage: clearml-session [-h] [--version] [--attach [ATTACH]] [--shutdown [SHUTDOWN]] [--shell]
                       [--debugging-session DEBUGGING_SESSION] [--queue QUEUE] [--docker DOCKER]
                       [--docker-args DOCKER_ARGS] [--public-ip [true/false]] [--remote-ssh-port REMOTE_SSH_PORT]
                       [--vscode-server [true/false]] [--vscode-version VSCODE_VERSION]
                       [--vscode-extensions VSCODE_EXTENSIONS] [--jupyter-lab [true/false]]
                       [--upload-files UPLOAD_FILES] [--continue-session CONTINUE_SESSION]
                       [--store-workspace STORE_WORKSPACE] [--git-credentials [true/false]]
                       [--user-folder USER_FOLDER] [--packages [PACKAGES [PACKAGES ...]]]
                       [--requirements REQUIREMENTS] [--init-script [INIT_SCRIPT]] [--config-file CONFIG_FILE]
                       [--remote-gateway [REMOTE_GATEWAY]] [--base-task-id BASE_TASK_ID] [--project PROJECT]
                       [--session-name SESSION_NAME] [--session-tags [SESSION_TAGS [SESSION_TAGS ...]]]
                       [--disable-session-cleanup [true/false]] [--keepalive [true/false]]
                       [--queue-excluded-tag [QUEUE_EXCLUDED_TAG [QUEUE_EXCLUDED_TAG ...]]]
                       [--queue-include-tag [QUEUE_INCLUDE_TAG [QUEUE_INCLUDE_TAG ...]]]
                       [--skip-docker-network [true/false]] [--password PASSWORD] [--username USERNAME]
                       [--force-dropbear [true/false]] [--verbose] [--yes]
                       {list,info,shutdown} ...

clearml-session - CLI for launching JupyterLab / VSCode / SSH on a remote machine

positional arguments:
  {list,info,shutdown}  ClearML session control commands
    list                List running Sessions
    info                Detailed information on specific session
    shutdown            Shutdown specific session

optional arguments:
  -h, --help            show this help message and exit
  --version             Display the clearml-session utility version
  --attach [ATTACH]     Attach to running interactive session (default: previous session)
  --shutdown [SHUTDOWN], -S [SHUTDOWN]
                        Shut down an active session (default: previous session)
  --shell               Open the SSH shell session directly, notice quitting the SSH session will Not shut down the
                        remote session
  --debugging-session DEBUGGING_SESSION
                        Pass existing Task id (experiment), create a copy of the experiment on a remote machine,
                        and launch jupyter/ssh for interactive access. Example --debugging-session <task_id>
  --queue QUEUE         Select the queue to launch the interactive session on (default: previously used queue)
  --docker DOCKER       Select the docker image to use in the interactive session (default: previously used
                        docker image or `nvidia/cuda:11.6.2-runtime-ubuntu20.04`)
  --docker-args DOCKER_ARGS
                        Add additional arguments for the docker image to use in the interactive session on
                        (default: previously used docker-args)
  --public-ip [true/false]
                        If True, register the public IP of the remote machine. Set if running on the cloud.
                        Default: false (use for local / on-premises)
  --remote-ssh-port REMOTE_SSH_PORT
                        Set the remote ssh server port, running on the agent`s machine. (default: 10022)
  --vscode-server [true/false]
                        Install vscode server (code-server) on interactive session (default: true)
  --vscode-version VSCODE_VERSION
                        Set vscode server (code-server) version, as well as vscode python extension version
                        <vscode:python-ext> (example: "3.7.4:2020.10.332292344")
  --vscode-extensions VSCODE_EXTENSIONS
                        Install additional vscode extensions, as well as vscode python extension (example: "ms-
                        python.python,ms-python.black-formatter,ms-python.pylint,ms-python.flake8")
  --jupyter-lab [true/false]
                        Install Jupyter-Lab on interactive session (default: true)
  --upload-files UPLOAD_FILES
                        Advanced: Upload local files/folders to the remote session. Example: `/my/local/data/`
                        will upload the local folder and extract it into the container in ~/session-files/
  --continue-session CONTINUE_SESSION
                        Continue previous session (ID provided) restoring your workspace (see --store-workspace)
  --store-workspace STORE_WORKSPACE
                        Upload/Restore remote workspace folder. Example: `~/workspace/` will automatically
                        restore/store the *containers* folder and extract it into the next session. Use with
                        --continue-session to continue your previous work from your exact container state
  --git-credentials [true/false]
                        If true, local .git-credentials file is sent to the interactive session. (default: false)
  --user-folder USER_FOLDER
                        Advanced: Set the remote base folder (default: ~/)
  --packages [PACKAGES [PACKAGES ...]]
                        Additional packages to add, supports version numbers (default: previously added packages).
                        examples: --packages torch==1.7 tqdm
  --requirements REQUIREMENTS
                        Specify requirements.txt file to install when setting the interactive session.
                        Requirements file is read and stored in `packages` section as default for the next
                        sessions. Can be overridden by calling `--packages`
  --init-script [INIT_SCRIPT]
                        Specify BASH init script file to be executed when setting the interactive session. Script
                        content is read and stored as default script for the next sessions. To clear the init-
                        script do not pass a file
  --config-file CONFIG_FILE
                        Advanced: Change the configuration file used to store the previous state (default:
                        ~/.clearml_session.json)
  --remote-gateway [REMOTE_GATEWAY]
                        Advanced: Specify gateway ip/address:port to be passed to interactive session (for use
                        with k8s ingestion / ELB)
  --base-task-id BASE_TASK_ID
                        Advanced: Set the base task ID for the interactive session. (default: previously used
                        Task). Use `none` for the default interactive session
  --project PROJECT     Advanced: Set the project name for the interactive session Task
  --session-name SESSION_NAME
                        Advanced: Set the name of the interactive session Task
  --session-tags [SESSION_TAGS [SESSION_TAGS ...]]
                        Advanced: Add tags to the interactive session for increased visibility
  --disable-session-cleanup [true/false]
                        Advanced: If set, previous interactive sessions are not deleted
  --keepalive [true/false]
                        Advanced: If set, enables the transparent proxy always keeping the sockets alive. Default:
                        False, do not use transparent sockets for mitigating connection drops.
  --queue-excluded-tag [QUEUE_EXCLUDED_TAG [QUEUE_EXCLUDED_TAG ...]]
                        Advanced: Excluded queues with this specific tag from the selection
  --queue-include-tag [QUEUE_INCLUDE_TAG [QUEUE_INCLUDE_TAG ...]]
                        Advanced: Only include queues with this specific tag from the selection
  --skip-docker-network [true/false]
                        Advanced: If set, `--network host` is **not** passed to docker (assumes k8s network
                        ingestion) (default: false)
  --password PASSWORD   Advanced: Select ssh password for the interactive session (default: `randomly-generated`
                        or previously used one)
  --username USERNAME   Advanced: Select ssh username for the interactive session (default: `root` or previously
                        used one)
  --randomize           Advanced: Recreate a new random ssh password for the interactive session options: 
                        `--randomize` one time recreate, --randomize `always` create a new random password for 
                        every session                        
  --force-dropbear [true/false]
                        Force using `dropbear` instead of SSHd
  --disable-store-defaults
                        If set, do not store current setup as new default configuration
  --disable-fingerprint-check
                        Advanced: If set, ignore the remote SSH server fingerprint check
  --verbose             Advanced: If set, print verbose progress information, e.g. the remote machine setup
                        process log
  --yes, -y             Automatic yes to prompts; assume "yes" as answer to all prompts and run non-interactively

Notice! all arguments are stored as new defaults for the next execution