Commit 4428a3f
Adding VMware platforms support such as vSphere to Ray Autoscaler (#37815)

Signed-off-by: Shubham Urkade <surkade@vmware.com>
Signed-off-by: Chen Hui <huchen@vmware.com>
Signed-off-by: Chen Jing <jingch@vmware.com>
Co-authored-by: Chen Hui <huchen@vmware.com>
Co-authored-by: Chen Jing <jingch@vmware.com>

1 parent 8d98dc6
Showing 16 changed files with 1,673 additions and 0 deletions.
@@ -0,0 +1,72 @@
# Ray on vSphere Architecture Guide

To support Ray on vSphere, the implementation has been added into the [python/ray/autoscaler/_private/vsphere](../vsphere) directory. The following sections explain the vSphere terminology used in the code and then walk through the code flow.

# vSphere Terminologies
## [OVF file](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-AE61948B-C2EE-436E-BAFB-3C7209088552.html)
The OVF format is a packaging and distribution format for virtual machines. It is a standard that can be used to describe VM metadata. We use OVF files to create the [Frozen VM](#frozen-vm).

## Frozen VM
This is a VM that is kept in a frozen state, i.e., the clock of the VM is stopped. A VM in such a state can be used to create child VMs very rapidly with the [instant clone](#instant-clone) operation.

The frozen VM itself is created from an OVF file. This OVF file executes a script on VM startup that puts the VM into the frozen state. At a high level, the script performs the following sequence:

1. Execute the `vmware-rpctool "instantclone.freeze"` command --> puts the VM into the frozen state
2. Reset the network

The script varies depending upon the guest OS type. Sample scripts for various OSes can be found in the following GitHub repo: [Instant Clone Customization scripts](https://github.com/lamw/instantclone-community-customization-scripts)
## [Instant Clone](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-853B1E2B-76CE-4240-A654-3806912820EB.html)
The instant clone feature of vSphere can be used to quickly create new nodes by cloning from the frozen VM. The new nodes replicate the parent VM and continue execution after the `vmware-rpctool "instantclone.freeze"` command, i.e., the cloned nodes reset their network to get new IP addresses.

## [Resource Pool](https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-resource-management/GUID-60077B40-66FF-4625-934A-641703ED7601.html)
A resource pool is a logical abstraction that can be used to separate a group of VMs from others. It can also be configured to limit the resources that its VMs can consume.

## [Datastore](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.storage.doc/GUID-3CC7078E-9C30-402C-B2E1-2542BEE67E8F.html)

Datastores are logical containers that provide a uniform way to store the artifacts required by VMs.

## VI Admin

The term VI stands for [Virtual Infrastructure](https://www.vmware.com/in/topics/glossary/content/virtual-infrastructure.html).

A VI Admin is the persona that manages the lifecycle of VMware infrastructure. VI Admins engage in a range of activities; a subset is listed below:
1. Provisioning [ESXi](https://www.vmware.com/in/products/esxi-and-esx.html) (the hypervisor developed by VMware) hosts.
2. Provisioning a vSphere infrastructure.
3. Managing the lifecycle of VMs.
4. Provisioning [vSAN](https://docs.vmware.com/en/VMware-vSAN/index.html) storage.

## [vSphere Tags](https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-vcenter-esxi-management/GUID-16422FF7-235B-4A44-92E2-532F6AED0923.html#:~:text=You%20can%20create%2C%20edit%2C%20and,objects%20in%20the%20vSphere%20inventory)
A tag is a label that can be assigned to objects in the vSphere inventory. Every tag must belong to a tag category; a category groups related tags together.
# Code Flow
## Node Creation on `ray up`
The following sections explain the code flow in sequential order. The execution is triggered the moment the user executes the `ray up` command.
### Create Key pairs ([config.py](./config.py))
Creates a key pair (private and public keys) if one is not already present, or reuses the existing key pair. The private key is injected into `config["auth"]["ssh_private_key"]`. The bootstrap machine (where the `ray up` command is executed) and, subsequently, the head node use this key to SSH onto the Ray nodes.
### Update vSphere Configs ([config.py](./config.py))
Makes sure that the user has created the YAML file with valid configs.
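The provider reads vSphere-specific fields from each node type's `node_config`. Below is a hedged sketch of that structure, written as the Python dict Ray works with after parsing the YAML; the field names are taken from `config.py` in this change, while the server, credentials, and resource values are purely illustrative.

```python
example_config = {
    "provider": {
        "type": "vsphere",  # provider type assumed for illustration
        "vsphere_config": {
            # Optional: may be omitted when the VSPHERE_SERVER, VSPHERE_USER
            # and VSPHERE_PASSWORD environment variables are set instead.
            "credentials": {
                "server": "vcenter.example.com",
                "user": "administrator@vsphere.local",
                "password": "change-me",
            },
        },
    },
    "head_node_type": "head",
    "available_node_types": {
        "head": {
            "resources": {"CPU": 4},
            "node_config": {
                "frozen_vm_name": "frozen-vm",   # mandatory
                "resource_pool": "ray-pool",     # optional
                "datastore": "vsanDatastore",    # optional
                "networks": ["VM Network"],      # optional
            },
        },
        "worker": {
            "resources": {"CPU": 8},
            # Pool, datastore and networks default to the head node's values.
            "node_config": {},
        },
    },
}
```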
### Create Nodes ([node_provider.py](./node_provider.py))
#### Call `create_node`
Starts the creation of nodes with the `create_node` function, which internally calls `_create_node`. The nodes are created in parallel.
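A minimal sketch of what "created in parallel" can look like; `_create_node_for` is a hypothetical stand-in for the provider's per-node instant-clone logic, not the actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def _create_node_for(node_config, tags, node_number):
    # Placeholder for the real per-node work (instant clone, tagging, NICs).
    return "ray-node-{}".format(node_number)


def create_nodes_in_parallel(node_config, tags, count):
    with ThreadPoolExecutor(max_workers=max(count, 1)) as pool:
        futures = [
            pool.submit(_create_node_for, node_config, tags, i) for i in range(count)
        ]
        return [future.result() for future in futures]
```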
#### Fetch frozen VM
The frozen VM is set up by the [VI admin](#vi-admin) using an OVF that's provided by VMware. The name of the frozen VM is provided in the YAML file. The code then fetches the VM by that name with the `get_frozen_vm_obj` function.
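For illustration, one common way to look up a VM by name is with pyVmomi, as sketched below. This is only an assumption about the mechanics; `get_frozen_vm_obj` in this commit may use a different vSphere SDK or query path.

```python
import ssl

from pyVim.connect import Disconnect, SmartConnect
from pyVmomi import vim


def get_vm_by_name(server, user, password, vm_name):
    # Lab/demo only: skip certificate verification.
    context = ssl._create_unverified_context()
    si = SmartConnect(host=server, user=user, pwd=password, sslContext=context)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine], True
        )
        try:
            return next((vm for vm in view.view if vm.name == vm_name), None)
        finally:
            view.Destroy()
    finally:
        Disconnect(si)
```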
#### [Cloudinit](https://cloudinit.readthedocs.io/en/latest/index.html) the frozen VM
Cloudinit is the industry standard for cloud instance initialization. It can be used to initialize any newly provisioned VM with networking, storage, and SSH key related configuration.
We Cloudinit the frozen VM with userdata by executing `set_cloudinit_userdata`. This creates a new user on the VM and injects a public key for that user, using the public key generated in the [Create Key pairs](#create-key-pairs) section.
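For context, cloud-init userdata for this kind of setup typically looks like the template below. The user name and exact fields are assumptions for illustration and may differ from the `data/userdata.yaml` shipped in this change.

```python
# Illustrative only: a typical cloud-init userdata template that creates a new
# user and injects an SSH public key for it.
USERDATA_TEMPLATE = """#cloud-config
users:
  - name: ray          # assumed user name, for illustration
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - {public_key}
"""


def render_userdata(public_key: str) -> str:
    return USERDATA_TEMPLATE.format(public_key=public_key.strip())
```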
#### Instant clone the nodes
All the nodes are instant cloned from the frozen VM.
#### Tag nodes with [vSphere Tags](#vsphere-tags)
The nodes are tagged asynchronously with the `tag_vm` function while their creation is in progress.
After the nodes are created, the tags on them are updated.
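The asynchronous tagging pattern can be sketched roughly as follows; `tag_vm` here is a stub standing in for the real vSphere tagging call, and the threading shape is an assumption rather than the provider's exact code.

```python
import threading


def tag_vm(vm_name, tags):
    # Stub: the real implementation attaches each tag (within its category)
    # to the VM object through the vSphere tagging API.
    print("tagging {} with {}".format(vm_name, tags))


def tag_nodes_async(vm_names, tags):
    threads = [
        threading.Thread(target=tag_vm, args=(name, tags)) for name in vm_names
    ]
    for thread in threads:
        thread.start()
    # Join later, once creation completes, before updating the final tags.
    return threads
```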

#### Connect [NICs](https://www.oreilly.com/library/view/learning-vmware-vsphere/9781782174158/ch04s04.html) (Network Interface Cards)
The frozen VM has all of its NICs in a disconnected state. This is done so that the nodes cloned from it don't copy the frozen VM's IP address.
Once the nodes are cloned from the frozen VM, we connect the NICs so that they can start to get new IP addresses.
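Reconnecting NICs on a vSphere VM is usually a device reconfiguration, roughly as in the pyVmomi sketch below; this is an assumed illustration, not necessarily how the provider implements it.

```python
from pyVmomi import vim


def connect_nics(vm):
    # Build an 'edit' spec for every virtual NIC, marking it connected.
    device_changes = []
    for device in vm.config.hardware.device:
        if isinstance(device, vim.vm.device.VirtualEthernetCard):
            device.connectable = vim.vm.device.VirtualDevice.ConnectInfo(
                connected=True, startConnected=True
            )
            device_changes.append(
                vim.vm.device.VirtualDeviceSpec(
                    operation=vim.vm.device.VirtualDeviceSpec.Operation.edit,
                    device=device,
                )
            )
    if device_changes:
        # Returns a vSphere task; the caller can wait for it to complete.
        return vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=device_changes))
```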
## Autoscaling
### Get and create nodes ([node_provider.py](./node_provider.py))
The autoscaler finds the currently running nodes with the `non_terminated_nodes` function and requests new nodes by calling the `create_node` function.
### Fetch node IPs ([node_provider.py](./node_provider.py))
The autoscaler uses the `external_ip` or `internal_ip` function to fetch a node's IP.
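The method names above come from Ray's `NodeProvider` interface; the reconciliation loop below is only a schematic of how the autoscaler drives them, not code from this commit.

```python
def scale_to(provider, node_config, tag_filters, tags, desired_count):
    running = provider.non_terminated_nodes(tag_filters)
    missing = desired_count - len(running)
    if missing > 0:
        provider.create_node(node_config, tags, missing)
    # Map node IDs to the IPs the autoscaler would use to reach them.
    return {node_id: provider.external_ip(node_id) for node_id in running}
```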
## Cluster tear down ([node_provider.py](./node_provider.py))
The `terminate_nodes` function is called when the `ray down` command is executed. It deletes all the nodes except the frozen VM.
Empty file.
@@ -0,0 +1,198 @@
import copy
import logging
import os

from cryptography.hazmat.backends import default_backend as crypto_default_backend
from cryptography.hazmat.primitives import serialization as crypto_serialization
from cryptography.hazmat.primitives.asymmetric import rsa

from ray.autoscaler._private.event_system import CreateClusterEvent, global_event_system
from ray.autoscaler._private.util import check_legacy_fields

PRIVATE_KEY_NAME = "ray-bootstrap-key"
PRIVATE_KEY_NAME_EXTN = "{}.pem".format(PRIVATE_KEY_NAME)

PUBLIC_KEY_NAME = "ray_bootstrap_public_key"
PUBLIC_KEY_NAME_EXTN = "{}.key".format(PUBLIC_KEY_NAME)

PRIVATE_KEY_PATH = os.path.expanduser("~/{}.pem".format(PRIVATE_KEY_NAME))
PUBLIC_KEY_PATH = os.path.expanduser("~/{}.key".format(PUBLIC_KEY_NAME))

USER_DATA_FILE_PATH = os.path.join(os.path.dirname(__file__), "./data/userdata.yaml")

logger = logging.getLogger(__name__)


def bootstrap_vsphere(config):
    # create a copy of the input config to modify
    config = copy.deepcopy(config)

    add_credentials_into_provider_section(config)
    # Update library item configs
    update_vsphere_configs(config)

    # Log warnings if user included deprecated `head_node` or `worker_nodes`
    # fields. Raise error if no `available_node_types`
    check_legacy_fields(config)

    # Create new key pair if it doesn't exist already
    create_key_pair()

    # Configure SSH access, using an existing key pair if possible.
    config = configure_key_pair(config)

    global_event_system.execute_callback(
        CreateClusterEvent.ssh_keypair_downloaded,
        {"ssh_key_path": config["auth"]["ssh_private_key"]},
    )

    return config

def add_credentials_into_provider_section(config):

    provider_config = config["provider"]

    # vsphere_config is an optional field as the credentials can also be specified
    # as env variables so first check verifies if this field is present before
    # accessing its properties
    if (
        "vsphere_config" in provider_config
        and "credentials" in provider_config["vsphere_config"]
    ):
        return

    env_credentials = {
        "server": os.environ["VSPHERE_SERVER"],
        "user": os.environ["VSPHERE_USER"],
        "password": os.environ["VSPHERE_PASSWORD"],
    }

    provider_config["vsphere_config"] = {}
    provider_config["vsphere_config"]["credentials"] = env_credentials

def update_vsphere_configs(config):
    available_node_types = config["available_node_types"]

    # Fetch worker: field from the YAML file
    worker_node = available_node_types["worker"]
    worker_node_config = worker_node["node_config"]

    # Fetch the head node field name from head_node_type field.
    head_node_type = config["head_node_type"]

    # Use head_node_type field's value to fetch the head node field
    head_node = available_node_types[head_node_type]
    head_node_config = head_node["node_config"]

    # A mandatory constraint enforced by Ray's YAML validator is to add a
    # resources field for both head and worker nodes. For example, to specify
    # resources for the worker the user will specify them in
    #     worker:
    #         resources
    # We copy that resources field into
    #     worker:
    #         node_config:
    #             resources
    # This enables us to access the field during node creation.
    # The same happens for the head node too.
    worker_node_config["resources"] = worker_node["resources"]
    head_node_config["resources"] = head_node["resources"]

    head_resource_pool = None
    if "resource_pool" in head_node_config:
        head_resource_pool = head_node_config["resource_pool"]

    # By default, create worker nodes in the head node's resource pool
    worker_resource_pool = head_resource_pool

    # If a different resource pool is provided for worker nodes, use it
    if "resource_pool" in worker_node_config:
        worker_resource_pool = worker_node_config["resource_pool"]

    worker_node_config["resource_pool"] = worker_resource_pool

    worker_networks = None
    worker_datastore = None

    if "networks" in head_node_config and head_node_config["networks"]:
        worker_networks = head_node_config["networks"]

    if "networks" in worker_node_config and worker_node_config["networks"]:
        worker_networks = worker_node_config["networks"]

    worker_node_config["networks"] = worker_networks

    if "datastore" in head_node_config and head_node_config["datastore"]:
        worker_datastore = head_node_config["datastore"]

    if "datastore" in worker_node_config and worker_node_config["datastore"]:
        worker_datastore = worker_node_config["datastore"]

    worker_node_config["datastore"] = worker_datastore

    if "frozen_vm_name" not in head_node_config:
        raise ValueError(
            "frozen_vm_name is mandatory for bringing up the Ray cluster, contact "
            "your VI admin for the information."
        )

def create_key_pair():

    # If the files already exist, we don't want to create new keys.
    # This if condition will currently pass even if there are invalid keys
    # at those paths. TODO: Only return if the keys are valid.

    if os.path.exists(PRIVATE_KEY_PATH) and os.path.exists(PUBLIC_KEY_PATH):
        logger.info("Key pair already exists. Not creating a new one.")
        return

    # Generate keys
    key = rsa.generate_private_key(
        backend=crypto_default_backend(), public_exponent=65537, key_size=2048
    )

    private_key = key.private_bytes(
        crypto_serialization.Encoding.PEM,
        crypto_serialization.PrivateFormat.PKCS8,
        crypto_serialization.NoEncryption(),
    )

    public_key = key.public_key().public_bytes(
        crypto_serialization.Encoding.OpenSSH, crypto_serialization.PublicFormat.OpenSSH
    )

    with open(PRIVATE_KEY_PATH, "wb") as content_file:
        content_file.write(private_key)
        os.chmod(PRIVATE_KEY_PATH, 0o600)

    with open(PUBLIC_KEY_PATH, "wb") as content_file:
        content_file.write(public_key)

def configure_key_pair(config):

    logger.info("Configure key pairs for copying into the head node.")

    assert os.path.exists(
        PRIVATE_KEY_PATH
    ), "Private key file at path {} was not found".format(PRIVATE_KEY_PATH)

    assert os.path.exists(
        PUBLIC_KEY_PATH
    ), "Public key file at path {} was not found".format(PUBLIC_KEY_PATH)

    # updater.py file uses the following config to ssh onto the head node
    # Also, copies the file onto the head node
    config["auth"]["ssh_private_key"] = PRIVATE_KEY_PATH

    # The path where the public key should be copied onto the remote host
    public_key_remote_path = "~/{}".format(PUBLIC_KEY_NAME_EXTN)

    # Copy the public key to the remote host
    config["file_mounts"][public_key_remote_path] = PUBLIC_KEY_PATH

    return config
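

# ---------------------------------------------------------------------------
# Hedged usage sketch (illustrative, not part of the provider itself): Ray
# normally calls bootstrap_vsphere() during `ray up`, but the function can be
# exercised directly with a minimal config shaped like the one below. All
# values are placeholders; running this also creates key files in your home
# directory.
if __name__ == "__main__":
    os.environ.setdefault("VSPHERE_SERVER", "vcenter.example.com")
    os.environ.setdefault("VSPHERE_USER", "administrator@vsphere.local")
    os.environ.setdefault("VSPHERE_PASSWORD", "change-me")

    sample_config = {
        "provider": {"type": "vsphere"},
        "auth": {},
        "file_mounts": {},
        "head_node_type": "head",
        "available_node_types": {
            "head": {
                "resources": {"CPU": 4},
                "node_config": {"frozen_vm_name": "frozen-vm"},
            },
            "worker": {"resources": {"CPU": 8}, "node_config": {}},
        },
    }

    bootstrapped = bootstrap_vsphere(sample_config)
    print(bootstrapped["auth"]["ssh_private_key"])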