With a simple AzureML extension deployment on an AKS cluster or any Azure Arc enabled Kubernetes cluster, you can instantly onboard data science professionals to submit ML workloads to the Kubernetes cluster using existing Azure ML tools and service capabilities. You can deploy the AzureML extension and enable a Kubernetes cluster for the following machine learning needs:
- Deploy AzureML extension for model training and batch inference
- Deploy AzureML extension for real-time inferencing only
- Deploy AzureML extension for both model training and inferencing
For a quick trial and basic test scenario, follow the installation instructions in Happy path.
Once the AzureML extension deployment completes, it creates the following resources in the Azure cloud:
Resource name | Resource type | Description |
---|---|---|
Azure Service Bus | Azure resource | Used to sync nodes and cluster resource information to Azure Machine Learning services regularly. |
Azure Relay | Azure resource | Route traffic between Azure Machine Learning services and the Kubernetes cluster. |
Once the AzureML extension deployment completes, it creates the following resources in the Kubernetes cluster, depending on the AzureML extension deployment scenario:
Resource name | Resource type | Training | Inference | Training and Inference | Description | Communication with cloud service |
---|---|---|---|---|---|---|
relayserver | Kubernetes deployment | ✓ | ✓ | ✓ | The entry component to receive and sync the message with cloud. | Receive the request of job creation, model deployment from cloud service; sync the job status with cloud service. |
gateway | Kubernetes deployment | ✓ | ✓ | ✓ | The gateway to communicate and send data back and forth. | Send nodes and cluster resource information to cloud services. |
aml-operator | Kubernetes deployment | ✓ | N/A | ✓ | Manage the lifecycle of training jobs. | Token exchange with cloud token service for authentication and authorization of Azure Container Registry used by training job. |
metrics-controller-manager | Kubernetes deployment | ✓ | ✓ | ✓ | Manage the configuration for Prometheus | N/A |
{EXTENSION-NAME}-kube-state-metrics | Kubernetes deployment | ✓ | ✓ | ✓ | Export the cluster-related metrics to Prometheus. | N/A |
{EXTENSION-NAME}-prometheus-operator | Kubernetes deployment | ✓ | ✓ | ✓ | Provide Kubernetes native deployment and management of Prometheus and related monitoring components. | N/A |
amlarc-identity-controller | Kubernetes deployment | N/A | ✓ | ✓ | Request and renew Azure Blob/Azure Container Registry token through managed identity. | Token exchange with cloud token service for authentication and authorization of Azure Container Registry and Azure Blob used by inference/model deployment. |
amlarc-identity-proxy | Kubernetes deployment | N/A | ✓ | ✓ | Request and renew Azure Blob/Azure Container Registry token through managed identity. | Token exchange with cloud token service for authentication and authorization of Azure Container Registry and Azure Blob used by inference/model deployment. |
azureml-fe | Kubernetes deployment | N/A | ✓ | ✓ | The front-end component that routes incoming inference requests to deployed services. | azureml-fe service logs are sent to Azure Blob. |
inference-operator-controller-manager | Kubernetes deployment | N/A | ✓ | ✓ | Manage the lifecycle of inference endpoints. | N/A |
cluster-status-reporter | Kubernetes deployment | ✓ | ✓ | ✓ | Gather cluster information, like cpu/gpu/memory usage and cluster health. | N/A |
nfd-master | Kubernetes deployment | ✓ | N/A | ✓ | Node Feature Discovery is a Kubernetes add-on. | N/A |
nfd-worker | Kubernetes daemonset | ✓ | N/A | ✓ | Node Feature Discovery is a Kubernetes add-on. | N/A |
csi-blob-controller | Kubernetes deployment | ✓ | N/A | ✓ | Azure Blob Storage Container Storage Interface(CSI) driver. | N/A |
csi-blob-node | Kubernetes daemonset | ✓ | N/A | ✓ | Azure Blob Storage Container Storage Interface(CSI) driver. | N/A |
fluent-bit | Kubernetes daemonset | ✓ | ✓ | ✓ | Gather the components' system log. | Upload the components' system log to cloud. |
k8s-host-device-plugin-daemonset | Kubernetes daemonset | ✓ | ✓ | ✓ | Expose fuse to pods on each node. | N/A |
prometheus-prom-prometheus | Kubernetes statefulset | ✓ | ✓ | ✓ | Gather and send job metrics to cloud. | Send job metrics like cpu/gpu/memory utilization to cloud. |
volcano-admission | Kubernetes deployment | ✓ | N/A | ✓ | Volcano admission webhook. | N/A |
volcano-controllers | Kubernetes deployment | ✓ | N/A | ✓ | Manage the lifecycle of Azure Machine Learning training job pods. | N/A |
volcano-scheduler | Kubernetes deployment | ✓ | N/A | ✓ | Used to do in cluster job scheduling. | N/A |
alertmanager | Kubernetes statefulset | ✓ | N/A | ✓ | Handle alerts sent by client applications such as the Prometheus server. | N/A |
Important:
- Azure Service Bus and Azure Relay resources are under the same resource group as the Arc cluster resource. These resources are used to communicate with the Kubernetes cluster, and modifying them will break attached compute targets.
- By default, the Kubernetes deployment resources are randomly deployed to one or more nodes of the cluster, and daemonset resources are deployed to ALL nodes. If you want to restrict the extension deployment to specific nodes, use the `nodeSelector` configuration setting described below.
Notes:
- `{EXTENSION-NAME}`: This is the extension name specified with the `az k8s-extension create --name` CLI command.
Use the `k8s-extension create` CLI command to deploy the AzureML extension; review the list of required and optional parameters for the `k8s-extension create` CLI command here. For AzureML extension deployment configurations, use `--config` or `--config-protected` to specify a list of `key=value` pairs. Following is the list of configuration settings available for the different AzureML extension deployment scenarios.
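For scripted deployments, the `key=value` pairs above can be assembled programmatically. The sketch below uses a hypothetical helper (`config_args` is not part of the az CLI) to render a settings dict into `--config` arguments:

```python
def config_args(settings):
    """Render AzureML extension settings as `--config key=value` CLI arguments.
    Hypothetical helper for scripting deployments; not part of the az CLI itself."""
    if not settings:
        return []
    return ["--config"] + [f"{key}={value}" for key, value in settings.items()]

args = config_args({
    "enableTraining": "True",
    "nodeSelector.node-purpose": "worker",  # multiple selectors are separate pairs
})
print(" ".join(args))  # → --config enableTraining=True nodeSelector.node-purpose=worker
```

The rendered arguments can be appended to an `az k8s-extension create` invocation, for example via `subprocess.run(["az", "k8s-extension", "create", ...] + args)`.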
Configuration Setting Key Name | Description | Training | Inference | Training and Inference |
---|---|---|---|---|
`enableTraining` | `True` or `False`, default `False`. Must be set to `True` for AzureML extension deployment with Machine Learning model training support. | ✓ | N/A | ✓ |
`enableInference` | `True` or `False`, default `False`. Must be set to `True` for AzureML extension deployment with Machine Learning inference support. | N/A | ✓ | ✓ |
`allowInsecureConnections` | `True` or `False`, default `False`. Must be set to `True` for AzureML extension deployment with HTTP endpoints support for inference, when `sslCertPemFile` and `sslKeyPemFile` are not provided. | N/A | Optional | Optional |
`inferenceRouterServiceType` | `loadBalancer` or `nodePort`. Must be set for `enableInference=True`. | N/A | ✓ | ✓ |
`internalLoadBalancerProvider` | This config is only applicable for Azure Kubernetes Service (AKS) clusters now. Must be set to `azure` to allow the inference router to use an internal load balancer. | N/A | Optional | Optional |
`sslSecret` | The Kubernetes secret name under the `azureml` namespace storing `cert.pem` (PEM-encoded SSL cert) and `key.pem` (PEM-encoded SSL key). Required for AzureML extension deployment with HTTPS endpoint support for inference, when `allowInsecureConnections` is set to `False`. Use this config or give static cert and key file paths in configuration protected settings. See the sample secret yaml file. | N/A | Optional | Optional |
`sslCname` | An SSL CNAME to use if enabling SSL validation on the cluster. | N/A | N/A | Required when using an HTTPS endpoint |
`inferenceLoadBalancerHA` | `True` or `False`, default `True`. By default, the AzureML extension deploys three ingress controller replicas for high availability, which requires at least three workers in the cluster. Set this to `False` if you have fewer than three workers and want to deploy the AzureML extension for development and testing only; in this case it deploys only one ingress controller replica. | N/A | Optional | Optional |
`openshift` | `True` or `False`, default `False`. Set to `True` if you deploy the AzureML extension on an ARO or OCP cluster. The deployment process automatically compiles a policy package and loads it on each node so AzureML service operations can function properly. | Optional | Optional | Optional |
`nodeSelector` | Set the node selector so the extension components and the training/inference workloads are only deployed to the nodes with all specified selectors. Usage: `nodeSelector.key=value`; multiple selectors are supported. Example: `nodeSelector.node-purpose=worker nodeSelector.node-region=eastus` | Optional | Optional | Optional |
`installNvidiaDevicePlugin` | `True` or `False`, default `False`. The Nvidia Device Plugin is required for ML workloads on Nvidia GPU hardware. By default, the AzureML extension deployment does not install the Nvidia Device Plugin, regardless of whether the Kubernetes cluster has GPU hardware. Set this to `True` to have the extension install the Nvidia Device Plugin, but make sure the prerequisites are ready beforehand. | Optional | Optional | Optional |
`blobCsiDriverEnabled` | `True` or `False`, default `True`. The Blob CSI driver is required for ML workloads. Set this to `False` if it is already installed. | Optional | Optional | Optional |
`reuseExistingPromOp` | `True` or `False`, default `False`. The AzureML extension needs a Prometheus operator to manage Prometheus. Set to `True` to reuse the existing Prometheus operator. Compatible kube-prometheus-stack helm chart versions are 9.3.4 through 30.0.1. | Optional | Optional | Optional |
`volcanoScheduler.enable` | `True` or `False`, default `True`. The AzureML extension needs the Volcano scheduler to schedule jobs. Set to `False` to reuse the existing Volcano scheduler. Supported Volcano scheduler versions are 1.4 and 1.5. | Optional | N/A | Optional |
`logAnalyticsWS` | `True` or `False`, default `False`. The AzureML extension integrates with Azure LogAnalytics Workspace to provide log viewing and analysis capability through a LogAnalytics Workspace. This setting must be explicitly set to `True` if you want to use this capability. LogAnalytics Workspace costs may apply. | N/A | Optional | Optional |
Configuration Protected Setting Key Name | Description | Training | Inference | Training and Inference |
---|---|---|---|---|
`sslCertPemFile`, `sslKeyPemFile` | Path to the SSL certificate and key file (PEM-encoded). Required for AzureML extension deployment with HTTPS endpoint support for inference, when `allowInsecureConnections` is set to `False`. | N/A | Optional | Optional |
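For the `sslSecret` option in the table above, the secret layout might look like the following. This is a minimal sketch: the secret name is your choice (pass it via `sslSecret`), and both `data` values must be base64-encoded per standard Kubernetes Secret conventions:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: <your-ssl-secret-name>   # the value you pass as sslSecret
  namespace: azureml
type: Opaque
data:
  cert.pem: <base64-encoded PEM certificate>
  key.pem: <base64-encoded PEM key>
```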
- For AzureML extension deployment on an ARO or OCP cluster, grant privileged access to the AzureML service accounts: run the `oc edit scc privileged` command and add the following service accounts under "users:":
system:serviceaccount:azure-arc:azure-arc-kube-aad-proxy-sa
system:serviceaccount:azureml:{EXTENSION-NAME}-kube-state-metrics
system:serviceaccount:azureml:cluster-status-reporter
system:serviceaccount:azureml:prom-admission
system:serviceaccount:azureml:default
system:serviceaccount:azureml:prom-operator
system:serviceaccount:azureml:csi-blob-node-sa
system:serviceaccount:azureml:csi-blob-controller-sa
system:serviceaccount:azureml:load-amlarc-selinux-policy-sa
system:serviceaccount:azureml:azureml-fe
system:serviceaccount:azureml:prom-prometheus
system:serviceaccount:{KUBERNETES-COMPUTE-NAMESPACE}:default
system:serviceaccount:azureml:azureml-ingress-nginx
system:serviceaccount:azureml:azureml-ingress-nginx-admission
Notes
- `{EXTENSION-NAME}`: the extension name specified with the `az k8s-extension create --name` CLI command.
- `{KUBERNETES-COMPUTE-NAMESPACE}`: the namespace of the Kubernetes compute specified with the `az ml compute attach --namespace` CLI command. Skip configuring `system:serviceaccount:{KUBERNETES-COMPUTE-NAMESPACE}:default` if no namespace is specified with the `az ml compute attach` CLI command.
- If you use an Azure Kubernetes Service (AKS) cluster and it's not connected to Azure Arc, register the following feature for your subscription:
az feature register --namespace Microsoft.ContainerService -n AKS-ExtensionManager
The following CLI command will deploy the AzureML extension and enable the Kubernetes cluster for model training and batch inference workloads:
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --config enableTraining=True --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
Notes:
- If you deploy the AzureML extension on AKS directly without an Azure Arc connection, change the `--cluster-type` parameter value to `managedClusters`.
Depending on your network setup, Kubernetes distribution variant, and where your Kubernetes cluster is hosted (in the cloud or on-premises), choose one of the following options to deploy the AzureML extension.
Notes:
- If you deploy the AzureML extension on AKS directly without an Azure Arc connection, change the `--cluster-type` parameter value to `managedClusters`.
- Public HTTPS endpoints support with public load balancer
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableInference=True inferenceRouterServiceType=loadBalancer sslCname=<cname> --config-protected sslCertPemFile=<path-to-the-SSL-cert-PEM-file> sslKeyPemFile=<path-to-the-SSL-key-PEM-file> --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
- Private HTTPS endpoints support with internal load balancer
az k8s-extension create --name amlarc-compute --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableInference=True inferenceRouterServiceType=loadBalancer internalLoadBalancerProvider=azure sslCname=<cname> --config-protected sslCertPemFile=<path-to-the-SSL-cert-PEM-file> sslKeyPemFile=<path-to-the-SSL-key-PEM-file> --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
- HTTPS endpoints support with NodePort
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --resource-group <resource-group> --scope cluster --config enableInference=True inferenceRouterServiceType=nodePort sslCname=<cname> --config-protected sslCertPemFile=<path-to-the-SSL-cert-PEM-file> sslKeyPemFile=<path-to-the-SSL-key-PEM-file> --auto-upgrade-minor-version False
Note:
- Using a NodePort gives you the freedom to set up your own load-balancing solution, to configure environments that are not fully supported by Kubernetes, or even to expose one or more nodes' IPs directly.
- When you deploy with a NodePort service, the scoring URL (or swagger URL) is returned with one of the node IPs (for example, `https://<NodeIP>:<NodePort>/<scoring_path>`) and remains unchanged even if that node becomes unavailable. You can replace it with any other node IP.
- Private HTTP endpoints support with internal load balancer
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableInference=True inferenceRouterServiceType=loadBalancer internalLoadBalancerProvider=azure allowInsecureConnections=True --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
- HTTP endpoints support with NodePort
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableInference=True inferenceRouterServiceType=nodePort allowInsecureConnections=True --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
- Public HTTP endpoints support with public load balancer - the least secure way, NOT recommended
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableInference=True inferenceRouterServiceType=loadBalancer allowInsecureConnections=True --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
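The NodePort note above says the scoring URL keeps its original node IP but that any other node's IP can be substituted. A minimal sketch of that substitution, using a hypothetical helper built on the Python standard library:

```python
from urllib.parse import urlsplit, urlunsplit

def swap_node_ip(scoring_url, new_ip):
    """Replace the host portion of a NodePort scoring URL while keeping the
    scheme, port, and scoring path intact. Hypothetical helper, not an
    AzureML SDK function."""
    parts = urlsplit(scoring_url)
    netloc = f"{new_ip}:{parts.port}" if parts.port else new_ip
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# Point the scoring URL at a different, healthy node:
print(swap_node_ip("https://10.0.0.4:30500/api/v1/service/score", "10.0.0.7"))
# → https://10.0.0.7:30500/api/v1/service/score
```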
To enable the Kubernetes cluster for all kinds of ML workloads, choose one of the above inference deployment options and append the config settings for training and batch inference. The following CLI command enables a cluster with real-time inference HTTPS endpoint support, training, and batch inference workloads:
az k8s-extension create --name arcml-extension --extension-type Microsoft.AzureML.Kubernetes --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --config enableTraining=True enableInference=True inferenceRouterServiceType=loadBalancer sslCname=<cname> --config-protected sslCertPemFile=<path-to-the-SSL-cert-PEM-file> sslKeyPemFile=<path-to-the-SSL-key-PEM-file> --resource-group <resource-group> --scope cluster --auto-upgrade-minor-version False
Notes:
- If you deploy the AzureML extension on AKS directly without an Azure Arc connection, change the `--cluster-type` parameter value to `managedClusters`.
- Run the following CLI command to check the AzureML extension details:
az k8s-extension show --name arcml-extension --cluster-type connectedClusters --cluster-name <your-connected-cluster-name> --resource-group <resource-group>
- In the response, look for "name": "arcml-extension" and "provisioningState": "Succeeded". Note that it might show "provisioningState": "Pending" for the first few minutes.
- If the provisioningState shows Succeeded, run the following command on your machine with the kubeconfig file pointed to your cluster to check that all pods under the "azureml" namespace are in the 'Running' state:
kubectl get pods -n azureml
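The provisioning check above can be scripted. The sketch below parses the JSON that `az k8s-extension show` prints to stdout; `extension_succeeded` is a hypothetical helper, and the canned `raw` string stands in for the real command output:

```python
import json

def extension_succeeded(show_output, name="arcml-extension"):
    """Return True when the `az k8s-extension show` JSON reports a finished
    install. Hypothetical helper; the field names match the response fields
    described above ("name", "provisioningState")."""
    ext = json.loads(show_output)
    return ext.get("name") == name and ext.get("provisioningState") == "Succeeded"

# Canned output for illustration; in practice capture the real command's stdout,
# e.g. subprocess.run(["az", "k8s-extension", "show", ...], capture_output=True).
raw = '{"name": "arcml-extension", "provisioningState": "Succeeded"}'
print(extension_succeeded(raw))  # → True
```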
Use the `k8s-extension update` CLI command to update the mutable properties of the AzureML extension; review the list of required and optional parameters for the `k8s-extension update` CLI command here.
- Azure Arc supports updates of `--auto-upgrade-minor-version`, `--version`, `--config`, and `--config-protected`.
- For configurationSettings, only the settings that require an update need to be provided. If the user provides all settings, they are merged/overwritten with the provided values.
- For configurationProtectedSettings, ALL settings should be provided. If some settings are omitted, those settings are considered obsolete and deleted.
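The merge-versus-replace behavior above can be modeled as follows. This is a simplified sketch that mimics the documented semantics, not the extension's actual implementation:

```python
def apply_update(settings, protected, new_settings=None, new_protected=None):
    """Model of the documented update semantics (assumption: simplified model).
    - configurationSettings: provided keys are merged over the existing ones.
    - configurationProtectedSettings: the provided set replaces the old set
      wholesale, so any omitted key is dropped."""
    merged = {**settings, **(new_settings or {})}
    replaced = dict(new_protected) if new_protected is not None else dict(protected)
    return merged, replaced

s, p = apply_update(
    {"enableInference": "True"},
    {"sslCertPemFile": "/certs/cert.pem", "sslKeyPemFile": "/certs/key.pem"},
    new_settings={"logAnalyticsWS": "False"},
    new_protected={"sslCertPemFile": "/certs/cert.pem"},  # key file omitted → deleted
)
print(s)  # → {'enableInference': 'True', 'logAnalyticsWS': 'False'}
print(p)  # → {'sslCertPemFile': '/certs/cert.pem'}
```

This is why the `logAnalyticsWS` note below tells you to re-supply all original protected settings on update.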
Important
Don't update the following configs if you have active training workloads or real-time inference endpoints; otherwise, the training jobs will be impacted and the endpoints made unavailable.
- `enableTraining` from `True` to `False`
- `installNvidiaDevicePlugin` from `True` to `False` when using GPU
- `nodeSelector`. The update operation can't remove existing nodeSelectors; it can only update existing ones or add new ones.
Don't update the following configs if you have active real-time inference endpoints; otherwise, the endpoints will be unavailable.
- `allowInsecureConnections`
- `inferenceRouterServiceType`
- `internalLoadBalancerProvider`
- To update `logAnalyticsWS` from `True` to `False`, provide all original `configurationProtectedSettings`. Otherwise, those settings are considered obsolete and deleted.
Use the `k8s-extension delete` CLI command to delete the installed AzureML extension.
It takes around 10 minutes to delete all components deployed to the Kubernetes cluster. Run `kubectl get pods -n azureml` to check whether all components were deleted.