Merged
159 changes: 38 additions & 121 deletions doc/source/cluster/kubernetes/user-guides/kuberay-auth.md
@@ -1,18 +1,16 @@
(kuberay-auth)=

# Configure Ray clusters with authentication and access control using KubeRay
# Configure Ray clusters to use token authentication

This guide demonstrates how to secure Ray clusters deployed with KubeRay by enabling authentication and access control using Kubernetes Role-Based Access Control (RBAC).

> **Note:** This guide is only supported for the RayCluster custom resource.
This guide demonstrates how to enable Ray token authentication with KubeRay.

## Prerequisites

* A Kubernetes cluster. This guide uses GKE, but the concepts apply to other Kubernetes distributions.
* `kubectl` installed and configured to interact with your cluster.
* `gcloud` CLI installed and configured, if using GKE.
* [Helm](https://helm.sh/) installed.
* Ray installed locally.
* Ray 2.52.0 or newer.

## Create or use an existing GKE Cluster

@@ -27,129 +25,65 @@ gcloud container clusters create kuberay-cluster \

Follow [Deploy a KubeRay operator](kuberay-operator-deploy) to install the latest stable KubeRay operator from the Helm repository.

## Deploy a Ray cluster with authentication enabled

Deploy a RayCluster configured with `kube-rbac-proxy` for authentication and authorization:
## Deploy a Ray cluster with token authentication

If you are using KubeRay v1.5.1 or newer, you can use the `authOptions` API in RayCluster to enable token authentication:
```bash
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/refs/heads/master/ray-operator/config/samples/ray-cluster.auth.yaml
```

This command deploys:
* A `RayCluster` resource with a `kube-rbac-proxy` sidecar container on the Head Pod. This proxy handles authentication and authorization.
* A `ConfigMap` for kube-rbac-proxy, containing resource attributes required for authorization.
* A `ServiceAccount`, `ClusterRole`, and `ClusterRoleBinding` that allow the `kube-rbac-proxy` to access the Kubernetes TokenReview and SubjectAccessReview APIs.

## Verify initial unauthorized access
When enabled, the KubeRay operator will:
* Create a Kubernetes Secret containing a randomly generated token.
* Automatically set the `RAY_AUTH_TOKEN` and `RAY_AUTH_MODE` environment variables on all Ray containers.

Attempt to submit a Ray job to the cluster to verify that authentication is required. You should receive a `401 Unauthorized` error:
If you are using a KubeRay version older than v1.5.1, you can enable token authentication by creating a Kubernetes Secret containing
your token and configuring the `RAY_AUTH_MODE` and `RAY_AUTH_TOKEN` environment variables.

```bash
kubectl port-forward svc/ray-cluster-with-auth-head-svc 8265:8265 &
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
kubectl create secret generic ray-cluster-with-auth --from-literal=auth_token=$(openssl rand -base64 32)
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/refs/heads/master/ray-operator/config/samples/ray-cluster.auth-manual.yaml
```

You may see an error similar to this:

```
...
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: http://localhost:8265/api/version
```

This error confirms that the Ray cluster requires authentication.

## Configure Kubernetes RBAC for access control

To access the RayCluster, you need:
* **Authentication:** Provide a valid authentication token (e.g., a Kubernetes service account token or a cloud IAM token) in the request headers.
* **Authorization:** Your authenticated user or service account must have the necessary Kubernetes RBAC permissions to access the `RayCluster` resource.

This guide demonstrates granting access using a Kubernetes service account, but the same principles apply to individual Kubernetes users or cloud IAM users.

### Create a Kubernetes service account

Create a service account that represents your Ray job submitter:

```bash
kubectl create serviceaccount ray-user
```
## Verify initial unauthenticated access

Confirm that the service account currently can't access the `RayCluster` resource:
Attempt to submit a Ray job to the cluster to verify that authentication is required. You should receive a `401 Unauthorized` error:

```bash
kubectl auth can-i get rayclusters.ray.io/ray-cluster-with-auth --as=system:serviceaccount:default:ray-user
```

The output should be `no`.

### Grant access using Kubernetes RBAC

Create a `Role` and `RoleBinding` to grant the necessary permissions to the `ray-user` service account:

```yaml
# ray-cluster-rbac.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ray-user
namespace: default
rules:
- apiGroups: ["ray.io"]
resources:
- 'rayclusters'
verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: ray-user
namespace: default
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ray-user
subjects:
- kind: ServiceAccount
name: ray-user
namespace: default
kubectl port-forward svc/ray-cluster-with-auth-head-svc 8265:8265 &
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```

Apply the RBAC configuration:
You should see an error similar to this:

```bash
kubectl apply -f ray-cluster-rbac.yaml
```

### Verify access
RuntimeError: Authentication required: Unauthorized: Missing authentication token

Confirm that the service account now has access to the `RayCluster` resource:
The Ray cluster requires authentication, but no token was provided.

```bash
kubectl auth can-i get rayclusters.ray.io/ray-cluster-with-auth --as=system:serviceaccount:default:ray-user
Please provide an authentication token using one of these methods:
1. Set the `RAY_AUTH_TOKEN` environment variable.
2. Set the `RAY_AUTH_TOKEN_PATH` environment variable (pointing to a file containing the token).
3. Create a token file at the default location: `~/.ray/auth_token`.
```

The output should be `yes`.

## Submit a Ray job with authentication

Now you can submit a Ray job using the service account's authentication token.

Get a token for the `ray-user` service account and store it in the `RAY_JOB_HEADERS` environment variable:
This error confirms that the Ray cluster requires authentication.

```bash
export RAY_JOB_HEADERS="{\"Authorization\": \"Bearer $(kubectl create token ray-user --duration=1h)\"}"
```
## Accessing your Ray cluster with the Ray CLI

> **Note:** `kubectl create token` command is only available on Kubernetes v1.24+
To access your Ray cluster using the Ray CLI, you need to configure the following environment variables:
* `RAY_AUTH_MODE`: set to `token` so that the Ray CLI adds the authorization headers required for token authentication.
* `RAY_AUTH_TOKEN`: the token used for authentication.
* `RAY_AUTH_TOKEN_PATH`: if `RAY_AUTH_TOKEN` isn't set, the Ray CLI reads the token from this path (defaults to `~/.ray/auth_token`).
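
The lookup order in the list above can be sketched as a small shell function. This is only an illustration of the documented precedence, not Ray's actual implementation; the function name `resolve_ray_auth_token` is hypothetical.

```bash
# Illustrative only: mirrors the documented lookup order, not Ray's own code.
resolve_ray_auth_token() {
  # 1. A token set directly in the environment takes precedence.
  if [ -n "${RAY_AUTH_TOKEN:-}" ]; then
    printf '%s\n' "$RAY_AUTH_TOKEN"
    return 0
  fi
  # 2. Otherwise read the file named by RAY_AUTH_TOKEN_PATH,
  #    falling back to the default location ~/.ray/auth_token.
  local token_path="${RAY_AUTH_TOKEN_PATH:-$HOME/.ray/auth_token}"
  if [ -r "$token_path" ]; then
    cat "$token_path"
    return 0
  fi
  # 3. Nothing found: report the failure on stderr.
  echo "no authentication token found" >&2
  return 1
}
```

Running `resolve_ray_auth_token` before `ray job submit` is one way to check which token your shell would hand to the CLI, assuming the CLI follows the order described above.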

Submit the Ray job:
Submit a job with an authenticated Ray CLI:

```bash
export RAY_AUTH_MODE=token
export RAY_AUTH_TOKEN=$(kubectl get secrets ray-cluster-with-auth --template={{.data.auth_token}} | base64 -d)
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```

The job should now succeed, and you should see output similar to this:
The job should now succeed and you should see output similar to this:

```bash
Job submission server address: http://localhost:8265
@@ -176,32 +110,15 @@ Job 'raysubmit_...' succeeded
------------------------------------------
```
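
As an alternative to exporting `RAY_AUTH_TOKEN`, the token can be kept in a file at the default location `~/.ray/auth_token`. The sketch below shows one way to write that file with owner-only permissions; the helper name `save_ray_auth_token` is hypothetical, and whether `RAY_AUTH_MODE=token` must still be exported is assumed from the variable descriptions in this guide.

```bash
# Illustrative helper (hypothetical name): reads a token on stdin and writes it
# to the given path, or to ~/.ray/auth_token when no path is passed.
save_ray_auth_token() {
  local token_path="${1:-$HOME/.ray/auth_token}"
  mkdir -p "$(dirname "$token_path")"
  # With umask 077, a newly created file gets mode 600.
  ( umask 077 && cat > "$token_path" )
  # chmod covers the case where the file already existed with looser permissions.
  chmod 600 "$token_path"
}
```

For example, `kubectl get secrets ray-cluster-with-auth --template={{.data.auth_token}} | base64 -d | save_ray_auth_token` would store the cluster's token without it appearing in your shell environment.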

## Verify access using cloud IAM (Optional)

Most cloud providers allow you to authenticate to the Kubernetes cluster as your cloud IAM user. This method is a convenient way to interact with the cluster without managing separate Kubernetes credentials.

**Example using Google Cloud (GKE):**

Get an access token for your Google Cloud user:

```bash
export RAY_JOB_HEADERS="{\"Authorization\": \"Bearer $(gcloud auth print-access-token)\"}"
```

Submit a Ray job using the IAM token:
## Viewing the Ray dashboard (optional)
To view the Ray dashboard from your browser, first port-forward from your local machine to the cluster:

```bash
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
kubectl port-forward svc/ray-cluster-with-auth-head-svc 8265:8265 &
```

The job should succeed if your cloud user has the necessary Kubernetes RBAC permissions. You may need to configure additional RBAC rules for your cloud user.

## View the Ray dashboard (optional)

To view the Ray dashboard from your browser, first configure port-forwarding:
Then open `localhost:8265` in your browser. You will be prompted to provide the auth token for the cluster, which can be retrieved with:

```bash
kubectl port-forward svc/ray-cluster-with-auth-head-svc 8265:8265 &
kubectl get secrets ray-cluster-with-auth --template={{.data.auth_token}} | base64 -d
```
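
The `base64 -d` step is needed because Kubernetes stores the values under a Secret's `.data` field base64-encoded. A minimal round trip, using a made-up token value rather than a real one, shows what the pipeline above undoes:

```bash
# Made-up token for illustration; a real token comes from the cluster's Secret.
token='example-token'
# Kubernetes keeps Secret data in this encoded form.
encoded=$(printf '%s' "$token" | base64)
# Decoding recovers the original value, as in the kubectl pipeline.
decoded=$(printf '%s' "$encoded" | base64 -d)
```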

Use a Chrome extension like [Requestly](https://requestly.com/) to automatically add authorization headers to requests for the dashboard endpoint `http://localhost:8265`. The authorization header format is: `Authorization: Bearer <token>`.