Commit acf5bc0

Edit AI App Gateway instructions
2 parents d2169fc + f0249b7 commit acf5bc0

9 files changed

+160
-121
lines changed

docs/deploying_clearml/enterprise_deploy/appgw.md

+3-3
Original file line number | Diff line number | Diff line change
@@ -30,12 +30,12 @@ their instances:
3030
* [Embedding Model Deployment](../../webapp/applications/apps_embed_model_deployment.md)
3131
* [Llama.cpp Model Deployment](../../webapp/applications/apps_llama_deployment.md)
3232

33-
The AI Application Gateway is provided through an additional component to the ClearML Server deployment: The ClearML Task Traffic Router.
34-
If your ClearML Deployment does not have the Task Traffic Router properly installed, these application instances may not be accessible.
33+
The AI Application Gateway requires an additional component to the ClearML Server deployment: the **ClearML App Gateway Router**.
34+
If your ClearML Deployment does not have the App Gateway Router properly installed, these application instances may not be accessible.
3535

3636
#### Installation
3737

38-
The Task Traffic Router supports two deployment options:
38+
The App Gateway Router supports two deployment options:
3939

4040
* [Docker Compose](appgw_install_compose.md)
4141
* [Kubernetes](appgw_install_k8s.md)

docs/deploying_clearml/enterprise_deploy/appgw_install_compose.md

+87-62
@@ -40,77 +40,72 @@ This is an example of the `docker-compose` file you will need:
4040
```
4141
version: '3.5'
4242
services:
43-
task_traffic_webserver:
44-
image: allegroai/task-traffic-router-webserver:${TASK-TRAFFIC-ROUTER-WEBSERVER-TAG}
45-
ports:
46-
- "80:8080"
47-
restart: unless-stopped
48-
container_name: task_traffic_webserver
49-
volumes:
50-
- ./task_traffic_router/config/nginx:/etc/nginx/conf.d:ro
51-
- ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:ro
52-
task_traffic_router:
53-
image: allegroai/task-traffic-router:${TASK-TRAFFIC-ROUTER-TAG}
54-
restart: unless-stopped
55-
container_name: task_traffic_router
56-
volumes:
57-
- /var/run/docker.sock:/var/run/docker.sock
58-
- ./task_traffic_router/config/nginx:/etc/nginx/conf.d:rw
59-
- ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:rw
60-
environment:
61-
- LOGGER_LEVEL=INFO
62-
- CLEARML_API_HOST=${CLEARML_API_HOST:?err}
63-
- CLEARML_API_ACCESS_KEY=${CLEARML_API_ACCESS_KEY:?err}
64-
- CLEARML_API_SECRET_KEY=${CLEARML_API_SECRET_KEY:?err}
65-
- ROUTER_URL=${ROUTER_URL:?err}
66-
- ROUTER_NAME=${ROUTER_NAME:?err}
67-
- AUTH_ENABLED=${AUTH_ENABLED:?err}
68-
- SSL_VERIFY=${SSL_VERIFY:?err}
69-
- AUTH_COOKIE_NAME=${AUTH_COOKIE_NAME:?err}
70-
- AUTH_BASE64_JWKS_KEY=${AUTH_BASE64_JWKS_KEY:?err}
71-
- LISTEN_QUEUE_NAME=${LISTEN_QUEUE_NAME}
72-
- EXTRA_BASH_COMMAND=${EXTRA_BASH_COMMAND}
73-
- TCP_ROUTER_ADDRESS=${TCP_ROUTER_ADDRESS}
74-
- TCP_PORT_START=${TCP_PORT_START}
75-
- TCP_PORT_END=${TCP_PORT_END}
76-
43+
task_traffic_webserver:
44+
image: clearml/ai-gateway-proxy:${PROXY_TAG:?err}
45+
network_mode: "host"
46+
restart: unless-stopped
47+
container_name: task_traffic_webserver
48+
volumes:
49+
- ./task_traffic_router/config/nginx:/etc/nginx/conf.d:ro
50+
- ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:ro
51+
task_traffic_router:
52+
image: clearml/ai-gateway-router:${ROUTER_TAG:?err}
53+
restart: unless-stopped
54+
container_name: task_traffic_router
55+
volumes:
56+
- /var/run/docker.sock:/var/run/docker.sock
57+
- ./task_traffic_router/config/nginx:/etc/nginx/conf.d:rw
58+
- ./task_traffic_router/config/lua:/usr/local/openresty/nginx/lua:rw
59+
environment:
60+
- ROUTER_NAME=${ROUTER_NAME:?err}
61+
- ROUTER__WEBSERVER__SERVER_PORT=${ROUTER__WEBSERVER__SERVER_PORT:?err}
62+
- ROUTER_URL=${ROUTER_URL:?err}
63+
- CLEARML_API_HOST=${CLEARML_API_HOST:?err}
64+
- CLEARML_API_ACCESS_KEY=${CLEARML_API_ACCESS_KEY:?err}
65+
- CLEARML_API_SECRET_KEY=${CLEARML_API_SECRET_KEY:?err}
66+
- AUTH_COOKIE_NAME=${AUTH_COOKIE_NAME:?err}
67+
- AUTH_SECURE_ENABLED=${AUTH_SECURE_ENABLED}
68+
- TCP_ROUTER_ADDRESS=${TCP_ROUTER_ADDRESS}
69+
- TCP_PORT_START=${TCP_PORT_START}
70+
- TCP_PORT_END=${TCP_PORT_END}
7771
```
7872

79-
Create a *runtime.env* file containing the following entries:
73+
Create a `runtime.env` file containing the following entries:
8074

8175
```
82-
TASK-TRAFFIC-ROUTER-WEBSERVER-TAG=
83-
TASK-TRAFFIC-ROUTER-TAG=
84-
CLEARML_API_HOST=https://api.
76+
PROXY_TAG=
77+
ROUTER_TAG=
78+
ROUTER_NAME=main-router
79+
ROUTER__WEBSERVER__SERVER_PORT=8010
80+
ROUTER_URL=
81+
CLEARML_API_HOST=
8582
CLEARML_API_ACCESS_KEY=
8683
CLEARML_API_SECRET_KEY=
87-
ROUTER_URL=
88-
ROUTER_NAME=main-router
89-
AUTH_ENABLED=true
90-
SSL_VERIFY=true
9184
AUTH_COOKIE_NAME=
92-
AUTH_BASE64_JWKS_KEY=
93-
LISTEN_QUEUE_NAME=
94-
EXTRA_BASH_COMMAND=
85+
AUTH_SECURE_ENABLED=true
9586
TCP_ROUTER_ADDRESS=
9687
TCP_PORT_START=
9788
TCP_PORT_END=
9889
```
9990

10091
Edit it according to the following guidelines:
101-
102-
* `CLEARML_API_HOST`: URL usually starting with `https://api.`
103-
* `CLEARML_API_ACCESS_KEY`: ClearML server api key
104-
* `CLEARML_API_SECRET_KEY`: ClearML server secret key
105-
* `ROUTER_URL`: URL for this router that was previously configured in the load balancer starting with `https://`
106-
* `ROUTER_NAME`: Unique name for this router
107-
* `AUTH_ENABLED`: Enable or disable http calls authentication when the router is communicating with the ClearML server
108-
* `SSL_VERIFY`: Enable or disable SSL certificate validation when the router is communicating with the ClearML server
109-
* `AUTH_COOKIE_NAME`: Cookie name used by the ClearML server to store the ClearML authentication cookie. This can usually be found in the `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in the ClearML server installation (`/opt/allegro/config/envoy/envoy.yaml`) (see below)
110-
* `AUTH_SECURE_ENABLED`: Enable the Set-Cookie `secure` parameter
111-
* `AUTH_BASE64_JWKS_KEY`: Value form `k` key in the `jwks.json` file in the ClearML server installation
112-
* `LISTEN_QUEUE_NAME`: (*optional*) Name of queue to check for tasks (if none, every task is checked)
113-
* `EXTRA_BASH_COMMAND`: Command to be launched before starting the router
92+
* `PROXY_TAG`: AI Application Gateway proxy tag. The Docker image tag for the proxy component, which needs to be
93+
specified during installation. This tag is provided by ClearML to ensure compatibility with the recommended version.
94+
* `ROUTER_TAG`: App Gateway Router tag. The Docker image tag for the router component. It defines the specific version
95+
to be installed and is provided by ClearML as part of the setup process.
96+
* `ROUTER_NAME`: In the case of [multiple routers on the same tenant](#multiple-router-in-the-same-tenant), each router
97+
needs to have a unique name.
98+
* `ROUTER__WEBSERVER__SERVER_PORT`: Webserver port. The default port is 8080, but it can be adjusted to meet specific network requirements.
99+
* `ROUTER_URL`: External address to access the router. This can be the IP address or DNS of the node where the router
100+
is running, or the address of a load balancer if the router operates behind a proxy/load balancer. This URL is used
101+
to access AI workload applications (e.g. remote IDE, model deployment, etc.), so it must be reachable and resolvable for them.
102+
* `CLEARML_API_HOST`: ClearML API server URL starting with `https://api.`
103+
* `CLEARML_API_ACCESS_KEY`: ClearML server API key.
104+
* `CLEARML_API_SECRET_KEY`: ClearML server secret key.
105+
* `AUTH_COOKIE_NAME`: Name of the cookie the ClearML server uses for authentication. This can usually be
106+
found in the `envoy.yaml` file in the ClearML server installation (`/opt/allegro/config/envoy/envoy.yaml`), under the
107+
`value_prefix` key starting with `allegro_token`
108+
* `AUTH_SECURE_ENABLED`: Enable the Set-Cookie `secure` parameter. Set to `false` if services are exposed over plain `http`.
114109
* `TCP_ROUTER_ADDRESS`: Router external address. This can be the IP of the host machine or a load balancer hostname, depending on the network configuration
115110
* `TCP_PORT_START`: Start port for the TCP Session feature
116111
* `TCP_PORT_END`: End port for the TCP Session feature
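
The entries that the compose file references as `${VAR:?err}` must be set and non-empty, otherwise `docker compose` stops with an error. A minimal Python sketch of a pre-flight check is shown here. It is not part of ClearML: the helper names `parse_env_text` and `missing_required` are hypothetical, the key list is copied from the `:?err` variables in the compose file above, and the sample values are placeholders.

```python
from typing import Dict, List

# Variables the compose file references as ${VAR:?err}. Docker Compose
# aborts when any of them is unset or empty.
REQUIRED_KEYS = [
    "PROXY_TAG",
    "ROUTER_TAG",
    "ROUTER_NAME",
    "ROUTER__WEBSERVER__SERVER_PORT",
    "ROUTER_URL",
    "CLEARML_API_HOST",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
    "AUTH_COOKIE_NAME",
]


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_required(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder content standing in for a partially filled runtime.env.
sample = "PROXY_TAG=example-tag\nROUTER_TAG=\nROUTER_NAME=main-router\n"
print(missing_required(parse_env_text(sample)))
```

To check a real file, read `runtime.env` as text and pass its content to `parse_env_text`. An empty result means every `:?err` entry has a value.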
@@ -121,12 +116,42 @@ Run the following command to start the router:
121116
sudo docker compose --env-file runtime.env up -d
122117
```
123118

124-
:::note How to find my jwkskey
119+
### Advanced Configuration
125120

126-
The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).
121+
#### Using Open HTTP
127122

128-
In a `docker-compose` server installation, this can be found in the `CLEARML__secure__auth__token_secret` env var in the apiserver server component.
123+
To deploy the App Gateway Router on open HTTP (without a certificate), set the `AUTH_SECURE_ENABLED` entry
124+
to `false` in the `runtime.env` file.
129125

130-
:::
126+
#### Multiple Router in the Same Tenant
127+
128+
If you have workloads running in separate networks that cannot communicate with each other, you need to deploy multiple
129+
routers, one for each isolated environment. Each router will only process tasks from designated queues, ensuring that
130+
tasks are correctly routed to agents within the same network.
131+
132+
For example:
133+
* If Agent A and Agent B are in separate networks, each must have its own router to receive tasks.
134+
* Router A will handle tasks from Agent A’s queues. Router B will handle tasks from Agent B’s queues.
135+
136+
To achieve this, each router must be configured with:
137+
* A unique `ROUTER_NAME`
138+
* A distinct set of queues defined in `LISTEN_QUEUE_NAME`.
139+
140+
##### Example Configuration
141+
Each router's `runtime.env` file should include:
142+
143+
* Router A:
144+
145+
```
146+
ROUTER_NAME=router-a
147+
LISTEN_QUEUE_NAME=queue1,queue2
148+
```
131149

150+
* Router B:
132151

152+
```
153+
ROUTER_NAME=router-b
154+
LISTEN_QUEUE_NAME=queue3,queue4
155+
```
156+
157+
Make sure `LISTEN_QUEUE_NAME` is set in the [`docker-compose` environment variables](#docker-compose-file) for each router instance.
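
The two requirements above (a unique `ROUTER_NAME` and a distinct set of queues per router) can be checked mechanically before deploying. The following Python sketch is illustrative only: `find_conflicts` is a hypothetical helper, and it assumes that "distinct" means no queue appears in more than one router's `LISTEN_QUEUE_NAME` and that queues are comma-separated as in the example configuration.

```python
from collections import Counter
from itertools import combinations
from typing import List, Set, Tuple


def parse_queues(value: str) -> Set[str]:
    """Split a comma-separated LISTEN_QUEUE_NAME value into a set of names."""
    return {queue.strip() for queue in value.split(",") if queue.strip()}


def find_conflicts(routers: List[Tuple[str, str]]) -> List[str]:
    """Report duplicate router names and queues shared between routers.

    Each item is a (ROUTER_NAME, LISTEN_QUEUE_NAME) pair taken from one
    router's runtime.env file.
    """
    problems: List[str] = []
    name_counts = Counter(name for name, _ in routers)
    for name, count in sorted(name_counts.items()):
        if count > 1:
            problems.append(f"ROUTER_NAME '{name}' is used {count} times")
    for (name_a, queues_a), (name_b, queues_b) in combinations(routers, 2):
        shared = parse_queues(queues_a) & parse_queues(queues_b)
        if shared:
            problems.append(
                f"{name_a} and {name_b} both listen on: {', '.join(sorted(shared))}"
            )
    return problems


# The Router A / Router B example from this section has no conflicts.
routers = [("router-a", "queue1,queue2"), ("router-b", "queue3,queue4")]
print(find_conflicts(routers))  # prints []
```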

docs/deploying_clearml/enterprise_deploy/appgw_install_k8s.md

+60-45
@@ -3,17 +3,26 @@ title: Kubernetes Deployment
33
---
44

55
:::important Enterprise Feature
6-
The Application Gateway is available under the ClearML Enterprise plan.
6+
The AI Application Gateway is available under the ClearML Enterprise plan.
7+
:::
8+
9+
This guide details the installation of the ClearML App Gateway Router.
10+
The App Gateway Router enables access to your AI workload applications (e.g. remote IDEs like VSCode and Jupyter, model API interface, etc.).
11+
It acts as a proxy, identifying ClearML Tasks running within its [K8s namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/)
12+
and making them available for network access.
13+
14+
:::important
15+
The App Gateway Router must be installed in the same K8s namespace as a dedicated ClearML Agent.
16+
It can only configure access for ClearML Tasks within its own namespace.
717
:::
818

9-
This guide details the installation of the ClearML AI Application Gateway, specifically the ClearML Task Router Component.
1019

1120
## Requirements
1221

1322
* Kubernetes cluster: `>= 1.21.0-0 < 1.32.0-0`
1423
* Helm installed and configured
15-
* Helm token to access `allegroai` helm-chart repo
16-
* Credentials for `allegroai` docker repo
24+
* Helm token to access `clearml` helm-chart repo
25+
* Credentials for `clearml` docker repo
1726
* A valid ClearML Server installation
1827

1928
## Optional for HTTPS
@@ -26,62 +35,55 @@ This guide details the installation of the ClearML AI Application Gateway, speci
2635
### Login
2736

2837
```
29-
helm repo add allegroai-enterprise \
38+
helm repo add clearml-enterprise \
3039
https://raw.githubusercontent.com/clearml/clearml-enterprise-helm-charts/gh-pages \
3140
--username <GITHUB_TOKEN> \
3241
--password <GITHUB_TOKEN>
3342
```
3443

44+
Replace `<GITHUB_TOKEN>` with your valid GitHub token that has access to the ClearML Enterprise Helm charts repository.
45+
3546
### Prepare Values
3647

37-
Before installing the TTR, create a `helm-override` files named `task-traffic-router.values-override.yaml`:
48+
Before installing the App Gateway Router, create a Helm override file:
3849

3950
```
4051
imageCredentials:
41-
password: "<DOCKERHUB_TOKEN>"
52+
password: ""
4253
clearml:
43-
apiServerKey: ""
44-
apiServerSecret: ""
45-
apiServerUrlReference: "https://api."
46-
jwksKey: ""
47-
authCookieName: ""
54+
apiServerKey: ""
55+
apiServerSecret: ""
56+
apiServerUrlReference: ""
57+
authCookieName: ""
58+
sslVerify: true
4859
ingress:
49-
enabled: true
50-
hostName: "task-router.dev"
60+
enabled: true
61+
hostName: ""
5162
tcpSession:
52-
routerAddress: ""
53-
portRange:
54-
start:
55-
end:
63+
routerAddress: ""
64+
service:
65+
type: LoadBalancer
66+
portRange:
67+
start:
68+
end:
5669
```
5770

58-
Edit it accordingly to these guidelines:
59-
60-
* `clearml.apiServerUrlReference`: URL usually starting with `https://api.`
61-
* `clearml.apiServerKey`: ClearML server api key
62-
* `clearml.apiServerSecret`: ClearML server secret key
63-
* `ingress.hostName`: URL of router we configured previously for load balancer starting with `https://`
64-
* `clearml.sslVerify`: Enable or disable SSL certificate validation on apiserver calls check
65-
* `clearml.authCookieName`: Value from `value_prefix` key starting with `allegro_token` in `envoy.yaml` file in ClearML server installation.
66-
* `clearml.jwksKey`: Value form `k` key in `jwks.json` file in ClearML server installation (see below)
67-
* `tcpSession.routerAddress`: Router external address can be an IP or the host machine or a load balancer hostname, depends on the network configuration
68-
* `tcpSession.portRange.start`: Start port for the TCP Session feature
69-
* `tcpSession.portRange.end`: End port for the TCP Session feature
70-
71-
:::note How to find my jwkskey
71+
Configuration options:
7272

73-
The *JSON Web Key Set* (*JWKS*) is a set of keys containing the public keys used to verify any JSON Web Token (JWT).
74-
75-
```
76-
kubectl -n clearml get secret clearml-conf \
77-
-o jsonpath='{.data.secure_auth_token_secret}' \
78-
| base64 -d && echo
79-
```
80-
81-
:::
73+
* `imageCredentials.password`: ClearML DockerHub Access Token.
74+
* `clearml.apiServerKey`: ClearML server API key.
75+
* `clearml.apiServerSecret`: ClearML server secret key.
76+
* `clearml.apiServerUrlReference`: ClearML API server URL starting with `https://api.`.
77+
* `clearml.authCookieName`: Cookie used by the ClearML server to store the ClearML authentication cookie.
78+
* `clearml.sslVerify`: Enable or disable SSL certificate validation on calls to the `apiserver`.
79+
* `ingress.hostName`: Hostname that the ingress controller uses to reach the router.
80+
* `tcpSession.routerAddress`: The external router address (can be an IP, hostname, or load balancer address) depending on your network setup. Ensure this address is accessible for TCP connections.
81+
* `tcpSession.service.type`: Service type used to expose TCP functionality, default is `NodePort`.
82+
* `tcpSession.portRange.start`: Start port for the TCP Session feature.
83+
* `tcpSession.portRange.end`: End port for the TCP Session feature.
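
If you use the TCP Session feature, the `tcpSession.portRange` values left blank in the override file need to form a valid range. As a hedged illustration (the function name `port_range_problems` and the example port numbers are hypothetical, and the only rules applied are generic TCP constraints rather than anything specific to the chart), a check could look like this:

```python
from typing import List, Optional


def port_range_problems(start: Optional[int], end: Optional[int]) -> List[str]:
    """Return a list of problems found in a TCP port range (empty if none)."""
    problems: List[str] = []
    for label, port in (("start", start), ("end", end)):
        if port is None:
            problems.append(f"tcpSession.portRange.{label} is not set")
        elif not 1 <= port <= 65535:
            problems.append(
                f"tcpSession.portRange.{label}={port} is outside 1-65535"
            )
    # Only compare the bounds once both are present and individually valid.
    if not problems and start > end:
        problems.append(
            "tcpSession.portRange.start is greater than tcpSession.portRange.end"
        )
    return problems


# Example values only; choose a range that is free in your environment.
print(port_range_problems(10000, 10100))  # prints []
```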
8284

8385

84-
The whole list of supported configuration is available with the command:
86+
The full list of supported configuration options is available with the command:
8587

8688
```
8789
helm show readme allegroai-enterprise/clearml-enterprise-task-traffic-router
@@ -94,9 +96,22 @@ To install the TTR component via Helm use the following command:
9496
```
9597
helm upgrade --install \
9698
<RELEASE_NAME> \
97-
-n <NAME_SPACE> \
99+
-n <WORKLOAD_NAMESPACE> \
98100
allegroai-enterprise/clearml-enterprise-task-traffic-router \
99-
--version <CURRENT CHART VERSION> \
100-
-f task-traffic-router.values-override.yaml
101+
--version <CHART_VERSION> \
102+
-f override.yaml
101103
```
102104

105+
Replace the placeholders with the following values:
106+
107+
* `<RELEASE_NAME>` - Unique name for the App Gateway Router within the K8s namespace. This is a required parameter in
108+
Helm, which identifies a specific installation of the chart. The release name also defines the router’s name and
109+
appears in the UI within AI workload application URLs (e.g. Remote IDE URLs). This can be customized to support multiple installations within the same
110+
namespace by assigning different release names.
111+
* `<WORKLOAD_NAMESPACE>` - [Kubernetes Namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/)
112+
where workloads will be executed. This namespace must be shared between a dedicated ClearML Agent and an App
113+
Gateway Router. The agent is responsible for monitoring its assigned task queues and spawning workloads within this
114+
namespace. The router monitors the same namespace for AI workloads (e.g. remote IDE applications). The router has a
115+
namespace-limited scope, meaning it can only detect and manage tasks within its
116+
assigned namespace.
117+
* `<CHART_VERSION>` - Version recommended by the ClearML Support Team.

docs/deploying_clearml/enterprise_deploy/multi_tenant_k8s.md

+5-6
@@ -513,31 +513,30 @@ Create a `NetworkPolicy` in the tenant namespace with the following configuratio
513513
- podSelector: {}
514514
```
515515

516-
### Install Task Traffic Router Chart
516+
### Install the App Gateway Router Chart
517517

518-
Install the [Task Traffic Router](appgw.md) in your Kubernetes cluster, allowing it to manage and route tasks:
518+
Install the App Gateway Router in your Kubernetes cluster, allowing it to manage and route tasks:
519519

520520
1. Prepare the `overrides.yaml` file with the following content:
521521

522522
```
523523
imageCredentials:
524-
password: "<allegroaienterprise_DockerHub_TOKEN>"
524+
password: "<clearmlenterprise_DockerHub_TOKEN>"
525525
clearml:
526526
apiServerUrlReference: "<http://clearml-enterprise-apiserver.clearml:8008>"
527527
apiserverKey: "<TENANT_KEY>"
528528
apiserverSecret: "<TENANT_SECRET>"
529-
jwksKey: "<JWKS_KEY>"
530529
ingress:
531530
enabled: true
532531
hostName: "<unique url in same domain as apiserver/webserver>"
533532
```
534533

535-
2. Install Task Traffic Router in the specified tenant namespace:
534+
2. Install App Gateway Router in the specified tenant namespace:
536535

537536
```
538537
helm install -n <TENANT_NAMESPACE> \\
539538
clearml-ttr \\
540-
allegroai-enterprise/clearml-task-traffic-router \\
539+
clearml-enterprise/clearml-task-traffic-router \\
541540
--create-namespace \\
542541
-f overrides.yaml
543542
```

docs/webapp/applications/apps_embed_model_deployment.md

+1-1
@@ -13,7 +13,7 @@ running, it serves your embedding model through a secure, publicly accessible ne
1313
endpoint activity and shuts down if the model remains inactive for a specified maximum idle time.
1414

1515
:::info AI Application Gateway
16-
The Embedding Model Deployment app makes use of the ClearML Traffic Router which implements a secure, authenticated
16+
The Embedding Model Deployment app makes use of the App Gateway Router which implements a secure, authenticated
1717
network endpoint for the model.
1818

1919
If the ClearML AI Application Gateway is not available, the model endpoint might not be accessible.
