Restructured and standardized READMEs
arueth committed Nov 8, 2024
1 parent 38f0b4e commit f2c0f8b
Showing 47 changed files with 2,074 additions and 535 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -6,6 +6,9 @@ __pycache__/
.venv/
venv/

# Repositories
monitoring-dashboard-samples/

# Terraform
*.terraform/
*.terraform-*/
98 changes: 98 additions & 0 deletions use-cases/inferencing/benchmark/README.md
@@ -0,0 +1,98 @@
# Benchmarking with Locust

We can run an inference benchmark against our deployed model using Locust.
Locust is an open source performance and load testing tool for HTTP and other protocols.
Refer to the documentation to [set up](https://docs.locust.io/en/stable/installation.html) Locust locally or deploy it as a container on GKE.

## Pre-requisites

- A model is deployed using one of the vLLM guides
  - [Serving the model using vLLM and GCSFuse](/use-cases/inferencing/serving/vllm/gcsfuse/README.md)
  - [Serving the model using vLLM and Persistent Disk](/use-cases/inferencing/serving/vllm/persistent-disk/README.md)
- Metrics are being scraped from the vLLM server as shown in the [vLLM Metrics](/use-cases/inferencing/serving/vllm/metrics/README.md) guide.

## Preparation

- Clone the repository

```sh
git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \
cd accelerated-platforms
```

- Change directory to the guide directory

```sh
cd use-cases/inferencing/benchmark
```

- Ensure that your `MLP_ENVIRONMENT_FILE` is configured

```sh
cat ${MLP_ENVIRONMENT_FILE} && \
source ${MLP_ENVIRONMENT_FILE}
```

> You should see the various variables populated with the information specific to your environment.
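
Before continuing, it can help to confirm that the variables this guide relies on are actually set. The following is a minimal sketch, not part of the repository: `check_mlp_env` is a hypothetical helper name, and the variable list is simply the `MLP_*` names that appear in the commands of this guide.

```sh
# Hypothetical helper (not part of the repository): report which of the
# MLP_* variables used in this guide are unset or empty.
check_mlp_env() {
  missing=0
  for name in MLP_PROJECT_ID MLP_BUILD_GSA MLP_CLOUDBUILD_BUCKET \
              MLP_BENCHMARK_IMAGE MLP_SERVE_KSA; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_mlp_env` after sourcing the environment file returns a non-zero status and names each missing variable on stderr if anything is unset.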

## Build the image and run the benchmark job

- Build the container image using Cloud Build and push it to Artifact Registry.

```sh
cd src
sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml
gcloud beta builds submit \
--config cloudbuild.yaml \
--gcs-source-staging-dir gs://${MLP_CLOUDBUILD_BUCKET}/source \
--project ${MLP_PROJECT_ID} \
--substitutions _DESTINATION=${MLP_BENCHMARK_IMAGE}
cd -
```
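
The `sed` command above rewrites the empty `serviceAccount:` line of `cloudbuild.yaml` in place. To preview the result without modifying the file, the same expression can be run without `-i`. This is a sketch only; `preview_service_account` is a hypothetical helper name, and its three arguments (project ID, service account, file) are assumptions made for illustration.

```sh
# Hypothetical helper: print the file with the serviceAccount line filled in,
# leaving the file itself untouched (no -i flag).
# Usage: preview_service_account PROJECT_ID SERVICE_ACCOUNT FILE
preview_service_account() {
  sed -e "s|^serviceAccount:.*|serviceAccount: projects/$1/serviceAccounts/$2|" "$3"
}
```

For example, `preview_service_account "${MLP_PROJECT_ID}" "${MLP_BUILD_GSA}" cloudbuild.yaml` prints the file as Cloud Build would see it after the substitution.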

- Configure the environment

| Variable | Description | Example |
| --------------- | ------------------------------------------------------------------------------ | ------------ |
| MODEL_NAME | The name of the model folder in the root of the GCS model bucket | model-gemma2 |
| MODEL_VERSION | The name of the version folder inside the model folder of the GCS model bucket | experiment |
| SERVE_NAMESPACE | Namespace where the model will be served | ml-serve |

```sh
MODEL_NAME=model-gemma2
MODEL_VERSION=experiment
ENDPOINT="http://vllm-openai:8000/v1/chat/completions"
HOST="http://vllm-openai:8000/"
SERVE_NAMESPACE=ml-serve
```

```sh
BENCHMARK_MODEL_PATH=/local/${MODEL_NAME}/${MODEL_VERSION}
```

- Replace the placeholders in the Locust manifests and deploy them

```sh
sed -i -e "s|V_IMAGE_URL|${MLP_BENCHMARK_IMAGE}|" \
-i -e "s|V_KSA|${MLP_SERVE_KSA}|" \
-i -e "s|V_BENCHMARK_MODEL_PATH|${BENCHMARK_MODEL_PATH}|" \
-i -e "s|V_ENDPOINT|${ENDPOINT}|" \
-i -e "s|V_HOST|${HOST}|" \
manifests/locust-master-controller.yaml \
manifests/locust-master-service.yaml \
manifests/locust-worker-controller.yaml
```
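
Before applying the manifests, a quick check for leftover placeholders can catch a substitution that did not match. The following is a minimal sketch, assuming the `V_*` placeholder names used in the `sed` command above; `check_placeholders` is a hypothetical helper name and not part of the repository.

```sh
# Hypothetical helper: fail if any of the V_* placeholders from this guide
# is still present in the given files. Matching lines are printed by grep.
check_placeholders() {
  if grep -n -E 'V_(IMAGE_URL|KSA|BENCHMARK_MODEL_PATH|ENDPOINT|HOST)' "$@"; then
    echo "unreplaced placeholders found" >&2
    return 1
  fi
  return 0
}
```

Calling `check_placeholders manifests/locust-*.yaml` prints any line that still contains a placeholder and returns a non-zero status in that case. Note that it would also flag a substituted value that happens to contain one of these strings.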

```sh
kubectl --namespace ${SERVE_NAMESPACE} apply -f manifests
```

- Access the Locust dashboard and start swarming requests.

```sh
echo $MLP_LOCUST_NAMESPACE_ENDPOINT
```

> Note: The Locust service may take up to 5 minutes to load completely.

- Paste the Locust endpoint obtained above in a browser to open the Locust web interface.
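
Since the service may take several minutes to become reachable, polling can replace manual refreshing. The following is a minimal sketch of a generic retry loop; `wait_until` is a hypothetical helper name, not something provided by the repository or by Locust.

```sh
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_until 30 10 curl --silent --fail --output /dev/null "http://${MLP_LOCUST_NAMESPACE_ENDPOINT}"` tries up to 30 times with a 10 second pause, which covers roughly the 5 minutes mentioned in the note. That the endpoint is a host name without a scheme is an assumption; adjust the URL if your value already includes `http://`.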
69 changes: 69 additions & 0 deletions use-cases/inferencing/benchmark/manifests/benchmark.yaml
@@ -0,0 +1,69 @@
# Copyright 2024 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
apiVersion: batch/v1
kind: Job
metadata:
  name: benchmark
spec:
  backoffLimit: 10
  template:
    metadata:
      labels:
        app: benchmark-job
    spec:
      containers:
      - args:
        - '-c'
        - |
          ACTION=benchmark python3 /locustfile.py
        command: ["/bin/sh"]
        env:
        - name: ENDPOINT
          value: _ENDPOINT_
        - name: MODEL_ID
          value: _BENCHMARK_MODEL_PATH_
        - name: HOST
          value: _HOST_
        image: _IMAGE_URL_
        imagePullPolicy: Always
        name: job
        resources:
          limits:
            cpu: '2'
            memory: 5Gi
          requests:
            cpu: '2'
            memory: 5Gi
      restartPolicy: Never
      serviceAccountName: _KSA_
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    networking.gke.io/load-balancer-type: Internal
  labels:
    app: benchmark
  name: locust-web
spec:
  ports:
  - name: loc-master-web
    port: 8089
    protocol: TCP
    targetPort: loc-master-web
  selector:
    app: benchmark
  type: LoadBalancer
@@ -0,0 +1,54 @@
# Copyright 2024 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: locust-master
  name: locust-master
spec:
  replicas: 1
  selector:
    matchLabels:
      app: locust-master
  template:
    metadata:
      labels:
        app: locust-master
    spec:
      containers:
      - env:
        - name: ENDPOINT
          value: V_ENDPOINT
        - name: MODEL_ID
          value: V_BENCHMARK_MODEL_PATH
        - name: LOCUST_MODE
          value: master
        - name: TARGET_HOST
          value: V_HOST
        image: V_IMAGE_URL
        name: locust-master
        ports:
        - containerPort: 8089
          name: loc-master-web
          protocol: TCP
        - containerPort: 5557
          name: loc-master-p1
          protocol: TCP
        - containerPort: 5558
          name: loc-master-p2
          protocol: TCP
      serviceAccountName: V_KSA
@@ -0,0 +1,47 @@
# Copyright 2024 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: locust-master
  name: locust-master
spec:
  ports:
  - name: loc-master-p1
    port: 5557
    protocol: TCP
    targetPort: loc-master-p1
  - name: loc-master-p2
    port: 5558
    protocol: TCP
    targetPort: loc-master-p2
  selector:
    app: locust-master
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: locust-master
  name: locust-master-web-svc
spec:
  ports:
  - name: loc-master-web
    port: 8089
    protocol: TCP
    targetPort: loc-master-web
  selector:
    app: locust-master
@@ -0,0 +1,45 @@
# Copyright 2022 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: locust-worker
  name: locust-worker
spec:
  replicas: 5
  selector:
    matchLabels:
      app: locust-worker
  template:
    metadata:
      labels:
        app: locust-worker
    spec:
      containers:
      - env:
        - name: ENDPOINT
          value: V_ENDPOINT
        - name: MODEL_ID
          value: V_BENCHMARK_MODEL_PATH
        - name: LOCUST_MODE
          value: worker
        - name: LOCUST_MASTER
          value: locust-master
        - name: TARGET_HOST
          value: V_HOST
        image: V_IMAGE_URL
        name: locust-worker
      serviceAccountName: V_KSA
35 changes: 35 additions & 0 deletions use-cases/inferencing/benchmark/src/Dockerfile
@@ -0,0 +1,35 @@
# Copyright 2022 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# Start with a base image Python 3.9.12 Debian 11 (bullseye) slim
FROM python:3.9.12-slim-bullseye

# Add the licenses for third party software and libraries
ADD licenses /licenses

# Add the external tasks directory into /tasks
ADD locust-tasks /locust-tasks

# Install the required dependencies via pip
RUN pip install -r /locust-tasks/requirements.txt

# Expose the required Locust ports
EXPOSE 5557 5558 8089

# Set script to be executable
RUN chmod 755 /locust-tasks/run.sh

# Start Locust using LOCUS_OPTS environment variable
ENTRYPOINT ["/locust-tasks/run.sh"]
12 changes: 12 additions & 0 deletions use-cases/inferencing/benchmark/src/cloudbuild.yaml
@@ -0,0 +1,12 @@
images:
- ${_DESTINATION}
options:
  logging: CLOUD_LOGGING_ONLY
serviceAccount:
steps:
- name: 'gcr.io/cloud-builders/docker'
  args:
  - build
  - -t
  - ${_DESTINATION}
  - .