-
Notifications
You must be signed in to change notification settings - Fork 8
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Restructured and standardized READMEs
- Loading branch information
Showing
47 changed files
with
2,074 additions
and
535 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
# Benchmarking with Locust | ||
|
||
We can run inference benchmark on our deployed model using locust. | ||
Locust is an open source performance/load testing tool for HTTP and other protocols. | ||
Refer to the documentation to [set up](https://docs.locust.io/en/stable/installation.html) locust locally or deploy as a container on GKE. | ||
|
||
## Pre-requisites | ||
|
||
- A model is deployed using one of the vLLM guides | ||
- [Serving the mode using vLLM and GCSFuse](/use-cases/inferencing/serving/vllm/gcsfuse/README.md) | ||
- [Serving the mode using vLLM and Persistent Disk](/use-cases/inferencing/serving/vllm/persistent-disk/README.md) | ||
- Metrics are being scraped from the vLLM server ss shown in the [vLLM Metrics](/use-cases/inferencing/serving/vllm/metrics/README.md) guide. | ||
|
||
## Preparation | ||
|
||
- Clone the repository | ||
|
||
```sh | ||
git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ | ||
cd accelerated-platforms | ||
``` | ||
|
||
- Change directory to the guide directory | ||
|
||
```sh | ||
cd use-cases/inferencing/benchmark | ||
``` | ||
|
||
- Ensure that your `MLP_ENVIRONMENT_FILE` is configured | ||
|
||
```sh | ||
cat ${MLP_ENVIRONMENT_FILE} && \ | ||
source ${MLP_ENVIRONMENT_FILE} | ||
``` | ||
|
||
> You should see the various variables populated with the information specific to your environment. | ||
#### Build the image of the source and execute bencmark job | ||
|
||
- Build container image using Cloud Build and push the image to Artifact Registry. | ||
|
||
```sh | ||
cd src | ||
sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml | ||
gcloud beta builds submit \ | ||
--config cloudbuild.yaml \ | ||
--gcs-source-staging-dir gs://${MLP_CLOUDBUILD_BUCKET}/source \ | ||
--project ${MLP_PROJECT_ID} \ | ||
--substitutions _DESTINATION=${MLP_BENCHMARK_IMAGE} | ||
cd - | ||
``` | ||
|
||
- Configure the environment | ||
|
||
| Variable | Description | Example | | ||
| --------------- | ------------------------------------------------------------------------------ | ------------ | | ||
| MODEL_NAME | The name of the model folder in the root of the GCS model bucket | model-gemma2 | | ||
| MODEL_VERSION | The name of the version folder inside the model folder of the GCS model bucket | experiment | | ||
| SERVE_NAMESPACE | Namespace where the model will be served | ml-serve | | ||
|
||
```sh | ||
MODEL_NAME=model-gemma2 | ||
MODEL_VERSION=experiment | ||
ENDPOINT="http://vllm-openai:8000/v1/chat/completions" | ||
HOST="http://vllm-openai:8000/" | ||
SERVE_NAMESPACE=ml-serve | ||
``` | ||
|
||
```sh | ||
BENCHMARK_MODEL_PATH=/local/${MODEL_ID}/${MODEL_PATH} | ||
``` | ||
|
||
- Replace variables in inference job manifest and deploy the job | ||
|
||
```sh | ||
sed -i -e "s|V_IMAGE_URL|${MLP_BENCHMARK_IMAGE}|" \ | ||
-i -e "s|V_KSA|${MLP_SERVE_KSA}|" \ | ||
-i -e "s|V_BENCHMARK_MODEL_PATH|${BENCHMARK_MODEL_PATH}|" \ | ||
-i -e "s|V_ENDPOINT|${ENDPOINT}|" \ | ||
-i -e "s|V_HOST|${HOST}|" \ | ||
manifests/locust-master-controller.yaml \ | ||
manifests/locust-master-service.yaml \ | ||
manifests/locust-worker-controller.yaml | ||
``` | ||
|
||
``` | ||
kubectl --namespace ${SERVE_NAMESPACE} apply -f manifests | ||
``` | ||
|
||
- Access the locust dashboard and launch swarming requests. | ||
|
||
```sh | ||
echo $MLP_LOCUST_NAMESPACE_ENDPOINT | ||
``` | ||
|
||
> Note : Locust service make take up to 5 minutes to load completely. | ||
- Paste the locust endpoint obtained above in a browser to open the chat interface to your deployed model. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# Copyright 2024 Google Inc. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
--- | ||
apiVersion: batch/v1 | ||
kind: Deployment | ||
metadata: | ||
name: benchmark | ||
spec: | ||
backoffLimit: 10 | ||
template: | ||
metadata: | ||
labels: | ||
app: benchmark-job | ||
spec: | ||
containers: | ||
- args: | ||
- '-c' | ||
- | | ||
ACTION=benchmark python3 /locustfile.py | ||
command: ["/bin/sh"] | ||
env: | ||
- name: ENDPOINT | ||
value: _ENDPOINT_ | ||
- name: MODEL_ID | ||
value: _BENCHMARK_MODEL_PATH_ | ||
- name: HOST | ||
value: _HOST_ | ||
image: _IMAGE_URL_ | ||
imagePullPolicy: Always | ||
name: job | ||
resources: | ||
limits: | ||
cpu: '2' | ||
memory: 5Gi | ||
requests: | ||
cpu: '2' | ||
memory: 5Gi | ||
restartPolicy: Never | ||
serviceAccountName: _KSA_ | ||
--- | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
annotations: | ||
networking.gke.io/load-balancer-type: Internal | ||
labels: | ||
app: benchmark | ||
name: locust-web | ||
spec: | ||
ports: | ||
- name: loc-master-web | ||
port: 8089 | ||
protocol: TCP | ||
targetPort: loc-master-web | ||
selector: | ||
app: benchmark | ||
type: LoadBalancer |
54 changes: 54 additions & 0 deletions
54
use-cases/inferencing/benchmark/manifests/locust-master-controller.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# Copyright 2024 Google Inc. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
labels: | ||
name: locust-master | ||
name: locust-master | ||
spec: | ||
replicas: 1 | ||
selector: | ||
matchLabels: | ||
app: locust-master | ||
template: | ||
metadata: | ||
labels: | ||
app: locust-master | ||
spec: | ||
containers: | ||
- env: | ||
- name: ENDPOINT | ||
value: V _ENDPOINT | ||
- name: MODEL_ID | ||
value: V_BENCHMARK_MODEL_PATH | ||
- name: LOCUST_MODE | ||
value: master | ||
- name: TARGET_HOST | ||
value: V_HOST | ||
image: V_IMAGE_URL | ||
name: locust-master | ||
ports: | ||
- containerPort: 8089 | ||
name: loc-master-web | ||
protocol: TCP | ||
- containerPort: 5557 | ||
name: loc-master-p1 | ||
protocol: TCP | ||
- containerPort: 5558 | ||
name: loc-master-p2 | ||
protocol: TCP | ||
serviceAccountName: V_KSA |
47 changes: 47 additions & 0 deletions
47
use-cases/inferencing/benchmark/manifests/locust-master-service.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# Copyright 2024 Google Inc. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
labels: | ||
app: locust-master | ||
name: locust-master | ||
spec: | ||
ports: | ||
- name: loc-master-p1 | ||
port: 5557 | ||
protocol: TCP | ||
targetPort: loc-master-p1 | ||
- name: loc-master-p2 | ||
port: 5558 | ||
protocol: TCP | ||
targetPort: loc-master-p2 | ||
selector: | ||
app: locust-master | ||
--- | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
labels: | ||
app: locust-master | ||
name: locust-master-web-svc | ||
spec: | ||
ports: | ||
- name: loc-master-web | ||
port: 8089 | ||
protocol: TCP | ||
targetPort: loc-master-web | ||
selector: | ||
app: locust-master |
45 changes: 45 additions & 0 deletions
45
use-cases/inferencing/benchmark/manifests/locust-worker-controller.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
# Copyright 2022 Google Inc. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
labels: | ||
name: locust-worker | ||
name: locust-worker | ||
spec: | ||
replicas: 5 | ||
selector: | ||
matchLabels: | ||
app: locust-worker | ||
template: | ||
metadata: | ||
labels: | ||
app: locust-worker | ||
spec: | ||
containers: | ||
- env: | ||
- name: ENDPOINT | ||
value: V_ENDPOINT | ||
- name: MODEL_ID | ||
value: V_BENCHMARK_MODEL_PATH | ||
- name: LOCUST_MODE | ||
value: worker | ||
- name: LOCUST_MASTER | ||
value: locust-master | ||
- name: TARGET_HOST | ||
value: V_HOST | ||
image: V_IMAGE_URL | ||
name: locust-worker | ||
serviceAccountName: V_KSA |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Copyright 2022 Google Inc. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
# Start with a base image Python 3.9.12 Debian 11 (bullseye) slim | ||
FROM python:3.9.12-slim-bullseye | ||
|
||
# Add the licenses for third party software and libraries | ||
ADD licenses /licenses | ||
|
||
# Add the external tasks directory into /tasks | ||
ADD locust-tasks /locust-tasks | ||
|
||
# Install the required dependencies via pip | ||
RUN pip install -r /locust-tasks/requirements.txt | ||
|
||
# Expose the required Locust ports | ||
EXPOSE 5557 5558 8089 | ||
|
||
# Set script to be executable | ||
RUN chmod 755 /locust-tasks/run.sh | ||
|
||
# Start Locust using LOCUS_OPTS environment variable | ||
ENTRYPOINT ["/locust-tasks/run.sh"] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
images: | ||
- ${_DESTINATION} | ||
options: | ||
logging: CLOUD_LOGGING_ONLY | ||
serviceAccount: | ||
steps: | ||
- name: 'gcr.io/cloud-builders/docker' | ||
args: | ||
- build | ||
- -t | ||
- ${_DESTINATION} | ||
- . |
Oops, something went wrong.