From 8ba47b19fd224ccf63b52a8974ca22ebebde39e6 Mon Sep 17 00:00:00 2001
From: Ben Broderick Phillips
Date: Wed, 24 Aug 2022 00:51:01 -0400
Subject: [PATCH] Update stress readme to remove misleading sections

---
 tools/stress-cluster/chaos/README.md | 187 ++++++---------------------
 1 file changed, 38 insertions(+), 149 deletions(-)

diff --git a/tools/stress-cluster/chaos/README.md b/tools/stress-cluster/chaos/README.md
index 43db83fc6a8..8ec19195d0b 100644
--- a/tools/stress-cluster/chaos/README.md
+++ b/tools/stress-cluster/chaos/README.md
@@ -5,8 +5,7 @@ The chaos environment is an AKS cluster (Azure Kubernetes Service) with several
 # Table of Contents
 
   * [Installation](#installation)
-  * [Access](#access)
-  * [Quick Testing with no Dependencies](#quick-testing-with-no-dependencies)
+  * [Deploying a Stress Test](#deploying-a-stress-test)
   * [Creating a Stress Test](#creating-a-stress-test)
       * [Layout](#layout)
       * [Stress Test Metadata](#stress-test-metadata)
@@ -20,7 +19,6 @@ The chaos environment is an AKS cluster (Azure Kubernetes Service) with several
       * [Chaos Manifest](#chaos-manifest)
       * [Scenarios and values.yaml](#scenarios-and-valuesyaml)
       * [Node Size Requirements](#node-size-requirements)
-  * [Deploying a Stress Test](#deploying-a-stress-test)
   * [Configuring faults](#configuring-faults)
       * [Faults via Dashboard](#faults-via-dashboard)
       * [Faults via Config](#faults-via-config)
@@ -44,116 +42,80 @@ You will need the following tools to create and run tests:
 1. [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli)
 1. [Powershell Core](https://docs.microsoft.com/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1#ubuntu-2004) (if using Linux)
 
-## Access
-
-To access the cluster, run the following. These commands are unnecessary for stress test deployment but can be useful
-for verifying permissions and directly interacting with containers via the kubernetes command line tool `kubectl`. For
-running the build and deployment script, see [Deploying a Stress Test](#deploying-a-stress-test).
-
-```bash
-# Authenticate to Azure
-az login
-
-# Download the kubeconfig for the cluster (creates a 'context' named 'stress-pg')
-az aks get-credentials --subscription "Azure SDK Developer Playground" -g rg-stress-cluster-pg -n stress-pg
-```
-
-You should now be able to access the cluster. To verify, you should see a list of namespaces when running the command:
-
-```
-kubectl get namespaces
-```
-
-## Quick Testing with no Dependencies
-
-This section details how to deploy a simple job, without any dependencies on the cluster (e.g. azure credentials, app insights keys) or stress test scripts. It is used to illustrate how kubernetes and the tools work only. Stress test development should be done using the [deploy script](https://github.com/Azure/azure-sdk-tools/blob/main/eng/common/scripts/stress-testing/deploy-stress-tests.ps1).
+## Deploying a Stress Test
 
-To get started, you will need to create a container image containing your long-running test, and a manifest to execute that image as a [kubernetes job](https://kubernetes.io/docs/concepts/workloads/controllers/job/).
+The stress test deployment is best run via the [stress test deploy
+script](https://github.com/Azure/azure-sdk-tools/blob/main/eng/common/scripts/stress-testing/deploy-stress-tests.ps1).
+This script handles: cluster and container registry access, building the stress test helm package, installing helm
+package dependencies, and building and pushing docker images. The script must be run via powershell or powershell core.
 
-The Dockerfile for your image should contain your test code/artifacts. See [docs on how to create a Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
+If using bash or another linux terminal, a [powershell core](https://docs.microsoft.com/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1#ubuntu-2004) shell can be invoked via `pwsh`.
 
-To create any resources in the cluster, you will need to create a namespace for them to live in:
+The first invocation of the script must be run with the `-Login` flag to set up cluster and container registry access.
 
-```bash
-# For simplicity of tracking use your user alias as the name of your namespace.
-kubectl create namespace
 ```
+cd
 
-You will then need to build and push your container image to an Azure Container Registry the cluster has access to.
-
-Get the default container registry for the stress testing Kubernetes cluster:
-
-```bash
-az acr list -g rg-stress-cluster-pg --subscription "Azure SDK Developer Playground" --query "[0].loginServer"
-# Outputs:
+/eng/common/scripts/stress-testing/deploy-stress-tests.ps1 `
+    -Login `
+    -PushImages
 ```
 
-Login to the azure container registry. The below command will add a token to the registy to the local docker config. This must be refreshed daily.
+To re-deploy more quickly, the script can be run without `-Login` and/or without `-PushImages` (if no code changes were
+made).
 
 ```
-az acr login -n
+/eng/common/scripts/stress-testing/deploy-stress-tests.ps1
 ```
 
-Build and push development image to stress test cluster registry
+To run multiple instances of the same test in parallel, add a different namespace override
+for each test deployment. If not specified, it will default to the shell username when run locally.
 
-```bash
-docker build . -t "//:"
-docker push "//:"
+```
+/eng/common/scripts/stress-testing/deploy-stress-tests.ps1 `
+    -Namespace my-test-instance-2 `
 ```
 
-To define a job that utilizes your test, create a file called testjob.yaml, including the below contents (with fields replaced):
+You can check the progress/status of your installation via:
 
 ```
-apiVersion: batch/v1
-kind: Job
-metadata:
-  name:
-  namespace:
-spec:
-  template:
-    spec:
-      containers:
-        - name:
-          image:
-          imagePullPolicy: Always
-          command: ["test entrypoint command/binary"]
-          args: []
-      restartPolicy: Never
-  backoffLimit: 1
+helm list -n
 ```
 
-To submit your test job, run:
+To debug the kubernetes manifests installed by the stress test, run the following from the stress test directory:
 
 ```
-# Submit/re-submit the test
-kubectl replace --force -f testjob.yaml
+helm template .
 ```
 
-To view the status of your test:
+To stop and remove the test:
 
 ```
-kubectl get jobs -n
+helm uninstall -n
 ```
 
-If there are any errors (whether due to configuration or commands):
+To check the status of the stress test containers:
 
 ```
-kubectl describe pods -n -l job-name=
-```
+# List stress test pods
+kubectl get pods -n -l release=
 
-To view the logs from your test:
+# Get logs from the init-azure-deployer init container, if deploying resources. Omit `-c init-azure-deployer` to get main container
+logs.
+kubectl logs -n -c init-azure-deployer
 
-```
-# Append -f to tail the logs
-kubectl logs -n -l job-name=
+# If empty, there may have been startup failures
+kubectl describe pod -n
 ```
 
-To delete your test:
+If deploying resources, once the `init-azure-deployer` init container is completed and the stress test pod is in a `Running` state,
+you can quick check the local logs:
 
 ```
-kubectl delete -f testjob.yaml
+kubectl logs -n
 ```
 
+
 ## Creating a Stress Test
 
 This section details how to create a formal stress test which creates azure resource deployments and publishes telemetry.
@@ -500,79 +462,6 @@ az aks nodepool add \
     --labels "sku="
 ```
 
-## Deploying a Stress Test
-
-The stress test deployment is best run via the [stress test deploy
-script](https://github.com/Azure/azure-sdk-tools/blob/main/eng/common/scripts/stress-testing/deploy-stress-tests.ps1).
-This script handles: cluster and container registry access, building the stress test helm package, installing helm
-package dependencies, and building and pushing docker images. The script must be run via powershell or powershell core.
-
-If using bash or another linux terminal, a [powershell core](https://docs.microsoft.com/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1#ubuntu-2004) shell can be invoked via `pwsh`.
-
-The first invocation of the script must be run with the `-Login` flag to set up cluster and container registry access.
-
-```
-cd
-
-/eng/common/scripts/stress-testing/deploy-stress-tests.ps1 `
-    -Login `
-    -PushImages
-```
-
-To re-deploy more quickly, the script can be run without `-Login` and/or without `-PushImages` (if no code changes were
-made).
-
-```
-/eng/common/scripts/stress-testing/deploy-stress-tests.ps1
-```
-
-To run multiple instances of the same test in parallel, add a different namespace override
-for each test deployment. If not specified, it will default to the shell username when run locally.
-
-```
-/eng/common/scripts/stress-testing/deploy-stress-tests.ps1 `
-    -Namespace my-test-instance-2 `
-```
-
-You can check the progress/status of your installation via:
-
-```
-helm list -n
-```
-
-To debug the kubernetes manifests installed by the stress test, run the following from the stress test directory:
-
-```
-helm template .
-```
-
-To stop and remove the test:
-
-```
-helm uninstall -n
-```
-
-To check the status of the stress test job resources:
-
-```
-# List stress test pods
-kubectl get pods -n -l release=
-
-# Get logs from the init-azure-deployer init container, if deploying resources. Omit `-c init-azure-deployer` to get main container
-logs.
-kubectl logs -n -c init-azure-deployer
-
-# If empty, there may have been startup failures
-kubectl describe pod -n
-```
-
-If deploying resources, once the `init-azure-deployer` init container is completed and the stress test pod is in a `Running` state,
-you can quick check the local logs:
-
-```
-kubectl logs -n
-```
-
 ## Configuring faults
 
 Faults can be configured via kubernetes manifests or via the UI (which is a helper for building the manifests under the hood). For docs on the manifest schema, see [here](https://chaos-mesh.org/docs/define-chaos-experiment-scope/).
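
As an illustration of configuring a fault via a kubernetes manifest, the following is a minimal sketch of a Chaos Mesh `NetworkChaos` resource. It is not taken from this README: the resource name, namespace, label, and target host are hypothetical placeholders, and the field layout assumes the Chaos Mesh `v1alpha1` schema described in the docs linked above.

```yaml
# Sketch only: every name, label, and host below is a hypothetical placeholder.
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: example-network-loss          # hypothetical resource name
  namespace: my-test-namespace        # hypothetical namespace of the stress test
spec:
  action: loss                        # drop packets
  direction: to                       # traffic from the selected pods to the targets
  externalTargets:
    - "example.servicebus.windows.net"  # hypothetical endpoint to cut off
  mode: all                           # apply to every pod matched by the selector
  selector:
    namespaces:
      - my-test-namespace
    labelSelectors:
      chaos: "true"                   # hypothetical opt-in label on the test pod
  loss:
    loss: "100"                       # percentage of packets dropped
    correlation: "100"
  duration: "60s"                     # how long the fault stays active
```

The selector limits the fault to pods in one namespace that carry the opt-in label, so other tests running in the same cluster would not be affected under these assumptions.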