Set up the Autoscaler using Terraform configuration files
- Table of Contents
- Overview
- Architecture
- Before you begin
- Preparing the Autoscaler Project
- Deploying the Autoscaler
- Importing your Spanner instances
- Building and Deploying the Autoscaler Services
This directory contains Terraform configuration files to quickly set up the infrastructure for your Autoscaler for a deployment to Google Kubernetes Engine (GKE).
In this deployment option, all the components of the Autoscaler reside in the same project as your Spanner instances. A future enhancement may enable the autoscaler to operate cross-project when running in GKE.
This deployment is ideal for independent teams who want to self-manage the infrastructure and configuration of their own Autoscalers on Kubernetes.
-
Using a Kubernetes ConfigMap you define which Spanner instances you would like to be managed by the autoscaler. Currently these must be in the same project as the cluster that runs the autoscaler.
-
Using a Kubernetes CronJob, the autoscaler is configured to run on a schedule. By default this is every minute, though this is configurable.
-
When scheduled, an instance of the Poller is created as a Kubernetes Job.
-
The Poller queries the Cloud Monitoring API to retrieve the utilization metrics for each Spanner instance.
-
For each Spanner instance, the Poller makes a call to the Scaler via its API. The request payload contains the utilization metrics for the specific Spanner instance, and some of its corresponding configuration parameters.
-
Using the chosen scaling method, the Scaler compares the Spanner instance metrics against the recommended thresholds, plus or minus an allowed margin, and determines whether the instance should be scaled and, if so, the number of nodes or processing units that it should be scaled to.
-
The Scaler retrieves the time when the instance was last scaled from the state data stored in Cloud Firestore (or alternatively Spanner) and compares it with the current time.
-
If the configured cooldown period has passed, then the Scaler requests the Spanner Instance to scale out or in.
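The decision flow in the steps above can be sketched in a few lines of shell. This is only an illustration and not the Scaler's actual implementation: the function names, the example threshold of 65 with a margin of 5, and the 5-minute cooldown are all assumptions chosen for the example.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real Scaler is a separate service.
# All function names and numeric values here are hypothetical.

# Print "out", "in", or "none" depending on whether the metric falls
# outside the range threshold +/- margin (integer percentages).
scaling_direction() {
  local metric="$1" threshold="$2" margin="$3"
  if (( metric > threshold + margin )); then
    echo "out"
  elif (( metric < threshold - margin )); then
    echo "in"
  else
    echo "none"
  fi
}

# Succeed (exit status 0) when the cooldown has passed since the last
# scaling operation. Timestamps are Unix epoch seconds; cooldown in minutes.
cooldown_passed() {
  local last_scaled="$1" now="$2" cooldown_minutes="$3"
  (( now - last_scaled >= cooldown_minutes * 60 ))
}

# Example: 80% utilization against a 65% threshold with a margin of 5,
# last scaled 10 minutes ago, with a 5-minute cooldown.
now="$(date +%s)"
direction="$(scaling_direction 80 65 5)"
if [[ "${direction}" != "none" ]] && cooldown_passed "$(( now - 600 ))" "${now}" 5; then
  echo "scale ${direction}"   # prints: scale out
else
  echo "no change"
fi
```

The two checks are kept separate because the first depends only on metrics, while the second depends on the state data that records when the instance was last scaled.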
The GKE deployment has the following pros and cons:
Pros:
- Kubernetes-based: For teams that may not be able to use Google Cloud services such as Cloud Functions, this design enables the use of the autoscaler.
- Configuration: The control over scheduler parameters belongs to the team that owns the Spanner instance, therefore the team has the highest degree of freedom to adapt the Autoscaler to its needs.
- Infrastructure: This design establishes a clear boundary of responsibility and security over the Autoscaler infrastructure because the team owner of the Spanner instances is also the owner of the Autoscaler infrastructure.
Cons:
- Infrastructure: In contrast to the Cloud Functions design, some long-lived infrastructure and services are required.
- Maintenance: With each team being responsible for the Autoscaler configuration and infrastructure, it may become difficult to make sure that all Autoscalers across the company follow the same update guidelines.
- Audit: Because of the high level of control by each team, a centralized audit may become more complex.
In this section you prepare your environment.
-
Open the Cloud Console
-
Activate Cloud Shell
At the bottom of the Cloud Console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Cloud SDK already installed, including the gcloud command-line tool, and with values already set for your current project. It can take a few seconds for the session to initialize.
-
In Cloud Shell, clone this repository:
git clone https://github.com/cloudspannerecosystem/autoscaler.git
-
Export variables for the working directories:
export AUTOSCALER_ROOT="$(pwd)/autoscaler"
export AUTOSCALER_DIR=${AUTOSCALER_ROOT}/terraform/gke
In this section you prepare your project for deployment.
-
Go to the project selector page in the Cloud Console. Select or create a Cloud project.
-
Make sure that billing is enabled for your Google Cloud project. Learn how to confirm billing is enabled for your project.
-
In Cloud Shell, configure the environment with the ID of your autoscaler project:
export PROJECT_ID=<INSERT_YOUR_PROJECT_ID>
gcloud config set project ${PROJECT_ID}
-
Set the region where the Autoscaler resources will be created:
export REGION=us-central1
-
Enable the required Cloud APIs:
gcloud services enable iam.googleapis.com \
  artifactregistry.googleapis.com \
  cloudbuild.googleapis.com \
  cloudresourcemanager.googleapis.com \
  container.googleapis.com \
  spanner.googleapis.com
-
If you want to create a new Spanner instance for testing the Autoscaler, set the following variable. The Spanner instance that Terraform creates is named autoscale-test.
export TF_VAR_terraform_spanner_test=true
On the other hand, if you do not want to create a new Spanner instance because you already have an instance for the Autoscaler to monitor, set the name of your instance in the following variable:
export TF_VAR_spanner_name=<INSERT_YOUR_SPANNER_INSTANCE_NAME>
For more information on how to configure your Spanner instance to be managed by Terraform, see Importing your Spanner instances.
-
There are two options for deploying the state store for the Autoscaler:
For Firestore, follow the steps in Using Firestore for Autoscaler State. For Spanner, follow the steps in Using Spanner for Autoscaler state.
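Before moving on, it can help to confirm that the variables exported in the steps above are actually set in the current shell, since a new Cloud Shell session starts without them. The following is a minimal sketch; `require_vars` is a hypothetical helper and not part of the Autoscaler repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report any of the named variables that are unset or empty.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Usage, with the variables exported in the steps above:
require_vars AUTOSCALER_ROOT AUTOSCALER_DIR PROJECT_ID REGION \
  || echo "Set the variables listed above before continuing." >&2
```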
-
To use Firestore for the Autoscaler state, choose the App Engine Location where the Autoscaler infrastructure will be created, for example:
export APP_ENGINE_LOCATION=us-central
-
Enable the additional APIs:
gcloud services enable \
  appengine.googleapis.com \
  firestore.googleapis.com
-
Create a Google App Engine app to enable the API for Firestore:
gcloud app create --region="${APP_ENGINE_LOCATION}"
-
To store the state of the Autoscaler, update the database created with the Google App Engine app to use Firestore native mode.
gcloud firestore databases update --type=firestore-native
You will also need to make a minor modification to the Autoscaler configuration. The required steps to do this are later in these instructions.
-
Next, continue to Deploying the Autoscaler
-
If you want to store the state in Cloud Spanner and you do not yet have a Spanner instance for that purpose, set the following variable so that Terraform creates an instance for you named autoscale-test-state:
export TF_VAR_terraform_spanner_state=true
It is a best practice not to store the Autoscaler state in the same instance that is being monitored by the Autoscaler.
Optionally, you can change the name of the instance that Terraform will create:
export TF_VAR_spanner_state_name=<INSERT_STATE_SPANNER_INSTANCE_NAME>
If you already have a Spanner instance where state must be stored, only set the name of your instance:
export TF_VAR_spanner_state_name=<INSERT_YOUR_STATE_SPANNER_INSTANCE_NAME>
If you want to manage the state of the Autoscaler in your own Cloud Spanner instance, please create the following table in advance:
CREATE TABLE spannerAutoscaler (
  id STRING(MAX),
  lastScalingTimestamp TIMESTAMP,
  createdOn TIMESTAMP,
  updatedOn TIMESTAMP,
) PRIMARY KEY (id)
-
Next, continue to Deploying the Autoscaler
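If you manage the state table in your own instance, the DDL shown above could be applied with `gcloud spanner databases ddl update`. The following is a sketch under assumptions: `create_state_table` is a hypothetical helper, the instance and database names are placeholders supplied by you, and the database is assumed to exist already.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the instance and database names are passed in by the caller.
create_state_table() {
  local instance="$1" database="$2"
  # Same schema as the CREATE TABLE statement in the steps above.
  local ddl='CREATE TABLE spannerAutoscaler (
    id STRING(MAX),
    lastScalingTimestamp TIMESTAMP,
    createdOn TIMESTAMP,
    updatedOn TIMESTAMP,
  ) PRIMARY KEY (id)'
  gcloud spanner databases ddl update "${database}" \
    --instance="${instance}" \
    --ddl="${ddl}"
}

# Usage (placeholders, not values defined by this guide):
# create_state_table "<YOUR_STATE_SPANNER_INSTANCE_NAME>" "<YOUR_STATE_DATABASE_NAME>"
```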
-
Set the project ID and region in the corresponding Terraform environment variables:
export TF_VAR_project_id=${PROJECT_ID}
export TF_VAR_region=${REGION}
-
Change directory into the Terraform per-project directory and initialize it:
cd ${AUTOSCALER_DIR}
terraform init
-
Create the Autoscaler infrastructure:
terraform plan -out=terraform.tfplan
terraform apply -auto-approve terraform.tfplan
If you are running this command in Cloud Shell and encounter errors of the form "Error: cannot assign requested address", this is a known issue in the Terraform Google provider. Please retry with -parallelism=1.
Next, continue to Building and Deploying the Autoscaler Services.
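The plan, apply, and retry sequence described above could be wrapped as follows. This is a sketch, not part of the repository: `apply_with_fallback` is a hypothetical name, and for simplicity it retries after any failed apply rather than only after the "cannot assign requested address" error.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the plan and apply commands in the steps above.
apply_with_fallback() {
  terraform plan -out=terraform.tfplan || return 1
  if terraform apply -auto-approve terraform.tfplan; then
    return 0
  fi
  echo "apply failed; retrying with -parallelism=1" >&2
  # A saved plan can be stale after a partially applied run, so plan again first.
  terraform plan -out=terraform.tfplan || return 1
  terraform apply -auto-approve -parallelism=1 terraform.tfplan
}

# Usage, from the Terraform directory after `terraform init`:
# apply_with_fallback
```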
If you have existing Spanner instances that you want to import to be managed by Terraform, follow the instructions in this section.
-
List your Spanner instances:
gcloud spanner instances list
-
Set the following variable to the name of the instance to import:
SPANNER_INSTANCE_NAME=<YOUR_SPANNER_INSTANCE_NAME>
-
Create a Terraform config file with an empty google_spanner_instance resource:
echo "resource \"google_spanner_instance\" \"${SPANNER_INSTANCE_NAME}\" {}" > "${SPANNER_INSTANCE_NAME}.tf"
-
Import the Spanner instance into the Terraform state.
terraform import "google_spanner_instance.${SPANNER_INSTANCE_NAME}" "${SPANNER_INSTANCE_NAME}"
-
After the import succeeds, update the Terraform config file for your instance with the actual instance attributes:
terraform state show -no-color "google_spanner_instance.${SPANNER_INSTANCE_NAME}" \
  | grep -vE "(id|num_nodes|state|timeouts).*(=|\{)" \
  > "${SPANNER_INSTANCE_NAME}.tf"
If you have additional Spanner instances to import, repeat this process.
Importing Spanner databases is also possible using the google_spanner_database resource and following a similar process.
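When several instances need to be imported, the steps above can be collected into one function and called once per instance. This is a sketch that reuses the commands shown in this section; `import_spanner_instance` and the instance names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper that repeats the import steps above for one instance.
import_spanner_instance() {
  local name="$1"
  # Start from an empty resource block so that terraform import has a target.
  echo "resource \"google_spanner_instance\" \"${name}\" {}" > "${name}.tf"
  terraform import "google_spanner_instance.${name}" "${name}" || return 1
  # Replace the empty block with the imported attributes, filtering out
  # the same attributes as the command in the steps above.
  terraform state show -no-color "google_spanner_instance.${name}" \
    | grep -vE "(id|num_nodes|state|timeouts).*(=|\{)" \
    > "${name}.tf"
}

# Usage (example names only):
# for instance in instance-a instance-b; do
#   import_spanner_instance "${instance}" || break
# done
```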
-
To build the Autoscaler images and push them to Artifact Registry, run the following commands:
cd ${AUTOSCALER_ROOT} && \
gcloud builds submit poller --config=poller/cloudbuild.yaml --region=${REGION} && \
gcloud builds submit scaler --config=scaler/cloudbuild.yaml --region=${REGION}
-
Construct the paths to the images:
POLLER_PATH="${REGION}-docker.pkg.dev/${PROJECT_ID}/spanner-autoscaler/poller"
SCALER_PATH="${REGION}-docker.pkg.dev/${PROJECT_ID}/spanner-autoscaler/scaler"
-
Retrieve the SHA256 hashes of the images:
POLLER_SHA=$(gcloud artifacts docker images describe ${POLLER_PATH}:latest --format='value(image_summary.digest)')
SCALER_SHA=$(gcloud artifacts docker images describe ${SCALER_PATH}:latest --format='value(image_summary.digest)')
-
Construct the full paths to the images, including the SHA256 hashes:
POLLER_IMAGE="${POLLER_PATH}@${POLLER_SHA}"
SCALER_IMAGE="${SCALER_PATH}@${SCALER_SHA}"
-
Retrieve the credentials for the cluster where the Autoscaler will be deployed:
gcloud container clusters get-credentials spanner-autoscaler --region=${REGION}
-
Next, to configure the Kubernetes manifests and deploy the Autoscaler to the cluster, run the following commands:
cd ${AUTOSCALER_ROOT}/kubernetes && \
kpt fn eval --image gcr.io/kpt-fn/apply-setters:v0.1.1 autoscaler-pkg -- poller_image=${POLLER_IMAGE} scaler_image=${SCALER_IMAGE} && \
kubectl apply -f autoscaler-pkg/ --recursive
The sample configuration creates two schedules to demonstrate autoscaling; a frequently running schedule to dynamically scale the Spanner instance according to utilization, and an hourly schedule to directly scale the Spanner instance every hour.
-
To prepare to configure the Autoscaler, run the following command:
for template in autoscaler-config/*.template ; do envsubst < "${template}" > "${template%.*}" ; done
-
Next, to see how the Autoscaler is configured, run the following command to output the example configuration:
cat autoscaler-config/autoscaler-config*.yaml
These two files configure each instance of the autoscaler that you scheduled in the previous step. Notice the environment variable AUTOSCALER_CONFIG. You can use this variable to reference a configuration that will be used by that individual instance of the autoscaler. This means that you can configure multiple scaling schedules across multiple Spanner instances.
If you do not supply this value, a default of autoscaler-config.yaml will be used.
You can autoscale multiple Spanner instances on a single schedule by including multiple YAML stanzas in any of the scheduled configurations. For the schema of the configuration, see the Poller configuration section.
-
If you have chosen to use Firestore to hold the Autoscaler state as described above, edit the above files, and remove the following lines:
stateDatabase:
  name: spanner
  instanceId: autoscale-test
  databaseId: spanner-autoscaler-state
Note: If you do not remove these lines, the Autoscaler will attempt to use the above non-existent Spanner database for its state store, which will result in the Poller component failing to start. Please see the Troubleshooting section for more details.
If you have chosen to use your own Spanner instance, please edit the above configuration files accordingly.
-
To configure the Autoscaler and begin scaling operations, run the following command:
kubectl apply -f autoscaler-config/
-
Any changes made to the configuration files and applied with kubectl apply will update the Autoscaler configuration.
-
You can view logs for the Autoscaler components via kubectl or the Cloud Logging interface in the Google Cloud console.
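Earlier in this section, the image references are built from digests returned by gcloud. If a build failed or the `latest` tag is missing, the digest variable is empty and the resulting reference is invalid, so a small guard before the kpt step can catch this early. The sketch below assumes the digest has the usual `sha256:` prefix followed by 64 hexadecimal characters; `image_ref` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print PATH@DIGEST only if DIGEST looks like a SHA256 digest.
image_ref() {
  local path="$1" digest="$2"
  if [[ ! "${digest}" =~ ^sha256:[0-9a-f]{64}$ ]]; then
    echo "unexpected digest for ${path}: '${digest}'" >&2
    return 1
  fi
  echo "${path}@${digest}"
}

# Usage with the variables from the steps above:
# POLLER_IMAGE="$(image_ref "${POLLER_PATH}" "${POLLER_SHA}")" || exit 1
# SCALER_IMAGE="$(image_ref "${SCALER_PATH}" "${SCALER_SHA}")" || exit 1
```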
This section contains guidance on what to do if you encounter issues when following the instructions above.
- Check there are no Organizational Policy rules that may conflict with cluster creation.
-
The first step if you are encountering scaling issues is to check the logs for the Autoscaler in Cloud Logging. To retrieve the logs for the Poller and Scaler components, use the following query:
resource.type="k8s_container"
resource.labels.namespace_name="spanner-autoscaler"
resource.labels.container_name="poller" OR resource.labels.container_name="scaler"
If you do not see any log entries, check that you have selected the correct time period to display in the Cloud Logging console, and that the GKE cluster nodes have the correct permissions to write logs to the Cloud Logging API (roles/logging.logWriter).
-
If you have chosen to use Firestore for Autoscaler state and you see the following error in the logs:
Error: 5 NOT_FOUND: Database not found: projects/<YOUR_PROJECT>/instances/autoscale-test/databases/spanner-autoscaler-state
Edit the file ${AUTOSCALER_ROOT}/autoscaler-config/autoscaler-config.yaml and remove the following stanza:
stateDatabase:
  name: spanner
  instanceId: autoscale-test
  databaseId: spanner-autoscaler-state
-
Check the formatting of the YAML configuration file:
cat ${AUTOSCALER_ROOT}/autoscaler-config/autoscaler-config.yaml
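The Poller and Scaler log query from the first troubleshooting step can also be run from Cloud Shell with `gcloud logging read`. This is a sketch: `autoscaler_logs` is a hypothetical helper, the default freshness of one hour and limit of 50 entries are arbitrary choices, and the filter is the query above written with explicit AND and parentheses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read recent Poller and Scaler log entries with gcloud.
# Arguments (both optional): freshness window (default 1h), entry limit (default 50).
autoscaler_logs() {
  local freshness="${1:-1h}" limit="${2:-50}"
  local filter='resource.type="k8s_container"'
  filter+=' AND resource.labels.namespace_name="spanner-autoscaler"'
  filter+=' AND (resource.labels.container_name="poller"'
  filter+=' OR resource.labels.container_name="scaler")'
  gcloud logging read "${filter}" --freshness="${freshness}" --limit="${limit}"
}

# Usage: show up to 20 entries from the last 30 minutes.
# autoscaler_logs 30m 20
```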