Redshift Smart Pause and Resume

Open source tool to automatically pause and resume Redshift (single and multi-node) clusters using AWS Lambda, CloudWatch Metrics and Events, Amazon Forecast and Step Functions.

About

Resuming and Pausing Redshift using Cluster CPU utilisation Metrics from CloudWatch

CPU utilisation data from an existing Redshift data warehouse is scraped from CloudWatch metrics. Specifically, the metric is the average CPU utilisation at 15-minute intervals by default (the interval is configurable; recommended values are 5, 15, 30 or 60 minutes). The scraped data is then used to train an Amazon Forecast model, and the resulting forecast predictions are used to determine when to pause and resume the Redshift cluster.
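As a rough illustration of the scraping step, the sketch below builds a CloudWatch query for average cluster CPU utilisation at a configurable interval. It is not the tool's actual code: the function names build_cpu_query and fetch_cpu_datapoints and the cluster id my-cluster are hypothetical, and the sketch assumes the standard AWS/Redshift CPUUtilization metric read through boto3's get_metric_statistics.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_query(cluster_id, start, end, interval_minutes=15):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/Redshift",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "StartTime": start,
        "EndTime": end,
        "Period": interval_minutes * 60,  # CloudWatch expects seconds
        "Statistics": ["Average"],
    }


def fetch_cpu_datapoints(cluster_id, start, end, interval_minutes=15):
    """Call CloudWatch and return the datapoints sorted by timestamp."""
    import boto3  # imported lazily so the query builder works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_cpu_query(cluster_id, start, end, interval_minutes)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# Hypothetical cluster id and a one-day window, for illustration only.
end = datetime(2021, 1, 2, tzinfo=timezone.utc)
query = build_cpu_query("my-cluster", end - timedelta(days=1), end)
print(query["Period"])  # prints 900 (15 minutes in seconds)
```

Only fetch_cpu_datapoints needs AWS credentials; the query builder can be checked locally.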

A threshold value is set and used to determine when to pause and resume a Redshift cluster. For example, with a threshold of 5% CPU utilisation, the Redshift cluster is scheduled to resume around the time CPU utilisation is forecast to rise above the threshold, and scheduled to pause around the time CPU utilisation is forecast to fall below it.

An example of this is showcased below. For this example, given the forecasted CPU utilisation values:

The Redshift cluster will be scheduled to resume at 7:45
The Redshift cluster will be scheduled to pause at 21:15
A buffer of 30 minutes is subtracted from the forecast resume timestamp and added to the forecast pause timestamp, giving the Redshift cluster ample time to resume and pause, respectively.
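The threshold and buffer logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: the function plan_pause_resume is hypothetical, it handles a single busy window per day, and it assumes the pause timestamp is the first forecast point at or below the threshold after the last point above it. The forecast values are made up to reproduce the times in the example.

```python
from datetime import datetime, timedelta


def plan_pause_resume(forecast, threshold=5.0, buffer_minutes=30):
    """Return (resume_at, pause_at) from a list of (timestamp, cpu_percent).

    resume_at: first timestamp forecast above the threshold, minus the buffer.
    pause_at: first timestamp after the last above-threshold point, plus the
    buffer, or None if the forecast ends while still above the threshold.
    """
    buffer = timedelta(minutes=buffer_minutes)
    busy = [i for i, (_, cpu) in enumerate(forecast) if cpu > threshold]
    if not busy:
        return None, None  # never forecast above the threshold
    resume_at = forecast[busy[0]][0] - buffer
    after = busy[-1] + 1
    pause_at = forecast[after][0] + buffer if after < len(forecast) else None
    return resume_at, pause_at


# Made-up forecast: one day at 15-minute intervals, busy from 8:15 to 20:30.
day = datetime(2021, 1, 1)
forecast = []
for step in range(96):
    ts = day + timedelta(minutes=15 * step)
    is_busy = day.replace(hour=8, minute=15) <= ts <= day.replace(hour=20, minute=30)
    forecast.append((ts, 20.0 if is_busy else 1.0))

resume_at, pause_at = plan_pause_resume(forecast)
print(resume_at.strftime("%H:%M"), pause_at.strftime("%H:%M"))  # prints 07:45 21:15
```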

AWS Serverless Architecture

In a nutshell, this tool consists of 2 step functions: (1) the Train Forecast Model Step Function and (2) the Generate Forecasts Step Function. Both step functions are executed using Lambda functions, and these Lambda functions are triggered by scheduled CloudWatch Events. Events are scheduled based on the timezone specified when deploying the tool.

Train Forecast Model Step Function

This step function consists of a number of steps that produce an Amazon Forecast model predicting Redshift CPU utilisation. It can be scheduled to run more frequently depending on how often the Redshift utilisation pattern changes. If AutoML is enabled, the most appropriate forecast model will be fitted to the provided dataset. By default, this step function is scheduled to run on the first day of each month at 9:00 (time is based on the provided timezone).

Generate Forecasts Step Function

This step function consists of a number of steps that produce Amazon Forecast predictions using the model trained by the Train Forecast Model Step Function. It is scheduled to run daily. Specifically, Redshift metrics data from the previous day is scraped and used alongside existing data to generate Redshift CPU utilisation forecasts for the following (current) day. The forecasts are then used to determine when to pause and resume the Redshift cluster. By default, this step function is scheduled to run every day at 12:05 AM, five minutes past midnight (time is based on the provided timezone).
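To make the "previous day" window concrete, here is a minimal sketch of how the scrape range could be computed for a run shortly after midnight. The helper previous_day_window is hypothetical and not taken from the tool. A fixed UTC+10 offset stands in for the configured timezone to keep the example dependency-free; a zoneinfo.ZoneInfo("Australia/Melbourne") object could be passed instead.

```python
from datetime import datetime, time, timedelta, timezone


def previous_day_window(now, tz):
    """Return (start, end) covering the local calendar day before `now`."""
    local_now = now.astimezone(tz)
    # Midnight at the start of the current local day.
    today_start = datetime.combine(local_now.date(), time.min, tzinfo=tz)
    return today_start - timedelta(days=1), today_start


# Stand-in for the deployment timezone (assumption: fixed UTC+10 offset).
aest = timezone(timedelta(hours=10))
run_time = datetime(2021, 6, 15, 0, 5, tzinfo=aest)  # daily run at 12:05 AM

start, end = previous_day_window(run_time, aest)
print(start.isoformat())  # prints 2021-06-14T00:00:00+10:00
print(end.isoformat())    # prints 2021-06-15T00:00:00+10:00
```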

Setup

Deploy

Install Serverless Framework

npm install serverless

Install AWS CLI

pip3 install awscli

Configure the AWS CLI (for example, with aws configure). Ensure that the configured user has the appropriate IAM permissions to create Lambda functions, S3 buckets, IAM roles, Step Functions, Amazon Forecast resources and CloudFormation stacks.

Install Redshift Smart Pause and Resume

serverless create --template-url https://github.com/servian/aws-redshift-smart-pause-and-resume --path aws-redshift-smart-pause-and-resume

Change to Redshift Smart Pause and Resume directory

cd aws-redshift-smart-pause-and-resume

Install Serverless Plugins

serverless plugin install --name serverless-python-requirements
serverless plugin install --name serverless-iam-roles-per-function
serverless plugin install --name serverless-pseudo-parameters
serverless plugin install --name serverless-local-schedule

Deploy the service to the AWS account. The option redshiftclusterid is required and must be specified when deploying the tool. (See Deployment Options below for details on the other options.)

NOTE: If deploying the tool to schedule another Redshift cluster, ensure that a different value is set for the option servicename. The value for this option is smart-sched by default; a possible alternative is smart-sched-01.

serverless deploy \
 [--region <AWS region>] \
 [--aws-profile <AWS CLI profile>] \
 --redshiftclusterid <AWS redshift cluster id> \
 [--stage <deployment environment>] \
 [--servicename <tool service/stack name>] # default value is smart-sched

After deploying the tool, run the following command to scrape the initial data and train the forecast model. (See Scraping Data and Training Amazon Forecast Model After Deployment below for details on the options.)

NOTE: If a value for the option servicename was specified when deploying the tool, and this value differs from the default value smart-sched, be sure to provide exactly the same value for cfnstackname.

python3 local_scrape_and_train.py \
[--awsprofile <value>] \
[--numdaystoscrape <value>] \
[--cfnstackname <value>] \
[--stage <value>]
# --cfnstackname must be consistent with servicename from the serverless deploy step
# --stage must be consistent with stage from the serverless deploy step

Update

Remove existing library
Install/recreate Redshift Smart Pause and Resume

serverless create --template-url https://github.com/servian/aws-redshift-smart-pause-and-resume --path aws-redshift-smart-pause-and-resume

Change to Redshift Smart Pause and Resume directory

cd aws-redshift-smart-pause-and-resume

Redeploy the service to the AWS account. The option redshiftclusterid is required and must be specified when deploying the tool. (See Deployment Options below for details on the other options.)

NOTE: If updating a deployment of the tool that schedules another Redshift cluster, ensure that the value set for the option servicename is consistent with the value used when the tool was first deployed. The value for this option is smart-sched by default; it would be smart-sched-01 if that was the value used when deploying the tool for the other Redshift cluster.

serverless deploy \
 [--region <AWS region>] \
 [--aws-profile <AWS CLI profile>] \
 --redshiftclusterid <AWS redshift cluster id> \
 [--stage <deployment environment>] \
 [--servicename <tool service/stack name>] # default value is smart-sched

Remove

Change to Redshift Smart Pause and Resume directory

cd aws-redshift-smart-pause-and-resume

Remove the service from the AWS account

serverless remove \
 [--region <AWS region>] \
 [--aws-profile <AWS CLI profile>] \
 [--redshiftclusterid <AWS redshift cluster id>] \
 [--stage <deployment environment>] \
 [--servicename <tool service/stack name>] # default value is smart-sched

Deployment Options

serverless deploy \
[--aws-profile <value>] \
[--region <value>] \
[--stage <value>] \
--redshiftclusterid <value> \
[--servicename <value>] \
[--enableautoml <value>] \
[--algorithmarn <value>] \
[--timezone <value>] \
[--intervalminutes <value>]

--aws-profile (string)

AWS profile used to deploy resources (default value: default)

--region (string)

AWS region to deploy resources to (default value: ap-southeast-2, the Sydney region)

--stage (string)

Deployment environment suffix (default value: dev)

--redshiftclusterid (string: REQUIRED)

Unique identifier of the Redshift cluster to enable smart scheduling for

--servicename (string)

Unique identifier for the tool stack (default value: smart-sched)

--enableautoml (string)

Whether AutoML is used to fit the forecast model. Possible values: ENABLED or DISABLED (default value: DISABLED)

--algorithmarn (string)

ARN of the Amazon Forecast algorithm to use (default value: arn:aws:forecast:::algorithm/ARIMA)

--timezone (string)

Timezone used to schedule events (default value: Australia/Melbourne)

--intervalminutes (int)

Granularity, in minutes, of the average Redshift CPU utilisation used throughout the stack (default value: 15)
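When deployments are scripted, the options above can be assembled programmatically. The sketch below is a hypothetical helper (build_deploy_command is not part of the tool) that only builds the command line from the documented option names and enforces the required redshiftclusterid; it does not run serverless itself. The cluster id my-cluster is a placeholder.

```python
import shlex

# Optional flags as documented in Deployment Options.
KNOWN_OPTIONS = {
    "aws-profile", "region", "stage", "servicename",
    "enableautoml", "algorithmarn", "timezone", "intervalminutes",
}


def build_deploy_command(redshiftclusterid, options=None):
    """Return the serverless deploy command as a list of arguments."""
    if not redshiftclusterid:
        raise ValueError("redshiftclusterid is required")
    command = ["serverless", "deploy", "--redshiftclusterid", redshiftclusterid]
    for name, value in (options or {}).items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"unknown option: {name}")
        command += [f"--{name}", str(value)]
    return command


command = build_deploy_command(
    "my-cluster", {"stage": "dev", "servicename": "smart-sched-01"}
)
print(shlex.join(command))
# prints: serverless deploy --redshiftclusterid my-cluster --stage dev --servicename smart-sched-01
```

The resulting list can be passed to subprocess.run if the command should actually be executed.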

Scraping Data and Training Amazon Forecast Model After Deployment

After the stack is deployed, the following script scrapes Redshift CPU utilisation data and uses it to train an initial Amazon Forecast model by executing the Train Forecast Model Step Function.

python3 local_scrape_and_train.py \
[--awsprofile <value>] \
[--numdaystoscrape <value>] \
[--cfnstackname <value>] \
[--stage <value>]
# --cfnstackname must be consistent with servicename from the serverless deploy step
# --stage must be consistent with stage from the serverless deploy step

--awsprofile (string)

AWS profile to use (default value: default)

--numdaystoscrape (string)

Number of days (counting back from the previous day) of Redshift CPU utilisation data to scrape (default value: 14)

--cfnstackname (string)

CloudFormation stack name (default value: smart-sched, which is the service name in the serverless.yml template)

--stage (string)

Environment suffix (default value: dev)
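For reference, the documented options and defaults map naturally onto an argparse parser. This is only a sketch of how such a command line could be parsed, not the contents of local_scrape_and_train.py; in particular, parsing numdaystoscrape as an integer is an assumption made here for convenience.

```python
import argparse


def parse_args(argv=None):
    """Parse the documented options, applying the documented defaults."""
    parser = argparse.ArgumentParser(
        description="Scrape Redshift CPU utilisation and train the forecast model"
    )
    parser.add_argument("--awsprofile", default="default")
    # Assumption: converted to int here; the option is documented as a string.
    parser.add_argument("--numdaystoscrape", type=int, default=14)
    parser.add_argument("--cfnstackname", default="smart-sched")
    parser.add_argument("--stage", default="dev")
    return parser.parse_args(argv)


# Example: a second deployment that used servicename smart-sched-01.
args = parse_args(["--cfnstackname", "smart-sched-01"])
print(args.cfnstackname, args.stage, args.numdaystoscrape)
# prints: smart-sched-01 dev 14
```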

References

Automating your Amazon Forecast Workflow with Lambda, Step Functions and Cloudwatch Events Rule
