This directory contains the infrastructure-as-code CDK solution for deploying an OpenSearch Domain as well as the infrastructure for the Migration solution. Users have the ability to easily deploy their infrastructure using default values or provide configuration options for a more customized setup. The goal of this repo is not to become a one-size-fits-all solution for users- rather, this code base should be viewed as a starting point for users to use and add to individually as their custom use case requires.
There are certain requirements for imported and created VPCs used in this solution that come from its dependencies on other AWS services such as MSK(Kafka), OpenSearch Service, and ECS. These requirements should be met before using an imported VPC with this solution.
- At least 2 private subnets should be available which span at least 2 Availability Zones
Docker is used by CDK to build container images. If not installed, follow the steps here to set up. Later versions are recommended.
Git is used by the opensearch-migrations repo to fetch associated repositories. Steps to set up can be found here.
Java is used by the opensearch-migrations repo and Gradle, its associated build tool. The current required version is Java 11.
1- It is necessary to run npm install
within this current directory to install required packages that this app and CDK need for operation.
2- Creating Dockerfiles, this project needs to build the required Dockerfiles that the CDK will use in its services. These Dockerfiles can be built by running the below script which is located in the same directory as this README
./buildDockerImages.sh
More details can be found here
3- Configure the desired AWS credentials, as these will dictate the region and account used for deployment.
4- There is a known issue where service linked roles fail to get applied when deploying certain AWS services for the first time in an account. This can be resolved by simply deploying again (for each failing role) or avoided entirely by creating the service linked role initially like seen below:
aws iam create-service-linked-role --aws-service-name opensearchservice.amazonaws.com; aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com
If this is your first experience with CDK, follow the steps below to get started:
1- Install the CDK CLI tool, if you haven't already, by running:
npm install -g aws-cdk
2- Bootstrap CDK: if you have not run CDK previously in the configured region of you account, it is necessary to run the following command from the same directory as this README to set up a small CloudFormation stack of resources that CDK needs to function within your account
# Execute from the deployment/cdk/opensearch-service-migration directory
cdk bootstrap --c contextId=demo-deploy
Further CDK documentation here
Update the file named cdk.context.json
in this directory to select migration options for metadata, historical backfill or traffic capture and replay, see details here.
This project uses CDK context parameters to configure deployments. These context values will dictate the composition of your stacks as well as which stacks get deployed.
The full list of available configuration options for this project are listed here. Each option can be provided as an empty string ""
or simply not included, and in each of these 'empty' cases the option will use the project default value (if it exists) or CloudFormation's default value.
Depending on your use-case, you may choose to provide options from both the cdk.context.json
and the CDK CLI, in which case it is important to know the precedence level for context values. The below order shows these levels with values being passed by the CDK CLI having the most importance
- CDK CLI passed context values, e.g. --c stage=dev2 (highest precedence)
- Created
cdk.context.json
in the same directory as this README - Existing
default-values.json
in the same directory as this README
A set of demo context values (using the demo-deploy
label) has been set in the cdk.context.json
located in this directory, which can be customized or used as is for a quickstart demo solution.
This demo solution can be deployed with the following command:
cdk deploy "*" --c contextId=demo-deploy --require-approval never --concurrency 3
Additionally, another context block in the cdk.context.json
could be created with say the label uat-deploy
with its own custom context configuration and be deployed with the command:
cdk deploy "*" --c contextId=uat-deploy --require-approval never --concurrency 3
Note: Separate deployments within the same account and region should use unique stage
context values to avoid resource naming conflicts when deploying (Except in the multiple replay scenario stated here)
Stacks can also be redeployed individually, with any required stacks also being deployed initially, e.g. the following command would deploy the migration-console stack
cdk deploy "migration-console" --c contextId=demo-deploy
To get a list of all the available stack ids that can be deployed/redeployed for a particular contextId
:
cdk ls --c contextId=demo-deploy
See the wiki for steps on how to use this tooling to perform different migrations.
The Migration Console is a deployed ECS service container in this solution that should be accessed for managing/executing different phases of a migration
To open a shell on the Migration Console container execute the below command. If the SSM Session Manager plugin is not installed, it should be installed when prompted from the below exec command.
# ./accessContainer.sh <service-name> <stage> <region>
./accessContainer.sh migration-console dev us-east-1
To be able to execute this command the user will need to have their AWS credentials configured for their environment. It is also worth noting that the identity used should have permissions to ecs:ExecuteCommand
on both the ECS cluster and the given container task, for which an Administrator or Deployment role may already have. An example IAM policy that could be added, which allows accessing any of the containers on the ECS cluster, can be seen below:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "ecs:ExecuteCommand",
"Resource": [
"arn:aws:ecs:<REGION>:<ACCOUNT-ID>:cluster/migration-<STAGE>-ecs-cluster",
"arn:aws:ecs:<REGION>:<ACCOUNT-ID>:task/migration-<STAGE>-ecs-cluster/*"
]
}
]
}
To remove all the CDK stack(s) which get created during a deployment we can execute a command similar to below
# For demo context
cdk destroy "*" --c contextId=demo-deploy
Or to remove an individual stack from the deployment we can execute
# For demo context
cdk destroy migration-console --c contextId=demo-deploy
Note: The demo-deploy
contextId has the retention policy for the OpenSearch Domain set to DESTROY
, which will remove this resource and all its data when the stack is deleted. In order to retain the Domain on stack deletion the domainRemovalPolicy
would need to be set to RETAIN
.
The project supports running distinct Replayers in parallel, with each Replayer sending traffic to a different target cluster. This functionality allows users to test replaying captured traffic to multiple different target clusters in parallel. Users are able to provide the desired configuration options to spin up a new OpenSearch Domain and Traffic Replayer while using the existing Migration infrastructure that has already been deployed.
To give an example of this process, a user could decide to configure an additional Replayer and Domain for the demo setup in the cdk.context.json
by configuring a new context block like below. Note: addOnMigrationDeployId
is a required field to allow proper naming of these additional resources.
"demo-addon1": {
"addOnMigrationDeployId": "demo-addon1",
"stage": "dev",
"engineVersion": "OS_1.3",
"domainName": "demo-cluster-1-3",
"dataNodeCount": 2,
"openAccessPolicyEnabled": true,
"domainRemovalPolicy": "DESTROY",
"enableDemoAdmin": true,
"trafficReplayerServiceEnabled": true
}
And then deploy this additional infrastructure with the command:
cdk deploy "*" --c contextId=demo-addon1 --require-approval never --concurrency 3
Finally, the additional infrastructure can be removed with:
cdk destroy "*" --c contextId=demo-addon1
The Replayer documentation here explains the reasoning the Replayer uses to determine what auth header it should use when replaying requests to the target cluster.
As it relates to this CDK, the targetCluster
configuration option (specifically the auth
element) that a user provides will dictate which auth the Migration tools will use for communicating with the target cluster
Problem:
ERROR: failed to solve: public.ecr.aws/sam/build-nodejs18.x: pulling from host public.ecr.aws failed with status code [manifests latest]: 403 Forbidden
Resolution:
Follow the warning steps here: https://docs.aws.amazon.com/AmazonECR/latest/public/public-registries.html#public-registry-concepts
If you've previously authenticated to Amazon ECR Public, if your auth token has expired you may receive an authentication error when attempting to do unauthenticated docker pulls from Amazon ECR Public.
To resolve this issue, it may be necessary to run docker logout public.ecr.aws to avoid the error. This will result in an unauthenticated pull.
Problem:
error TS2345: Argument of type '{ clusterName: string; vpc: IVpc; }' is not assignable to parameter of type 'ClusterAttributes'.
Resolution:
An older version of the CDK CLI is being used which required additional ClusterAttributes. This error should resolve upon updating the CDK CLI
npm install -g aws-cdk@latest
npm run build
compile typescript to jsnpm run watch
watch for changes and compilenpm run test
perform the jest unit testscdk ls
list all stacks in the appcdk deploy "*"
deploy all stacks to your default AWS account/regioncdk diff
compare deployed stack with current statecdk synth
emits the synthesized CloudFormation template