Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Documentation for Testing E2E Resource Changes #709

Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions .github/workflows/appsignals-e2e-ec2-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,11 +21,12 @@ env:
# It is not redundant
AWS_DEFAULT_REGION: ${{ inputs.aws-region }}
TEST_ACCOUNT: ${{ secrets.APP_SIGNALS_E2E_TEST_ACC }}
SAMPLE_APP_FRONTEND_SERVICE_JAR: "s3://aws-appsignals-sample-app/main-service.jar"
SAMPLE_APP_REMOTE_SERVICE_JAR: "s3://aws-appsignals-sample-app/remote-service.jar"
SAMPLE_APP_FRONTEND_SERVICE_JAR: ${{ secrets.APP_SIGNALS_E2E_FE_SA_JAR }}
SAMPLE_APP_REMOTE_SERVICE_JAR: ${{ secrets.APP_SIGNALS_E2E_RE_SA_JAR }}
APP_SIGNALS_ADOT_JAR: "https://github.com/aws-observability/aws-otel-java-instrumentation/releases/latest/download/aws-opentelemetry-agent.jar"
METRIC_NAMESPACE: AppSignals
LOG_GROUP_NAME: /aws/appsignals/generic
APP_SIGNALS_E2E_EC2_TEST_ROLE: ${{ secrets.APP_SIGNALS_E2E_EC2_TEST_ROLE }}

jobs:
e2e-ec2-test:
Expand All @@ -47,7 +48,6 @@ jobs:
else
echo APP_SIGNALS_CW_AGENT_RPM="https://amazoncloudwatch-agent-${{ env.AWS_DEFAULT_REGION }}.s3.${{ env.AWS_DEFAULT_REGION }}.amazonaws.com/amazon_linux/amd64/1.300031.0b313/amazon-cloudwatch-agent.rpm" >> $GITHUB_ENV
fi


- name: Generate testing id
run: echo TESTING_ID="${{ github.run_id }}-${{ github.run_number }}" >> $GITHUB_ENV
Expand Down Expand Up @@ -85,6 +85,7 @@ jobs:
-var="sample_remote_app_jar=${{ env.SAMPLE_APP_REMOTE_SERVICE_JAR }}" \
-var="cw_agent_rpm=${{ env.APP_SIGNALS_CW_AGENT_RPM }}" \
-var="adot_jar=${{ env.APP_SIGNALS_ADOT_JAR }}" \
-var="test_role=${{ env.APP_SIGNALS_E2E_EC2_TEST_ROLE }}" \
|| deployment_failed=$?

if [ $deployment_failed -eq 1 ]; then
Expand Down
34 changes: 20 additions & 14 deletions .github/workflows/appsignals-e2e-eks-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,24 @@ jobs:
with:
fetch-depth: 0

- name: Download enablement script
uses: actions/checkout@v4
with:
repository: aws-observability/application-signals-demo
ref: main
path: enablement-script
sparse-checkout: |
scripts/eks/appsignals/enable-app-signals.sh
scripts/eks/appsignals/clean-app-signals.sh
sparse-checkout-cone-mode: false

- name: Remove log group deletion command
if: always()
working-directory: enablement-script/scripts/eks/appsignals
run: |
delete_log_group="aws logs delete-log-group --log-group-name '${{ env.LOG_GROUP_NAME }}' --region \$REGION"
sed -i "s#$delete_log_group##g" clean-app-signals.sh

Comment on lines +45 to +62
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This should be a separate PR. We're mixing 2 different tasks here and we could've merged this small fix by now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing this change requires us to merge this PR first so I included it together.

- uses: actions/setup-java@v4
with:
java-version: 17
Expand Down Expand Up @@ -90,18 +108,6 @@ jobs:
with:
terraform_wrapper: false

# Enable App Signals on the test cluster
- name: Pull and unzip enablement script from S3
working-directory: testing/terraform/eks
run: aws s3 cp ${{ env.ENABLEMENT_SCRIPT_S3_BUCKET }} . && unzip -j onboarding.zip

- name: Remove log group deletion command
if: always()
working-directory: testing/terraform/eks
run: |
delete_log_group="aws logs delete-log-group --log-group-name '${{ env.LOG_GROUP_NAME }}' --region \$REGION"
sed -i "s#$delete_log_group##g" clean-app-signals.sh

- name: Deploy sample app via terraform and wait for the endpoint to come online
id: deploy-sample-app
working-directory: testing/terraform/eks
Expand Down Expand Up @@ -138,7 +144,7 @@ jobs:
# after installing App Signals. Attempts to connect will be made for up to 10 minutes
if [ $deployment_failed -eq 0 ]; then
echo "Installing app signals to the sample app"
./enable-app-signals.sh \
${GITHUB_WORKSPACE}/enablement-script/scripts/eks/appsignals/enable-app-signals.sh \
${{ inputs.test-cluster-name }} \
${{ env.AWS_DEFAULT_REGION }} \
${{ env.SAMPLE_APP_NAMESPACE }}
Expand Down Expand Up @@ -325,7 +331,7 @@ jobs:
- name: Clean Up App Signals
if: always()
continue-on-error: true
working-directory: testing/terraform/eks
working-directory: enablement-script/scripts/eks/appsignals
run: |
./clean-app-signals.sh \
${{ inputs.test-cluster-name }} \
Expand Down
95 changes: 95 additions & 0 deletions testing/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# How to Test E2E Resource Changes
This guide will give a step by step instruction on how to test changes made to E2E testing resources before pushing a PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be useful to add a message/list of what this guide will entail, or even a table of contents, e.g. "To test changes to the E2E tests, you will need to fork the repo, set up IAM roles and an EKS cluster, and build sample apps into ECRs/S3 buckets" and anything else of relevance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

### 1. Create an IAM Role with OIDC Identity Provider
This step is needed to allow Github Action to have access to resources in the AWS account
#### Create an OIDC Provider
- First step is to create an OIDC Identity Provider to allow Github action access to the AWS account resource. Login to AWS, go to the IAM console and click on the Identity Providers tab.
- Click on Add Provider, choose OpenID Connect and type `https://token.actions.githubusercontent.com` in the Provider URL. Click "Get thumbprint". For Audience, use `sts.amazonaws.com`. Finally, click "Add provider"
#### Create an IAM role
- Next, an IAM role needs to be created using the OIDC Identity Provider. Go to the Roles tab and click Create role.
- Choose Web Identity, and choose `token.actions.githubusercontent.com` as the Identity provider, Audience as `sts.amazonaws.com`, and for Github organizations put your github username down. Click next.
- Add the AdministratorAccess policy. Click next.
- Enter your Role name. Click "Create role".
Comment on lines +10 to +19
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we do this programatically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. We also have to enter the console anyways to manually change the edit policy for the role so I think we can just make the programmer do this on the console?

#### Add Additional Permission
- After the role is created, search the role name in the roles tab, click on the role, and go to the Trust relationships tab. Click on "Edit trust policy".
- In the Statement list, add the following item:
`{
"Sid": "accessToRole",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<AccountID>:root"
},
"Action": "sts:AssumeRole"
}`. This additional permission is need to allow Github Action to assume roles and have access to the EKS cluster.

Additional Resource: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services

### 2. Create an EKS Cluster
The E2E EKS test uses an EKS cluster to deploy the sample apps.
#### Setup Environment with the Appropriate Roles and Permissions.
- First, assume Admin role from the test account by running `ada credentials update --account=<AccountID> --role=Admin --provider=isengard --once`
- Assume the e2e test role by running
- `output=$(aws sts assume-role --role-arn arn:aws:iam::<AccountID>:role/<E2ETestRole> --role-session-name AWSCLI-Session)`
- `export AWS_ACCESS_KEY_ID=$(echo $output | jq -r .Credentials.AccessKeyId)`
- `export AWS_SECRET_ACCESS_KEY=$(echo $output | jq -r .Credentials.SecretAccessKey)`
- `export AWS_SESSION_TOKEN=$(echo $output | jq -r .Credentials.SessionToken)`
- Run `aws sts get-caller-identity` to check if you are in the correct role
#### Create a new Cluster
- Next, create the cluster by running `eksctl create cluster --name <ClusterName> --region us-east-1 --zones us-east-1a,us-east-1b`. This will take around ~10 minutes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always point out the reader needs to replace something in the command before the command itself. Room for error here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

#### Install AWS Load Balancer Controller Add-on
- Finally, install the AWS Load Balancer Controller add-on by running the following commands. Make sure to replace the `<ClusterName>` and `<AccountID>` with the correct value.
- `eksctl utils associate-iam-oidc-provider --cluster <ClusterName> --region us-east-1 --approve`
- `curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.4.7/docs/install/iam_policy.json`
- `aws iam create-policy --policy-name AWSLoadBalancerControllerIAMPolicy --policy-document file://iam_policy.json --region us-east-1`
- `eksctl create iamserviceaccount --cluster=<ClusterName> --namespace=kube-system --name=aws-load-balancer-controller --attach-policy-arn=arn:aws:iam::<AccountID>:policy/AWSLoadBalancerControllerIAMPolicy --region us-east-1 --approve`
- `kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml`
- `curl -Lo v2_4_7_full.yaml https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/download/v2.4.7/v2_4_7_full.yaml`
- `sed -i.bak -e '561,569d' ./v2_4_7_full.yaml`
- `sed -i.bak -e 's|your-cluster-name|<ClusterName>|' ./v2_4_7_full.yaml`
- `kubectl apply -f v2_4_7_full.yaml`
- `curl -Lo v2_4_7_ingclass.yaml https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/download/v2.4.7/v2_4_7_ingclass.yaml`
- `kubectl apply -f v2_4_7_ingclass.yaml`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a code block here instead of bullet points and inline code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### 3. Setting up Environment for EC2 Tests
#### Create IAM Role for EC2 Instance
- Login to AWS, go to the IAM console and click on the Roles tab. Click Create role.
- Choose AWS service, and choose EC2 as the use case. Click Next.
- Choose AmazonS3ReadOnlyAccess, AWSXrayWriteOnlyAccess, and CloudWatchAgentServerPolicy as the permission.
- Type the role name and click "Create role".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's ask them to name their role the same as we do. This would avoid the need for secret updates everywhere and a change in other repos I believe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CW Agent team has already added the secret for us and we just need to update the variables once this PR is merged. I think this change will make it more flexible and also hide our role name? Lmk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Showing role name is not a big deal, and I think this will help reduce the work involved in running the test/having to update the other repo.

-
#### Setting Up Default VPC
- Go to the VPC console and on the routing table for the default VPC, click Edit routes.
- Click add routes, for destination add `0.0.0.0/0`, for target add Internet Gateway and save changes.
- Go to the Security groups tab, find the security group attached to the default VPC, click Edit inbound rules, choose type: All Traffic, Source: custom, and CIDR block: 0.0.0.0/0. Save rules.

### 4. Building Sample App to ECR
Create two ECR repositories: one for the sample app main service and another for the sample app remote service.
Follow the instructions under ./sample-apps/README.md to build the sample app image and upload it to the ECR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC there's a way to link between pages instead of giving the path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### 5. Building Sample App to S3 Bucket
Create an S3 Bucket to store the .jar files for the sample app main service and sample app remote service.
Follow the instructions under ./sample-apps/README.md to build the sample app .jar and upload it to the bucket
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### 6. Setting up repository
- Go to https://github.com/aws-observability/aws-otel-java-instrumentation and create a fork
- Go to the forked repo and enable action on the Action tab
- Add the following secrets to the repository
- APP_SIGNALS_E2E_TEST_ACC: `<AccountID>`
- E2E_TEST_ROLE_ARN: `arn:aws:iam::<AccountID>:role/<RoleName>`
- APP_SIGNALS_E2E_EC2_TEST_ROLE: <EC2_IAM_ROLE_NAME>
- APP_SIGNALS_E2E_FE_SA_IMG: `<AccountID>.dkr.ecr.us-east-1.amazonaws.com/<Path to Sample App Image>`
- APP_SIGNALS_E2E_RE_SA_IMG: `<AccountID>.dkr.ecr.us-east-1.amazonaws.com/<Path to Remote Sample App Image>`
- APP_SIGNALS_E2E_FE_SA_JAR: s3://<BucketName>/<FileName.jar>
- APP_SIGNALS_E2E_RE_SA_JAR: s3://<BucketName>/<FileName.jar>


### 7. Running the tests
Copy paste the test.yml into `../.github/workflows` and replace the cluster name with the one generated in step 2.
Push the code changes and there should be a test running on the forked repo in the Action tab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tester should just be able to run the canary on their test environment. Is that doable without having to make edits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what we want? Rather than having a canary running on their forked repo isn't it better to just test change when we push new commit? Or do you mean we change the on: value to push on the canary file when we want to test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be best if the tester doesn't have to add files and then remove them after the fact. If we're setting up a test environment it should be made to be exactly like the real environment so that we can just run the same test in our accounts.


### E2E Testing Resources
- `./.github/workflows/appsignals-e2e-*`: workflow files for running e2e tests
- `./testing/sample-apps/*`: files for building the sample app
- `./testing/validator/*`: files for validating logs/metrics/traces generated by sample app
- `./testing/terraform/*`: files for launching the sample app to EKS cluster or EC2 instances
8 changes: 7 additions & 1 deletion testing/sample-apps/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,16 @@ Ensure that none of the repositories are currently using the image about to be u
To update the image, first push the update to a backup image (or generate a new one), then switch the address on the three repositories to the backup image one by one. Once all three repositories are pointing to
the backup image, push the update to the main image and revert the addresses on the repositories back to the original. Be careful to ensure the image names are appropriately stored in secrets.

### Setting up the environment:
1. Run the `patch.sh` script under `.github/scripts/` folder. You should have a new folder called `opentelemetry-java-instrumentation`
2. Cd to the new folder, then run `gradle publishToMavenLocal`
3. Run `rm -rf opentelemetry-java-instrumentation` to delete the folder.

### Steps to update image:
1. Use `ada` commands to autheticate into the testing account
2. Login to ECR Repository: `aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin {REPOSITORY}`
2. Login to ECR Repository: `aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin {REPOSITORY}`. Create a new repository first if the repository doesn't exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's needed, just make that step 1

3. Change repository name in the `build.gradle.kts` file under `testing/sample-apps/springboot` or `testing/sample-apps/sprintboot-remote-service`
4. Change the `tasks.named("jib").enabled` value on the `build.gradle.kts` file from false to true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it set to false in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it was changed a while back: 862695e

I'm assuming the sample app gradle file was interfering when the ADOT team tries to build their images.

4. Run `gradle jib` under the respective directory.

## [WIP] EC2 Use Case: Building the JAR Files
Expand Down
6 changes: 3 additions & 3 deletions testing/terraform/ec2/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ locals {
}

data "aws_ami" "ami" {
owners = ["amazon"]
owners = ["amazon"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Unnecessary white space change

most_recent = true
filter {
name = "name"
Expand Down Expand Up @@ -66,7 +66,7 @@ resource "aws_instance" "main_service_instance" {
ami = data.aws_ami.ami.id # Amazon Linux 2 (free tier)
instance_type = "t2.micro"
key_name = local.ssh_key_name
iam_instance_profile = "APP_SIGNALS_EC2_TEST_ROLE"
iam_instance_profile = var.test_role
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this immediately break release testing in CWA and CWA Operator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWA Operator doesn't use EC2, but it will break CWA

vpc_security_group_ids = [aws_default_vpc.default.default_security_group_id]
associate_public_ip_address = true
instance_initiated_shutdown_behavior = "terminate"
Expand Down Expand Up @@ -127,7 +127,7 @@ resource "aws_instance" "remote_service_instance" {
ami = data.aws_ami.ami.id # Amazon Linux 2 (free tier)
instance_type = "t2.micro"
key_name = local.ssh_key_name
iam_instance_profile = "APP_SIGNALS_EC2_TEST_ROLE"
iam_instance_profile = var.test_role
vpc_security_group_ids = [aws_default_vpc.default.default_security_group_id]
associate_public_ip_address = true
instance_initiated_shutdown_behavior = "terminate"
Expand Down
4 changes: 4 additions & 0 deletions testing/terraform/ec2/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -39,4 +39,8 @@ variable "cw_agent_rpm" {

variable "adot_jar" {
default = "s3://<bucket-name>/<jar>"
}

variable "test_role" {
default = "<role-name>"
}
24 changes: 24 additions & 0 deletions testing/test.yml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this file is necessary; as I mentioned in a previous comment, the tester should be able to run the canary on their account/test environment.

Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
## This workflow aims to run the Application Signals end-to-end tests to test changes
## on the E2E testing resources before submitting a PR. The e2e-eks-test will deploy a
## sample app on an EKS cluster, call the APIs, and validate the generated telemetry
## including logs, metrics, and traces while the e2e-ec2-test will do the same but on
## two EC2 instances.
name: Test
on:
push:

jobs:
e2e-eks-test:
uses: ./.github/workflows/appsignals-e2e-eks-test.yml
secrets: inherit
with:
aws-region: us-east-1
test-cluster-name: <Enter Cluster Name>
caller-workflow-name: 'test'

e2e-ec2-test:
uses: ./.github/workflows/appsignals-e2e-ec2-test.yml
secrets: inherit
with:
aws-region: us-east-1
caller-workflow-name: 'test'
Loading