[Cherry Pick] Enable canary report generation #634

Merged: 46 commits, Mar 30, 2023
Changes from 45 commits
Commits
8a2eefe
increase cert-manager wait time for kubeflow-issuer to be installed
jsitu777 Mar 3, 2023
fbf91ac
Merge branch 'release-v1.6.1-aws-b1.0.1' of https://github.com/awslab…
jsitu777 Mar 24, 2023
0e973cf
enable canary report
jsitu777 Mar 24, 2023
76f1bb6
modify container name
jsitu777 Mar 24, 2023
8c3662d
change docker command
jsitu777 Mar 25, 2023
3d97489
change docker command
jsitu777 Mar 25, 2023
5e657e3
print out container file structure to debug
jsitu777 Mar 25, 2023
ae0609e
print out container file structure to debug
jsitu777 Mar 25, 2023
ebd48b4
debug container
jsitu777 Mar 25, 2023
8c6efd9
fix canary.buildspec.yaml
jsitu777 Mar 25, 2023
045b498
fix canary.buildspec.yaml
jsitu777 Mar 25, 2023
4ca26b4
fix canary.buildspec.yaml
jsitu777 Mar 25, 2023
ff4c386
push to ECR after cp
jsitu777 Mar 25, 2023
b5f838d
push to ECR after cp
jsitu777 Mar 25, 2023
bc81c59
run container in detached mode so it's still running after build
jsitu777 Mar 25, 2023
e4bd53b
sleep 30s to ensure container still running
jsitu777 Mar 25, 2023
77bd0a7
remove sleep
jsitu777 Mar 25, 2023
3be38a8
push to cloudwatch
jsitu777 Mar 29, 2023
147debd
minor fix
jsitu777 Mar 29, 2023
02db0e1
bug fixes
jsitu777 Mar 29, 2023
fadb8e7
rename test log to xml
jsitu777 Mar 29, 2023
10350bf
add dimensions to cloudwatch metrics
jsitu777 Mar 29, 2023
76a15e4
use trap to make sure the python script executes if pytest fails
jsitu777 Mar 29, 2023
a8891c1
use list_project for codebuild
jsitu777 Mar 29, 2023
09104dc
fix datetime module error
jsitu777 Mar 29, 2023
fc4cd2d
remove region as cloudwatch metric dimension
jsitu777 Mar 29, 2023
7c8b247
keep only total test, failure, success metric to push to cloud watch
jsitu777 Mar 29, 2023
d2e4595
run pytest first then trap python script
jsitu777 Mar 29, 2023
fd2154b
add int type for success
jsitu777 Mar 30, 2023
868da1e
get codebuild project name from env instead of boto3
jsitu777 Mar 30, 2023
7111dc4
change success into int type
jsitu777 Mar 30, 2023
bd1d95a
intent to test fail for debugging
jsitu777 Mar 30, 2023
112ec17
call python script inside a function
jsitu777 Mar 30, 2023
abd90bc
run push to cloudwatch onerror
jsitu777 Mar 30, 2023
c9a6132
modify to exit instead of on error
jsitu777 Mar 30, 2023
bebf9b0
add set -e in buildspec
jsitu777 Mar 30, 2023
350b322
to ensure docker always exit with 0
jsitu777 Mar 30, 2023
92dc0b2
bug fix
jsitu777 Mar 30, 2023
b81c41c
remove || true after docker run, hard code codebuild project name for…
jsitu777 Mar 30, 2023
95ab554
restore kf installation and uninstallation in fixture
jsitu777 Mar 30, 2023
3a61765
test if cloudwatch can push metrics if test fails
jsitu777 Mar 30, 2023
25aa392
test if can get env variable directly
jsitu777 Mar 30, 2023
5e930d3
restore kf installation and notebook test
jsitu777 Mar 30, 2023
2e395b4
add || true to docker cp command so next command will continue even i…
jsitu777 Mar 30, 2023
fe09879
rename cloudwatch metric namespace
jsitu777 Mar 30, 2023
f703021
move trap code at the beginning; remove getting cluster_name env
jsitu777 Mar 30, 2023
14 changes: 10 additions & 4 deletions tests/canary/canary.buildspec.yaml
@@ -6,7 +6,6 @@ phases:
      # Get cached test image
      - aws ecr get-login-password --region $CLUSTER_REGION | docker login --username AWS --password-stdin $ECR_CACHE_URI || true
      - docker pull ${ECR_CACHE_URI}:latest --quiet || true

      # Build test image
      - >
        docker build -f ./tests/canary/Dockerfile.canary . -t ${ECR_CACHE_URI}:latest --quiet
@@ -15,8 +14,15 @@ phases:
    commands:
      # Run tests
      - docker run --name kf-distro-canary $(env | cut -f1 -d= | sed 's/^/-e /') --mount type=bind,source="$(pwd)/",target="/kubeflow-manifests/" ${ECR_CACHE_URI}:latest

      # Push test image to cache ECR repo
  post_build:
    commands:
      - docker cp kf-distro-canary:/kubeflow-manifests/tests/canary/integration_tests.xml /tmp/results.xml || true
      # Push test image to cache ECR repo
      - docker push ${ECR_CACHE_URI}:latest || true


reports:
  IntegrationTestReport:
    files:
      - "results.xml"
    base-directory: "/tmp"
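
The `docker run` line above forwards the build environment into the container by turning every variable name into a `-e NAME` flag, so Docker copies each value from the host. A minimal Python sketch of the same idea follows; the function name `docker_env_flags` is hypothetical and is not part of this PR.

```python
def docker_env_flags(environ):
    """Return ['-e', NAME, ...] for every variable name in `environ`.

    Passing `-e NAME` without a value tells `docker run` to take the
    value from the calling environment.
    """
    flags = []
    for name in sorted(environ):  # sorted only to make the output stable
        flags.extend(["-e", name])
    return flags


if __name__ == "__main__":
    import os

    # Print the flags that would be added to the docker command line.
    print(" ".join(docker_env_flags(os.environ)))
```

Working from a mapping of names avoids one quirk of the `env | cut | sed` pipeline, which treats every output line as a variable and so can emit stray flags when a value spans several lines.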

92 changes: 92 additions & 0 deletions tests/canary/scripts/push_stats_to_cloudwatch.py
@@ -0,0 +1,92 @@
import boto3
from datetime import datetime
import xml.etree.ElementTree as ET
import os


xml_path = "../canary/integration_tests.xml"


def readXML_and_publish_metrics_to_cw():
    if os.path.isfile(xml_path):
        tree = ET.parse(xml_path)
        testsuite = tree.find("testsuite")
        failures = testsuite.attrib["failures"]
        tests = testsuite.attrib["tests"]
        successes = int(tests) - int(failures)
    else:
        failures = 0
        successes = 0
        tests = 1

    timestamp = datetime.now().strftime("%Y-%m-%dT%H:%M:%S")

    print(f"Failures: {failures}")
    print(f"Total tests: {tests}")
    print(f"Success: {successes}")

    # push to cloudwatch
    cw_client = boto3.client("cloudwatch")
    project_name = "CodeBuild-Run-All-Tests"

    cluster_name = os.getenv("CLUSTER_NAME")

[Review comment, Contributor] nit: remove

    print(f"CLUSTER_NAME: {cluster_name}")
    # Define the metric data
    metric_data = [
        {
            "MetricName": "failures",
            "Timestamp": timestamp,
            "Dimensions": [
                {"Name": "CodeBuild Project Name", "Value": project_name},
            ],
            "Value": int(failures),
            "Unit": "Count",
        },
        {
            "MetricName": "total_tests",
            "Timestamp": timestamp,
            "Dimensions": [
                {"Name": "CodeBuild Project Name", "Value": project_name},
            ],
            "Value": int(tests),
            "Unit": "Count",
        },
        {
            "MetricName": "successes",
            "Timestamp": timestamp,
            "Dimensions": [
                {"Name": "CodeBuild Project Name", "Value": project_name},
            ],
            "Value": int(successes),
            "Unit": "Count",
        },
    ]

    # Use the put_metric_data method to push the metric data to CloudWatch
    try:
        response = cw_client.put_metric_data(
            Namespace="Canary_Metrics", MetricData=metric_data
        )
        if response["ResponseMetadata"]["HTTPStatusCode"] == 200:
            print("Successfully pushed data to CloudWatch")
            # return 200 status code if successful
            return 200
        else:
            # raise exception if the status code is not 200
            raise Exception(
                "Unexpected response status code: {}".format(
                    response["ResponseMetadata"]["HTTPStatusCode"]
                )
            )
    except Exception as e:
        print("Error pushing data to CloudWatch: {}".format(e))
        # raise exception if there was an error pushing data to CloudWatch
        raise

[Review comment, Contributor] format using black. couple of extra lines

[Reply, Contributor Author] formatted with black


def main():
    readXML_and_publish_metrics_to_cw()


if __name__ == "__main__":
    main()
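
The script above reads the report with `tree.find("testsuite")`, which only matches a `<testsuite>` that is a direct child of the root. If the report's root element is itself `<testsuite>` (this can depend on the pytest version), that lookup returns `None`. The sketch below is one possible way to read both layouts; `summarize_junit` is a hypothetical helper, and counting the `errors` attribute as failures is an assumption, not what the PR does.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Return (tests, failures, successes) summed over every <testsuite>."""
    root = ET.fromstring(xml_text)
    # Element.iter() yields the root itself as well as its descendants, so
    # this handles a bare <testsuite> root and a <testsuites> wrapper alike.
    suites = list(root.iter("testsuite"))
    tests = sum(int(s.get("tests", 0)) for s in suites)
    # Assumption: errors are counted together with failures. The script in
    # this PR reads only the "failures" attribute.
    failures = sum(
        int(s.get("failures", 0)) + int(s.get("errors", 0)) for s in suites
    )
    return tests, failures, tests - failures


if __name__ == "__main__":
    wrapped = (
        '<testsuites><testsuite tests="5" failures="1" errors="1"/>'
        "</testsuites>"
    )
    print(summarize_junit(wrapped))
```

As in the original script, skipped tests are counted as successes here, because only failures (and, in this sketch, errors) are subtracted from the total.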
12 changes: 6 additions & 6 deletions tests/canary/scripts/run_test.sh
@@ -9,10 +9,6 @@
# Script configuration
set -euo pipefail

function onError {
    echo "Run test FAILED. Exiting."
}
trap onError ERR

export CANARY_TEST_DIR=${REPO_PATH}/tests/canary
export E2E_TEST_DIR=${REPO_PATH}/tests/e2e
@@ -31,6 +27,10 @@ mkdir -p $E2E_TEST_DIR/.metadata/
cp metadata-canary $E2E_TEST_DIR/.metadata/

cd $E2E_TEST_DIR
pytest tests/test_sanity_portforward.py -s -q --metadata .metadata/metadata-canary --region $CLUSTER_REGION --installation_option $INSTALLATION_OPTION

function push_to_cloudwatch {
    echo "Pushing Codebuild stats to Cloudwatch."
    python ../canary/scripts/push_stats_to_cloudwatch.py
}
trap push_to_cloudwatch EXIT

[Review comment, Contributor] move this code above line 13 since failure can happen in any command

pytest tests/test_sanity_portforward.py -s -q --metadata .metadata/metadata-canary --region $CLUSTER_REGION --installation_option $INSTALLATION_OPTION --junitxml ../canary/integration_tests.xml
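
The reviewer's point is that the reporting hook has to be registered before any command that can fail, otherwise an early failure exits the script without publishing metrics. The same principle in Python is a `try`/`finally` that wraps every step. This is only an analogy to the shell `trap ... EXIT`, and `run_with_reporting` is a hypothetical name.

```python
def run_with_reporting(steps, report):
    """Run each step in order; call report() exactly once, even on failure.

    The `finally` block is set up before the first step runs, mirroring a
    `trap ... EXIT` placed at the top of a shell script.
    """
    try:
        for step in steps:
            step()
    finally:
        report()


if __name__ == "__main__":
    def failing_step():
        raise RuntimeError("setup failed before pytest ran")

    try:
        run_with_reporting([failing_step], lambda: print("metrics published"))
    except RuntimeError as exc:
        print(f"build failed: {exc}")
```

The exception still propagates after `report()` runs, so the build is marked as failed while the metrics are published anyway.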

1 change: 0 additions & 1 deletion tests/e2e/fixtures/installation.py
@@ -63,7 +63,6 @@ def on_create():
        install_kubeflow(installation_option, deployment_option, cluster)

    def on_delete():

[Review comment, Contributor] nit: revert? or coming from black

        uninstall_kubeflow(installation_option, deployment_option)

