Add tests for Metric Rollup Addition of [RemoteService, RemoteTarget, ...] #729

jj22ee · 2024-01-25T18:57:19Z

Issue #, if available:
aws/amazon-cloudwatch-agent#1010
The following metric rollups were recently added to 2 AppSignal configuration files, and tests need to be added for them:

[RemoteService, RemoteTarget]
[HostedIn.<Attributes>, Service, RemoteService, RemoteTarget]

Description of changes:
Add tests for new metric rollups

Testing done:
EKS in IAD: https://github.com/jj22ee/amazon-cloudwatch-agent/actions/runs/7659009903/job/20873170967
EC2 in IAD: https://github.com/jj22ee/amazon-cloudwatch-agent/actions/runs/7659098851/job/20873465899

./gradlew testing:validator:test

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov-commenter · 2024-01-25T19:17:47Z

Codecov Report

Attention: 102 lines in your changes are missing coverage. Please review.

Comparison is base (09e6487) 85.71% compared to head (4204d00) 50.87%.
Report is 233 commits behind head on main.

Files	Patch %	Lines
...ent/providers/AwsAppSignalsCustomizerProvider.java	24.00%	35 Missing and 3 partials ⚠️
...gent/providers/AwsSpanMetricsProcessorBuilder.java	0.00%	20 Missing ⚠️
...ders/AttributePropagatingSpanProcessorBuilder.java	0.00%	16 Missing ⚠️
...viders/AwsMetricAttributesSpanExporterBuilder.java	0.00%	11 Missing ⚠️
...try/javaagent/providers/AwsSpanProcessingUtil.java	90.16%	1 Missing and 5 partials ⚠️
...vaagent/providers/AwsMetricAttributeGenerator.java	96.89%	2 Missing and 3 partials ⚠️
...y/javaagent/providers/AwsSpanMetricsProcessor.java	91.48%	0 Missing and 4 partials ⚠️
...t/providers/AttributePropagatingSpanProcessor.java	94.59%	2 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff              @@
##               main     #729       +/-   ##
=============================================
- Coverage     85.71%   50.87%   -34.84%     
- Complexity       19      266      +247     
=============================================
  Files             3       39       +36     
  Lines            49     1315     +1266     
  Branches          5      144      +139     
=============================================
+ Hits             42      669      +627     
- Misses            3      614      +611     
- Partials          4       32       +28

majanjua-amzn

Overall this seems to be working, but there are some design questions to answer and some unintended effects of the way the logic works that need investigating.

Also, do not merge this code until CW Agent performs their release; otherwise you will break the canary, leading to a sev 2.

majanjua-amzn · 2024-01-25T20:37:40Z

testing/validator/src/main/java/com/amazon/aoc/validators/CWMetricValidator.java

+                  new Pair<>(CloudWatchService.REMOTE_SERVICE_DIMENSION, "AWS.SDK.S3"),
+                  new Pair<>(CloudWatchService.REMOTE_TARGET_DIMENSION, "e2e-test-bucket-name")));
+
+          // Populate actualMetricList with each set of dimension filters


Suggested change

-          // Populate actualMetricList with each set of dimension filters
+          // Populate actualMetricList with metrics that pass through one of the dimension filters

majanjua-amzn · 2024-01-25T20:39:38Z

testing/validator/src/main/java/com/amazon/aoc/validators/CWMetricValidator.java

-              expectedMetricList,
-              actualMetricList);
+
+          // Add sets of dimension filters to use for each query to CloudWatch


We've now added a second edge case to this logic, i.e. the [RemoteService] and [RemoteService, RemoteTarget] aggregations. It would be worth adding a larger comment here explaining what's happening, and explaining why in the PR description for future reference.

Please be more descriptive in other comments in your PR as well.

majanjua-amzn · 2024-01-25T20:50:53Z

testing/validator/src/main/java/com/amazon/aoc/validators/CWMetricValidator.java

+          for (String remoteServiceName : remoteServiceNames) {
+            dimensionLists.add(
+                Arrays.asList(
+                    new Pair<>(CloudWatchService.REMOTE_SERVICE_DIMENSION, remoteServiceName)));
+          }


Is this new logic able to pull [RemoteService, RemoteTarget] without pulling [<other stuff>, RemoteService, RemoteTarget]? If yes, we can add back the other remote service values to remoteServiceNames. If not, the code changes in this PR will cause flaky tests in the long run.

This is because, when you pull [RemoteService, RemoteTarget], any other aggregation that contains the same two attributes is pulled as well, so the query returns metrics from every test run in the past three hours that matches the filter you set. The CW listMetrics API cannot return all of these metrics reliably, and we've seen this cause failures before. Please make sure your solution accounts for this by running a few tests in a row and checking whether metrics whose attributes contain the wrong test ID appear in the actualMetricList.
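
To make the concern above concrete, here is a minimal sketch of one possible client-side guard: keep only the metrics whose dimension names are exactly the requested rollup, so that [<other stuff>, RemoteService, RemoteTarget] results are dropped before they reach actualMetricList. The class name, the method name, and the use of plain maps in place of the validator's own metric type are all assumptions made for illustration. Note that this only cleans up what was returned; it does not reduce how many metrics listMetrics has to return, so it would not by itself address the reliability problem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the validator in this PR.
class ExactDimensionFilter {

  // Each metric is modelled as a map of dimension name -> dimension value.
  // A metric is kept only if its dimension names equal wantedNames exactly.
  static List<Map<String, String>> keepExactDimensionSet(
      List<Map<String, String>> metrics, Set<String> wantedNames) {
    List<Map<String, String>> kept = new ArrayList<>();
    for (Map<String, String> dimensions : metrics) {
      if (dimensions.keySet().equals(wantedNames)) {
        kept.add(dimensions);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Map<String, String>> returned =
        List.of(
            // The rollup we want.
            Map.of("RemoteService", "AWS.SDK.S3", "RemoteTarget", "e2e-test-bucket-name"),
            // A wider rollup that also carries both attributes.
            Map.of(
                "Service", "sample-service",
                "RemoteService", "AWS.SDK.S3",
                "RemoteTarget", "e2e-test-bucket-name"));

    List<Map<String, String>> exact =
        keepExactDimensionSet(returned, Set.of("RemoteService", "RemoteTarget"));
    System.out.println(exact.size()); // prints 1
  }
}
```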

majanjua-amzn · 2024-01-25T20:51:51Z

testing/validator/src/main/java/com/amazon/aoc/validators/CWMetricValidator.java

+          dimensionLists.add(
+              Arrays.asList(
+                  new Pair<>(CloudWatchService.REMOTE_SERVICE_DIMENSION, "AWS.SDK.S3"),
+                  new Pair<>(CloudWatchService.REMOTE_TARGET_DIMENSION, "e2e-test-bucket-name")));


Once the metric exists, how do we know whether the test that just ran is the one that populated it, or whether it simply exists from a previous test run? It's entirely possible that a test from 2 hours ago created the metric and that every test since then failed to populate it, but because of our .withRecentlyActive("PT3H") we would never find out that this metric is flakily populated.

This is the same issue we had with trying to check [RemoteService] when it is equal to ["www.amazon.com"] for example. (There are also other considerations for this aggregation which I'll mention in another comment)

Note: withRecentlyActive() only accepts PT3H
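
One way to picture the gap described above: metric existence says nothing about when data last arrived, whereas datapoint timestamps do. The sketch below assumes the validator could obtain datapoint timestamps for the metric (for example from a statistics query, which this PR does not do) and checks that at least one falls inside the current run. The class and method names are hypothetical. It also cannot tell the current run apart from another run executing at the same time against the same bucket name.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Hypothetical helper, not part of the validator in this PR.
class RecentDatapointCheck {

  // Returns true if any datapoint timestamp lies within the run window.
  // The window start is rounded down to the minute, on the assumption that
  // datapoints are stamped with the start of a one-minute period.
  static boolean populatedDuringRun(
      List<Instant> datapointTimestamps, Instant runStart, Instant runEnd) {
    Instant windowStart = runStart.truncatedTo(ChronoUnit.MINUTES);
    for (Instant timestamp : datapointTimestamps) {
      if (!timestamp.isBefore(windowStart) && !timestamp.isAfter(runEnd)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Instant runStart = Instant.parse("2024-01-25T20:00:30Z");
    Instant runEnd = Instant.parse("2024-01-25T20:05:00Z");

    // Only a datapoint from roughly two hours earlier: the metric "exists"
    // within the three-hour window but was not populated by this run.
    List<Instant> stale = List.of(Instant.parse("2024-01-25T18:10:00Z"));
    System.out.println(populatedDuringRun(stale, runStart, runEnd)); // prints false
  }
}
```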

jj22ee · 2024-01-26T01:00:39Z

In discussion, we arrived at a solution for both issues (the [RemoteService, RemoteTarget] query pulling in too much data from previous test runs, and not being able to tell whether the current test run is the one that populated the metric): update the aws-sdk-call API to take in an identifier (like Pod_IP) and use it in the bucket name (e.g. e2e-test-bucket-name-<Pod_IP>). The tests will then need to specify this ID in any query that includes the RemoteTarget dimension.
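
A rough sketch of the idea, under stated assumptions: the identifier is appended to the bucket name, and the same name is reused as the RemoteTarget value in the dimension filter. The class and method names are hypothetical, `Map.entry` stands in for the validator's `Pair`, and replacing dots with hyphens is my own choice to keep the bucket name simple rather than something decided in the discussion.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not the implementation that was eventually merged.
class UniqueRemoteTarget {

  static final String BUCKET_PREFIX = "e2e-test-bucket-name";

  // Builds a per-run bucket name such as e2e-test-bucket-name-10-0-12-34.
  // Any character outside [a-z0-9-] in the identifier becomes a hyphen.
  static String bucketNameFor(String testId) {
    String suffix = testId.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9-]", "-");
    return BUCKET_PREFIX + "-" + suffix;
  }

  // Dimension filter for the [RemoteService, RemoteTarget] rollup of one run.
  static List<Map.Entry<String, String>> remoteTargetDimensions(String testId) {
    return List.of(
        Map.entry("RemoteService", "AWS.SDK.S3"),
        Map.entry("RemoteTarget", bucketNameFor(testId)));
  }

  public static void main(String[] args) {
    System.out.println(bucketNameFor("10.0.12.34")); // prints e2e-test-bucket-name-10-0-12-34
    System.out.println(remoteTargetDimensions("10.0.12.34"));
  }
}
```

Because the RemoteTarget value is unique to the run, a query on it would be expected to match only metrics emitted by that run, which is what makes the existence check meaningful again.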

github-actions · 2024-03-31T20:06:08Z

This PR is stale because it has been open 60 days with no activity.

jj22ee · 2024-04-16T16:34:41Z

Closing in favor of: aws-observability/aws-application-signals-test-framework#41

jj22ee added 3 commits January 25, 2024 10:35

Add tests for new metric rollup update with remote target and remote service

5852a3e

unit tests

222a7fc

gradlew spotlessApply

4204d00

jj22ee requested a review from a team as a code owner January 25, 2024 18:57

majanjua-amzn requested changes Jan 25, 2024

jj22ee mentioned this pull request Mar 1, 2024

Update Sample Apps aws-sdk-call API to optionally uniquely name S3 Buckets aws-observability/aws-application-signals-test-framework#7

Merged

github-actions bot added the stale label Mar 31, 2024

jj22ee closed this Apr 16, 2024
