Fix failing JMX tests #426

Merged
merged 6 commits into from
Nov 14, 2024

Conversation

varunch77
Member

@varunch77 varunch77 commented Nov 8, 2024

Description of the issue

We have failing JMX tests throughout our integration tests on multiple OSes. Here is a failed run that shows the test failing on these metrics, specifically the garbage collection and Tomcat metrics.

Description of changes

  • Extended agent run time from 2 to 5 minutes and set metrics_collection_interval to 60 seconds to better capture infrequent metrics (like jvm.gc.collections.elapsed and jvm.gc.collections.count)
  • Added Java commands to enable the MBean registry so that Tomcat metrics are properly exposed
  • Some tests were failing because the average value of a metric fell slightly outside the allowed bounds (+/- 10%). I widened the bounds to +/- 15%.
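
To illustrate the bounds change in the last bullet, here is a minimal sketch of a relative tolerance check. The function name `withinTolerance` and the sample values are hypothetical and are not taken from the test repo; the actual validation code may compute the bounds differently.

```go
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether actual lies within
// expected +/- tolerance*|expected|. Hypothetical helper, for illustration only.
func withinTolerance(expected, actual, tolerance float64) bool {
	return math.Abs(actual-expected) <= tolerance*math.Abs(expected)
}

func main() {
	expected := 100.0
	// Sample averages that sit 12% away from the expected value.
	for _, actual := range []float64{88, 112} {
		fmt.Printf("actual=%.0f within 10%%: %t, within 15%%: %t\n",
			actual,
			withinTolerance(expected, actual, 0.10),
			withinTolerance(expected, actual, 0.15))
	}
}
```

An average that is 12% off the expected value fails a 10% bound but passes a 15% bound, which is the kind of near miss the description mentions.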

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

I ran the integration tests in the amazon-cloudwatch-agent repo under the fix-JMX-integ-tests branch. That branch of the main repo is configured to run integration tests using this branch of the test repo. Here is the run: Run #11802640395

@varunch77 varunch77 marked this pull request as ready for review November 9, 2024 02:36
@varunch77 varunch77 requested a review from a team as a code owner November 9, 2024 02:36
@zhihonl
Contributor

zhihonl commented Nov 11, 2024

nice findings!

@@ -46,7 +46,7 @@ func (t *JMXTomcatJVMTestRunner) GetAgentConfigFileName() string {
}

func (t *JMXTomcatJVMTestRunner) GetAgentRunDuration() time.Duration {
-	return 2 * time.Minute
+	return 10 * time.Minute
Contributor

Is 10 minutes necessary here, or can we reduce it to 5 minutes or less? Or is that still too short to collect this metric? Asking because sleeping for 10 minutes per test run slows down the testing process.

Member Author

I think 5 is a good balance. I will make this change and keep an eye on the tests.

@zhihonl zhihonl self-requested a review November 12, 2024 00:53
@varunch77
Member Author

Run #11749486690 (integration test run from original PR)
vs.
Run #11802640395 (includes commits to address the 10 -> 5 min change and metrics_collection_interval change)

Found 142 failing tests for 11749486690 | failure rate of 43%
Found 139 failing tests for 11802640395 | failure rate of 42%
Main run id:11749486690 Branch run id:11802640395

main_only:
EC2Linux                 | ./test/ca_bundle                                     | debian-12            | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/ca_bundle                                     | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/cloudwatchlogs                                | debian-12            | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/cloudwatchlogs                                | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/collection_interval                           | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_dimension                              | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_value_benchmark                        | debian-12            | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/metric_value_benchmark                        | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/otlp                                          | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/restart                                       | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/run_as_user                                   | ol8                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/xray                                          | ol8                  | linux         | ec2_linux               | amd64                |

branch_only:
EC2DarwinIntegrationTest | ../../../test/feature/mac                            | macOS Monterey Arm64 | ec2_mac       | arm64                   | mac2.m...            |
EC2Linux                 | ./test/cloudwatchlogs                                | al2                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/cloudwatchlogs                                | ol7                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_value_benchmark                        | al2                  | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/metric_value_benchmark                        | sles-15              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/xray                                          | sles-15              | linux         | ec2_linux               | amd64                |

both_branches:
EC2DarwinIntegrationTest | ../../../test/feature/mac                            | macOS Monterey Amd64 | ec2_mac       | amd64                   | mac1.m...            |
EC2Linux                 | ./test/amp                                           | al2                  | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/ca_bundle                                     | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/ca_bundle                                     | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/ca_bundle                                     | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/cloudwatchlogs                                | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/cloudwatchlogs                                | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/cloudwatchlogs                                | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/collection_interval                           | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/collection_interval                           | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/collection_interval                           | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_dimension                              | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_dimension                              | debian-12            | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/metric_dimension                              | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_dimension                              | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_value_benchmark                        | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_value_benchmark                        | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/metric_value_benchmark                        | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/otlp                                          | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/otlp                                          | debian-12            | linux         | ec2_linux               | arm64                |
EC2Linux                 | ./test/otlp                                          | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/otlp                                          | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/restart                                       | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/restart                                       | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/restart                                       | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/run_as_user                                   | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/run_as_user                                   | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/run_as_user                                   | sles-12              | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/xray                                          | centos-stream-8      | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/xray                                          | rhel8                | linux         | ec2_linux               | amd64                |
EC2Linux                 | ./test/xray                                          | sles-12              | linux         | ec2_linux               | amd64                |
EC2LinuxCN               | ./test/ca_bundle                                     | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/cloudwatchlogs                                | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/collection_interval                           | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/metric_dimension                              | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/metric_value_benchmark                        | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/otlp                                          | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/restart                                       | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/run_as_user                                   | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxCN               | ./test/xray                                          | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/cloudwatchlogs                                | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/collection_interval                           | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/metric_dimension                              | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/metric_value_benchmark                        | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/otlp                                          | al2023               | linux         | ec2_linux               | arm64                |
EC2LinuxITAR             | ./test/xray                                          | al2023               | linux         | ec2_linux               | arm64                |
EC2WinIntegrationTest    | ../../../test/acceptance                             | win-11               | ec2_windows   | t3a.medium              | cloudwatch-agen...   |
EC2WinIntegrationTest    | ../../../test/acceptance                             | win-2016             | ec2_windows   | t3a.medium              | cloudwatch-ag...     |
EC2WinIntegrationTest    | ../../../test/acceptance                             | win-2019             | ec2_windows   | t3a.medium              | cloudwatch-ag...     |
EC2WinIntegrationTest    | ../../../test/feature/windows                        | win-11               | ec2_windows   | t3a.medium              | cloudwatch...        |
EC2WinIntegrationTest    | ../../../test/feature/windows                        | win-2016             | ec2_windows   | t3a.medium              | cloudwat...          |
EC2WinIntegrationTest    | ../../../test/feature/windows                        | win-2019             | ec2_windows   | t3a.medium              | cloudwat...          |
EC2WinIntegrationTest    | ../../../test/feature/windows                        | win-2022             | ec2_windows   | t3a.medium              | cloudwat...          |
EC2WinIntegrationTest    | ../../../test/feature/windows/custom_start/ssm_start | win-2019             | ec2_window... |
EC2WinIntegrationTest    | ../../../test/feature/windows/custom_start/userdata  | win-2019             | ec2_windows...|
EC2WinIntegrationTest    | ../../../test/feature/windows/event_logs             | win-11               | ec2_windows   | t3a.medium              | ...                  |
EC2WinIntegrationTest    | ../../../test/feature/windows/event_logs             | win-2016             | ec2_windows   | t3a.mediu...            |
EC2WinIntegrationTest    | ../../../test/feature/windows/event_logs             | win-2019             | ec2_windows   | t3a.mediu...            |
EC2WinIntegrationTest    | ../../../test/restart                                | win-11               | ec2_windows   | t3a.medium              | cloudwatch-agent-i...|
EC2WinIntegrationTest    | ../../../test/restart                                | win-2016             | ec2_windows   | t3a.medium              | cloudwatch-agent...  |
EC2WinIntegrationTest    | ../../../test/restart                                | win-2019             | ec2_windows   | t3a.medium              | cloudwatch-agent...  |
EC2WinStressTrackingTest | ../../test/stress/windows/system                     | win-2022             | windows       | ec2_windows_stress...   |
EKSIntegrationTest       | ./test/fluent                                        | eks_daemon           | amd64         | t3.medium               | AL2_x86_64           |
EKSIntegrationTest       | ./test/fluent                                        | eks_daemon           | arm64         | m6g.large               | AL2_ARM_64           |
EKSIntegrationTest       | ./test/metric_value_benchmark                        | eks_daemon           | amd64         | g4dn.xlarge             | AL2_x86_64           |
EKSIntegrationTest       | ./test/metric_value_benchmark                        | eks_daemon           | amd64         | t3.medium               | AL2_x86_64           |
GPU E2E Test             | ../../../../test/gpu                                 | eks_addon            | 0             | terraform/eks/addon/gpu | false                |
StressTrackingTest       | ../../test/stress/collectd                           | al2                  | linux         | ec2_stress              | amd64                |
StressTrackingTest       | ../../test/stress/emf                                | al2                  | linux         | ec2_stress              | amd64                |
StressTrackingTest       | ../../test/stress/logs                               | al2                  | linux         | ec2_stress              | amd64                |
StressTrackingTest       | ../../test/stress/statsd                             | al2                  | linux         | ec2_stress              | amd64                |
StressTrackingTest       | ../../test/stress/system                             | al2                  | linux         | ec2_stress              | amd64                |

@varunch77 varunch77 merged commit 93dedc0 into main Nov 14, 2024
2 checks passed
@varunch77 varunch77 deleted the fix-JMX-integ-tests branch November 14, 2024 16:49