-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ftr/obs/alerts] refactor to avoid stale-element errors #140427
[ftr/obs/alerts] refactor to avoid stale-element errors #140427
Conversation
bfa3c31
to
6de4cb7
Compare
6de4cb7
to
6b9fa5f
Compare
…lerts-common-avoid-stale-el
This reverts commit b0b9b58.
Pinging @elastic/kibana-operations (Team:Operations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM from the observability changes side. @spenceralger can you point us to the best place to go to better understand retryOnStale.wrap
and the testSubjects.*
methods as safer alternatives to something like return await (await getQuerySubmitButton()).click();
?
@jasonrhodes Honestly, there isn't a great place to learn about the (BTW, I'm @spalger) |
@spalger ok sounds good, we'll look there. And I know your GH tag, no idea why my fingers just decided to write spencer 🤷 |
…lerts-common-avoid-stale-el
200 successful builds in https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/1198, running another 200 in https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/1199 |
@spalger Interesting finding, I like to help you with documenting best practices for functional tests and potentially check the observability functional tests to make sure they match the best practices. Let me know how I can help. |
@elasticmachine merge upstream |
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Just a note: unfortunately |
## Summary closes #173184 closes #173441 closes #196003 This PR tries to improve on the `StaleElementReferenceError` happening in Endpoint Exception tests. This error is thrown if an element has already been removed from the DOM when trying to perform an action on it. For some reference, see #140427 Improvements: - the part that was failing is wrapped inside the `retryOnStale` helper: 602f229 **note:** actually the test fails have started in December, 2023, but the line where the fail was in the last test runs were added in May, 2024 (#183471). unfortunately, the log artifacts from 2023 are already removed from Buildkite, so no certainty on what happened back then - another suspicious part was wrapped as well: ec8c5cf and e5245ad - and as an extra, wait for page load: 7cd867f flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite rarely ¯\\(◉‿◉)/¯ ### Checklist Delete any items that are not applicable to this PR. - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
…#197457) ## Summary closes elastic#173184 closes elastic#173441 closes elastic#196003 This PR tries to improve on the `StaleElementReferenceError` happening in Endpoint Exception tests. This error is thrown if an element has already been removed from the DOM when trying to perform an action on it. For some reference, see elastic#140427 Improvements: - the part that was failing is wrapped inside the `retryOnStale` helper: 602f229 **note:** actually the test fails have started in December, 2023, but the line where the fail was in the last test runs were added in May, 2024 (elastic#183471). unfortunately, the log artifacts from 2023 are already removed from Buildkite, so no certainty on what happened back then - another suspicious part was wrapped as well: ec8c5cf and e5245ad - and as an extra, wait for page load: 7cd867f flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite rarely ¯\\(◉‿◉)/¯ ### Checklist Delete any items that are not applicable to this PR. - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit f151e2c)
…#197457) ## Summary closes elastic#173184 closes elastic#173441 closes elastic#196003 This PR tries to improve on the `StaleElementReferenceError` happening in Endpoint Exception tests. This error is thrown if an element has already been removed from the DOM when trying to perform an action on it. For some reference, see elastic#140427 Improvements: - the part that was failing is wrapped inside the `retryOnStale` helper: 602f229 **note:** actually the test fails have started in December, 2023, but the line where the fail was in the last test runs were added in May, 2024 (elastic#183471). unfortunately, the log artifacts from 2023 are already removed from Buildkite, so no certainty on what happened back then - another suspicious part was wrapped as well: ec8c5cf and e5245ad - and as an extra, wait for page load: 7cd867f flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite rarely ¯\\(◉‿◉)/¯ ### Checklist Delete any items that are not applicable to this PR. - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit f151e2c)
…197457) (#197662) # Backport This will backport the following commits from `main` to `8.x`: - [[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)](#197457) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Gergő Ábrahám","email":"gergo.abraham@elastic.co"},"sourceCommit":{"committedDate":"2024-10-24T14:27:08Z","message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Defend Workflows","backport:prev-major"],"title":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR","number":197457,"url":"https://github.com/elastic/kibana/pull/197457","mergeCommit":{"message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/197457","number":197457,"mergeCommit":{"message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}}]}] BACKPORT--> Co-authored-by: Gergő Ábrahám <gergo.abraham@elastic.co>
…197457) (#197661) # Backport This will backport the following commits from `main` to `8.16`: - [[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)](#197457) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Gergő Ábrahám","email":"gergo.abraham@elastic.co"},"sourceCommit":{"committedDate":"2024-10-24T14:27:08Z","message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Defend Workflows","backport:prev-major"],"title":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR","number":197457,"url":"https://github.com/elastic/kibana/pull/197457","mergeCommit":{"message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/197457","number":197457,"mergeCommit":{"message":"[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n## Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses #196003\r\n\r\nThis PR tries to improve on the `StaleElementReferenceError` happening\r\nin Endpoint Exception tests.\r\n\r\nThis error is thrown if an element has already been removed from the DOM\r\nwhen trying to perform an action on it. For some reference, see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n- the part that was failing is wrapped inside the `retryOnStale` helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:** actually the test fails have started in December, 2023, but\r\nthe line where the fail was in the last test runs were added in May,\r\n2024 (#183471). unfortunately, the\r\nlog artifacts from 2023 are already removed from Buildkite, so no\r\ncertainty on what happened back then\r\n- another suspicious part was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4 and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra, wait for page load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅ but this doesn't mean much, as this issue happens quite\r\nrarely ¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are not applicable to this PR.\r\n\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}}]}] BACKPORT--> Co-authored-by: Gergő Ábrahám <gergo.abraham@elastic.co>
Fixes #140248
Stale element exceptions are caused by fetching a reference to a browser element and then interacting with it some time later, after that element no longer exists in the DOM.
This can happen with code like this:
To avoid this the
testSubjects
service exposes several methods that allow operating on elements in common ways without capturing a reference to the element itself, and if aStaleElementReferenceError
occurs internally the relevant operation can be retried automatically. To make functional testing code resilient to theseStaleElementReferenceError
s we should instead write the above code like this:Additionally, there is a
retryOnStale()
helper function available. This function simply wraps an async function and calls it up to 10 times, returning it's result as long as it doesn't throw aStaleElementReferenceError
. This can be useful for more complex functions like the ones in theObservabilityAlertsCommonProvider
, where element references are used in abundance and refactoring to use compound functions from thetestSubjects
service isn't easy.In order to address #140248 I moved the
retryOnStale
helper function to a service so it could be easily accessed by theObservabilityAlertsCommonProvider
. Then, I either converted the call to use the compound helper functions fromtestSubjects
or wrapped an existing method with the helper so that they will automatically retry if aStaleElementReferenceError
is thrown like in #140248.