Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ftr/obs/alerts] refactor to avoid stale-element errors #140427

Merged

Conversation

spalger
Copy link
Contributor

@spalger spalger commented Sep 9, 2022

Fixes #140248

Stale element exceptions are caused by fetching a reference to a browser element and then interacting with it some time later, after that element no longer exists in the DOM.

This can happen with code like this:

// fetch a reference to the btn
const btn = await testSubjects.find('...')
// click the button via reference, if the element was removed from the DOM between these two calls this will throw a `StaleElementReferenceError`
await btn.click();

To avoid this the testSubjects service exposes several methods that allow operating on elements in common ways without capturing a reference to the element itself, and if a StaleElementReferenceError occurs internally the relevant operation can be retried automatically. To make functional testing code resilient to these StaleElementReferenceErrors we should instead write the above code like this:

// let the testSubjects service make sure the click is delivered and retry on `StaleElementReferenceError`
await testSubjects.click('...')

Additionally, there is a retryOnStale() helper function available. This function simply wraps an async function and calls it up to 10 times, returning it's result as long as it doesn't throw a StaleElementReferenceError. This can be useful for more complex functions like the ones in the ObservabilityAlertsCommonProvider, where element references are used in abundance and refactoring to use compound functions from the testSubjects service isn't easy.

In order to address #140248 I moved the retryOnStale helper function to a service so it could be easily accessed by the ObservabilityAlertsCommonProvider. Then, I either converted the call to use the compound helper functions from testSubjects or wrapped an existing method with the helper so that they will automatically retry if a StaleElementReferenceError is thrown like in #140248.

@spalger spalger force-pushed the implement/obs-alerts-common-avoid-stale-el branch from bfa3c31 to 6de4cb7 Compare September 9, 2022 15:42
@spalger spalger added Team:Operations Team label for Operations Team release_note:skip Skip the PR/issue when compiling release notes labels Sep 9, 2022
@spalger spalger marked this pull request as ready for review September 9, 2022 17:34
@spalger spalger requested a review from a team as a code owner September 9, 2022 17:34
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

Copy link
Member

@jasonrhodes jasonrhodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM from the observability changes side. @spenceralger can you point us to the best place to go to better understand retryOnStale.wrap and the testSubjects.* methods as safer alternatives to something like return await (await getQuerySubmitButton()).click(); ?

@spalger
Copy link
Contributor Author

spalger commented Sep 9, 2022

@jasonrhodes Honestly, there isn't a great place to learn about the testSubjects service, and the retryOnStale.wrap() function is something I wrote recently and is until this point undocumented. I'll add a comprehensive functional testing guide to our documentation matrix so that we won't forget to get to it eventually, but at this point the best docs we have are at https://www.elastic.co/guide/en/kibana/current/development-tests.html#development-functional-tests and then of course there's always the source code and the types.

(BTW, I'm @spalger)

@jasonrhodes
Copy link
Member

@spalger ok sounds good, we'll look there. And I know your GH tag, no idea why my fingers just decided to write spencer 🤷

@spalger spalger requested review from a team as code owners September 9, 2022 22:29
@spalger
Copy link
Contributor Author

spalger commented Sep 10, 2022

@maryam-saeidi
Copy link
Member

@spalger Interesting finding, I like to help you with documenting best practices for functional tests and potentially check the observability functional tests to make sure they match the best practices. Let me know how I can help.

@cnasikas
Copy link
Member

@elasticmachine merge upstream

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Member

@dmlemeshko dmlemeshko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dmlemeshko
Copy link
Member

dmlemeshko commented Sep 12, 2022

Just a note: unfortunately testSubjects.click() still won't keep it safe of StaleElementExcetion in 100%, though it guarantees the 'new' reference to element is used for actions like click/type/etc. It is important to have a mechanism to properly wait for page loading before the findElement call, so that element won't be updated in DOM when it is least expected. I think we should explain it in the docs.

@spalger spalger merged commit d42e0e1 into elastic:main Sep 12, 2022
@spalger spalger deleted the implement/obs-alerts-common-avoid-stale-el branch September 12, 2022 13:04
@kibanamachine kibanamachine added v8.5.0 backport:skip This commit does not require backporting labels Sep 12, 2022
gergoabraham added a commit that referenced this pull request Oct 24, 2024
## Summary

closes #173184 
closes #173441
closes #196003

This PR tries to improve on the `StaleElementReferenceError` happening
in Endpoint Exception tests.

This error is thrown if an element has already been removed from the DOM
when trying to perform an action on it. For some reference, see
#140427

Improvements:
- the part that was failing is wrapped inside the `retryOnStale` helper:
602f229
**note:** actually the test fails have started in December, 2023, but
the line where the fail was in the last test runs were added in May,
2024 (#183471). unfortunately, the
log artifacts from 2023 are already removed from Buildkite, so no
certainty on what happened back then
- another suspicious part was wrapped as well:
ec8c5cf and
e5245ad
- and as an extra, wait for page load:
7cd867f

flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite
rarely ¯\\(◉‿◉)/¯


### Checklist

Delete any items that are not applicable to this PR.

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 24, 2024
…#197457)

## Summary

closes elastic#173184
closes elastic#173441
closes elastic#196003

This PR tries to improve on the `StaleElementReferenceError` happening
in Endpoint Exception tests.

This error is thrown if an element has already been removed from the DOM
when trying to perform an action on it. For some reference, see
elastic#140427

Improvements:
- the part that was failing is wrapped inside the `retryOnStale` helper:
602f229
**note:** actually the test fails have started in December, 2023, but
the line where the fail was in the last test runs were added in May,
2024 (elastic#183471). unfortunately, the
log artifacts from 2023 are already removed from Buildkite, so no
certainty on what happened back then
- another suspicious part was wrapped as well:
ec8c5cf and
e5245ad
- and as an extra, wait for page load:
7cd867f

flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite
rarely ¯\\(◉‿◉)/¯

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit f151e2c)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 24, 2024
…#197457)

## Summary

closes elastic#173184
closes elastic#173441
closes elastic#196003

This PR tries to improve on the `StaleElementReferenceError` happening
in Endpoint Exception tests.

This error is thrown if an element has already been removed from the DOM
when trying to perform an action on it. For some reference, see
elastic#140427

Improvements:
- the part that was failing is wrapped inside the `retryOnStale` helper:
602f229
**note:** actually the test fails have started in December, 2023, but
the line where the fail was in the last test runs were added in May,
2024 (elastic#183471). unfortunately, the
log artifacts from 2023 are already removed from Buildkite, so no
certainty on what happened back then
- another suspicious part was wrapped as well:
ec8c5cf and
e5245ad
- and as an extra, wait for page load:
7cd867f

flaky 50/50 ✅ but this doesn't mean much, as this issue happens quite
rarely ¯\\(◉‿◉)/¯

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit f151e2c)
kibanamachine added a commit that referenced this pull request Oct 24, 2024
…197457) (#197662)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR
(#197457)](#197457)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Gergő
Ábrahám","email":"gergo.abraham@elastic.co"},"sourceCommit":{"committedDate":"2024-10-24T14:27:08Z","message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Defend
Workflows","backport:prev-major"],"title":"[EDR Workflows] Unskip and
fix flaky endpoint exceptions
FTR","number":197457,"url":"https://github.com/elastic/kibana/pull/197457","mergeCommit":{"message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/197457","number":197457,"mergeCommit":{"message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}}]}]
BACKPORT-->

Co-authored-by: Gergő Ábrahám <gergo.abraham@elastic.co>
kibanamachine added a commit that referenced this pull request Oct 24, 2024
…197457) (#197661)

# Backport

This will backport the following commits from `main` to `8.16`:
- [[EDR Workflows] Unskip and fix flaky endpoint exceptions FTR
(#197457)](#197457)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Gergő
Ábrahám","email":"gergo.abraham@elastic.co"},"sourceCommit":{"committedDate":"2024-10-24T14:27:08Z","message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:Defend
Workflows","backport:prev-major"],"title":"[EDR Workflows] Unskip and
fix flaky endpoint exceptions
FTR","number":197457,"url":"https://github.com/elastic/kibana/pull/197457","mergeCommit":{"message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/197457","number":197457,"mergeCommit":{"message":"[EDR
Workflows] Unskip and fix flaky endpoint exceptions FTR (#197457)\n\n##
Summary\r\n\r\ncloses #173184 \r\ncloses #173441\r\ncloses
#196003\r\n\r\nThis PR tries to improve on the
`StaleElementReferenceError` happening\r\nin Endpoint Exception
tests.\r\n\r\nThis error is thrown if an element has already been
removed from the DOM\r\nwhen trying to perform an action on it. For some
reference,
see\r\nhttps://github.com//pull/140427\r\n\r\nImprovements:\r\n-
the part that was failing is wrapped inside the `retryOnStale`
helper:\r\n602f2294fddb9bee8b69ebf2fd8382e9f025d59d\r\n**note:**
actually the test fails have started in December, 2023, but\r\nthe line
where the fail was in the last test runs were added in May,\r\n2024
(#183471). unfortunately,
the\r\nlog artifacts from 2023 are already removed from Buildkite, so
no\r\ncertainty on what happened back then\r\n- another suspicious part
was wrapped as well:\r\nec8c5cfd94812c8e5b357e00aac8bfae93ceecf4
and\r\ne5245ad010a02527105a56973465a25feb52ec85\r\n- and as an extra,
wait for page
load:\r\n7cd867fcb9489b24e79066dce750a2381af93d7d\r\n\r\nflaky 50/50 ✅
but this doesn't mean much, as this issue happens quite\r\nrarely
¯\\\\(◉‿◉)/¯\r\n\r\n\r\n### Checklist\r\n\r\nDelete any items that are
not applicable to this PR.\r\n\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<elasticmachine@users.noreply.github.com>","sha":"f151e2ccaa55cc5e13740f49e88c323c0e1d8f6d"}}]}]
BACKPORT-->

Co-authored-by: Gergő Ábrahám <gergo.abraham@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Operations Team label for Operations Team v8.5.0
Projects
None yet
9 participants