-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix QueueResizableOpenSearchThreadPoolExecutorTests #18006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix QueueResizableOpenSearchThreadPoolExecutorTests #18006
Conversation
|
❌ Gradle check result for 7e471e1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for a6990d9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #18006 +/- ##
============================================
- Coverage 72.52% 72.50% -0.02%
+ Complexity 67163 67138 -25
============================================
Files 5473 5473
Lines 310092 310094 +2
Branches 45060 45061 +1
============================================
- Hits 224899 224840 -59
- Misses 66814 66891 +77
+ Partials 18379 18363 -16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ashking94
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test.
I am unable to see on what fixed this flakiness in the test testResizeQueueDown?
...a/org/opensearch/common/util/concurrent/QueueResizableOpenSearchThreadPoolExecutorTests.java
Outdated
Show resolved
Hide resolved
...a/org/opensearch/common/util/concurrent/QueueResizableOpenSearchThreadPoolExecutorTests.java
Show resolved
Hide resolved
Instead of resizing down to 900, it resizes down to 1500 which guarantees that the executor has enough capacity to not reject anything if all tasks are submitted before any are able to be executed. |
a6990d9 to
210949d
Compare
|
❌ Gradle check result for 210949d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test. The fix is to resize down to a queue size of 1500 to ensure there is enough capacity even if all tasks are submitted before any can be executed. And finally I refactored the tests to reduce duplication of code and ensure the executor gets shutdown properly even in case of a test failure. This will avoid the spurious thread leak failure if a test case exits because of a failure. Signed-off-by: Andrew Ross <andrross@amazon.com>
210949d to
7ba1155
Compare
|
❌ Gradle check result for 7ba1155: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 7ba1155: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 7ba1155: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
@ashking94 Can you take another look at this? Thanks! |
Addressed concerns. No response after a few weeks.
…ct#18006) There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test. The fix is to resize down to a queue size of 1500 to ensure there is enough capacity even if all tasks are submitted before any can be executed. And finally I refactored the tests to reduce duplication of code and ensure the executor gets shutdown properly even in case of a test failure. This will avoid the spurious thread leak failure if a test case exits because of a failure. Signed-off-by: Andrew Ross <andrross@amazon.com>
…ct#18006) There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test. The fix is to resize down to a queue size of 1500 to ensure there is enough capacity even if all tasks are submitted before any can be executed. And finally I refactored the tests to reduce duplication of code and ensure the executor gets shutdown properly even in case of a test failure. This will avoid the spurious thread leak failure if a test case exits because of a failure. Signed-off-by: Andrew Ross <andrross@amazon.com>Signed-off-by: TJ Neuenfeldt <tjneu@amazon.com>
…ct#18006) There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test. The fix is to resize down to a queue size of 1500 to ensure there is enough capacity even if all tasks are submitted before any can be executed. And finally I refactored the tests to reduce duplication of code and ensure the executor gets shutdown properly even in case of a test failure. This will avoid the spurious thread leak failure if a test case exits because of a failure. Signed-off-by: Andrew Ross <andrross@amazon.com>
There was a race condition in testResizeQueueDown() where depending on
random parameters we could submit up to 1002 tasks into an executor with
a queue size of 900. That introduced a race condition where if the tasks
didn't execute fast enough then a rejected execution exception could
happen and fail the test. The fix is to resize down to a queue size of
1500 to ensure there is enough capacity even if all tasks are submitted
before any can be executed.
And finally I refactored the tests to reduce duplication of code and
ensure the executor gets shutdown properly even in case of a test
failure. This will avoid the spurious thread leak failure if a test case
exits because of a failure.
Related Issues
Resolves #14297
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.