-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes #8901
Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes #8901
Conversation
Gradle Check (Jenkins) Run Completed with:
|
@andrross I think we should instead handle timeout of 0 to fail the task immediately instead of scheduling the asynchronous task to remove it from the queue on timeout. |
@sohami I don't love the inherent non-determinism here, but is a zero timeout useful in practice here? I'm assuming not (why schedule a task that is already timed out?), so I also don't love putting in additional logic for this case. The test would also just be testing this additional logic of the zero timeout case and not testing the normal timeout path, so I don't think it helps with coverage. Does that make sense or is there a different way to handle this case? |
@sohami I think it is probably better to deterministically test the timeout case in a separate test case. I'm thinking something like:
What do you think? |
I don't foresee any use case with 0 timeout but seems like we are allowing that to be set anyways. If we want to allow it than handling it separately would avoid adding extra timeout tasks for which most of the time behavior will be indeterministic. The other option is if there is no use case for timeout of 0 should we fail the task setting it to that value instead ? This will also help to catch issues probably where it is set to 0 incorrectly. |
Gradle Check (Jenkins) Run Completed with:
|
Sounds good to me. thanks! |
c6a8457
to
951ff07
Compare
The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
951ff07
to
8c5f7e0
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #8901 +/- ##
============================================
- Coverage 71.05% 70.99% -0.06%
Complexity 57198 57198
============================================
Files 4758 4758
Lines 269863 269863
Branches 39484 39484
============================================
- Hits 191755 191596 -159
- Misses 61982 62128 +146
- Partials 16126 16139 +13 |
#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com> (cherry picked from commit e2a664c) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-project#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com>
#8901) (#8934) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 * Add a deterministic test case for timeout --------- (cherry picked from commit e2a664c) Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
After the fix in opensearch-project#8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com>
opensearch-project#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com>
After the fix in #8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com>
After the fix in #8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com> (cherry picked from commit 8b751f8) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
After the fix in #8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` (cherry picked from commit 8b751f8) Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-project#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
After the fix in opensearch-project#8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
opensearch-project#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
After the fix in opensearch-project#8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
opensearch-project#8901) * Fix flakiness in MasterServiceTests.testThrottlingForMultipleTaskTypes The test configured a [timeout duration of zero][1] for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the [asynchronous timeout operation runs][2], which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case. [1]: https://github.com/opensearch-project/OpenSearch/blob/60985bc300d9eafd36c1ab25d46235e1c925c565/server/src/test/java/org/opensearch/cluster/service/MasterServiceTests.java#L941 [2]: https://github.com/opensearch-project/OpenSearch/blob/9fc3f4096958159ec9b53012fc7ced19fd793e1b/server/src/main/java/org/opensearch/common/util/concurrent/PrioritizedOpenSearchThreadPoolExecutor.java#L266 Signed-off-by: Andrew Ross <andrross@amazon.com> * Add a deterministic test case for timeout Signed-off-by: Andrew Ross <andrross@amazon.com> --------- Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
After the fix in opensearch-project#8901, I ran the following in a loop all day (about 1400 times) and saw no failures: ``` ./gradlew ':server:test' --tests "org.opensearch.cluster.service.MasterServiceTests" ``` Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
The test configured a timeout duration of zero for certain tasks and asserted that all tasks were throttled or timed out. This is not a valid assertion because it is possible for a task to complete before the asynchronous timeout operation runs, which means the task would complete successfully. The fix is to adjust the assertion to allow for successful tasks in this case.
Related Issues
Resolves #5958
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.