Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not close threadpool if termination fails #42774

Closed
wants to merge 1 commit into from

Conversation

jaymode
Copy link
Member

@jaymode jaymode commented May 31, 2019

This commit changes the code so that the threadpool is not closed
unless termination succeeds. Otherwise there can still be running tasks
that rely on resources that are closed by closing the threadpool.

Additionally, there is a test fix included for the NodeTests that
ensures the submitted task is actually running prior to closing the
node in the test.

Closes #42577

This commit changes the code so that the threadpool is not closed
unless termination succeeds. Otherwise there can still be running tasks
that rely on resources that are closed by closing the threadpool.

Additionally, there is a test fix included for the NodeTests that
ensures the submitted task is actually running prior to closing the
node in the test.

Closes elastic#42577
@jaymode jaymode added >enhancement :Core/Infra/Core Core issues without another label v8.0.0 v7.3.0 labels May 31, 2019
@jaymode jaymode requested a review from jpountz May 31, 2019 20:59
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jpountz
Copy link
Contributor

jpountz commented Jun 4, 2019

I'm a bit on the fence about having closeable objects that get never closed in some cases.

I was reviewing ContextPreservingAbstractRunnable whose comments suggest that it already has logic to work around the case when threadpools are terminated and some threads get interrupted. It does the following:

            boolean whileRunning = false;
            threadsOriginalContext = stashContext();
            try {
                creatorsContext.restore();
                whileRunning = true;
                in.doRun();
                whileRunning = false;
            } catch (IllegalStateException ex) {
                if (whileRunning || threadLocal.closed.get() == false) {
                    throw ex;
                }
                // if we hit an ISE here we have been shutting down
                // this comes from the threadcontext and barfs if
                // our threadpool has been shutting down
            }

My interpretation of this code block is that we want to ignore exceptions that signal that the thread context is already closed unless the exception is thrown by the wrapped runnable. Should we move the call to stashContext within the try block? (Please take it as a question rather than a recommendation, I don't get all implications.)

@jaymode
Copy link
Member Author

jaymode commented Jun 10, 2019

My interpretation of this code block is that we want to ignore exceptions that signal that the thread context is already closed unless the exception is thrown by the wrapped runnable

I think that is the correct interpretation. Moving stashContext makes sense but there is still an issue especially when security is in place and that is due to the fact that running tasks could still be using the thread context and that means we still have this issue.

I am not very happy with this fix to be honest either. One way to handle this could be ref counting but that introduces complexity for not much gain IMO

jaymode added a commit to jaymode/elasticsearch that referenced this pull request Jun 14, 2019
There are a few tests within NodeTests that submit items to the
threadpool and then close the node. The tests are designed to check
how running tasks are affected during node close. These tests can cause
CI failures since the submitted tasks may not be running when the node
is closed and then execute after the thread context is closed, which
triggers an unexpected exception. This change ensures the threads are
running so we avoid the unexpected exception and can test these cases.

The test of task submittal while a node is closing is also important so
an additional but muted test has been added that tests the case where a
task may be getting submitted while the node is closing and ensuring we
do not trigger anything unexpected in these cases.

Relates elastic#42774
Relates elastic#42577
jaymode added a commit that referenced this pull request Jun 14, 2019
There are a few tests within NodeTests that submit items to the
threadpool and then close the node. The tests are designed to check
how running tasks are affected during node close. These tests can cause
CI failures since the submitted tasks may not be running when the node
is closed and then execute after the thread context is closed, which
triggers an unexpected exception. This change ensures the threads are
running so we avoid the unexpected exception and can test these cases.

The test of task submittal while a node is closing is also important so
an additional but muted test has been added that tests the case where a
task may be getting submitted while the node is closing and ensuring we
do not trigger anything unexpected in these cases.

Relates #42774
Relates #42577
@jaymode
Copy link
Member Author

jaymode commented Jun 14, 2019

The test changes in this PR have been incorporated into master by #43240 and I believe that #43249 is a better solution to the problem at hand, so I am closing this PR.

@jaymode jaymode closed this Jun 14, 2019
jaymode added a commit that referenced this pull request Jun 14, 2019
There are a few tests within NodeTests that submit items to the
threadpool and then close the node. The tests are designed to check
how running tasks are affected during node close. These tests can cause
CI failures since the submitted tasks may not be running when the node
is closed and then execute after the thread context is closed, which
triggers an unexpected exception. This change ensures the threads are
running so we avoid the unexpected exception and can test these cases.

The test of task submittal while a node is closing is also important so
an additional but muted test has been added that tests the case where a
task may be getting submitted while the node is closing and ensuring we
do not trigger anything unexpected in these cases.

Relates #42774
Relates #42577
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] org.elasticsearch.node.NodeTests failures
3 participants