Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't drain mode: worker threads are interrupted on shutdown #559

Closed
dannpopescu opened this issue Feb 23, 2023 · 2 comments
Closed

Don't drain mode: worker threads are interrupted on shutdown #559

dannpopescu opened this issue Feb 23, 2023 · 2 comments

Comments

@dannpopescu
Copy link

Hi,

I’ve already written about a bug we’ve discovered with the DRAIN shutdown mode here: #552. After we realized the PC keeps polling messages in the DRAIN mode we chose the DONT_DRAIN mode to gracefully shutdown the PC, but it behaves unexpectedly too.

The version in use is 0.5.2.4.

Our use case is pretty standard I would say. We use only the core module and message processing involves network calls to other services. When calling them, errors are expected, so we commit the offsets in both the happy case and not-so-happy case.

The problem comes at the shutdown. In the DONT_DRAIN mode the PC will shutdown the worker threads via interrupts. This means that all in-progress messages will fail on network calls. Of course, we don’t want to commit the failed messages in this case. At the moment, to avoid committing the message’s offset, we catch all the exceptions thrown by the processing code and check if the root cause is an InterruptedException. If it’s the case then throw the exception out of the user function to the PC. This way the PC puts the message in the retry queue and doesn’t commit it. This approach works at the moment, but it’s kinda hacky I would say.

All this said, do you think it’s reasonable to change the way in which the worker thread pool is shutdown so that it doesn’t interrupt the worker threads, but instead lets the already scheduled tasks to finish gracefully? This is basically adhering to DONT_DRAIN's current description:

Stop downloading more messages, and stop procesing more messages in the queue, but finish processing messages already being processed locally.

@antonmos
Copy link
Contributor

antonmos commented Jul 25, 2023

@niamhthornbury I noticed that you closed #593 that was trying to address this issue. Could you share any plans to address this?

@rkolesnev
Copy link
Contributor

@dannpopescu, @antonmos - We just released 0.5.2.6 build of PC - this and #552 are both fixed in it.
Closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants