-
Notifications
You must be signed in to change notification settings - Fork 893
Address a race condition in libevent select. #6784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is not really a fix for the race condition because I could not figure out how it happen, but it does address the problem generated by the race. If we do not remove a bad fd from the select list we keep getting the same error from select, and we stop doing any progress on the communication side. Thus, we forcefully disable all bad fd as soon as select fails, and we are back in track, progress ensure and everything seems to work as expected (no leftover events in the event base). Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Would you please push this upstream? It affects a lot of us who use libevent but don't embed it. |
Any reason this should not be merged? |
This looks like a good change. Not merged with upstream libevent though - but no complaints over it. Adding it to the 5.0 milestone for tracking. |
5.0 is slated to branch on April 30th. Since this is a bug fix, it doesn't need to get in by then, but the sooner the better. Thanks! |
I ping'd upstream libevent on it - if they sign off I think we should pull it in. |
Based on the comments in libevent/libevent#911 this should be good to go - they just have a minor nit on syntax and want a unit test before they approve. |
@gpaulsen do you agree this is good to go? |
I forgot we don't patch other components anymore - my bad. Revert here: |
In some rare instances (race), when an fd is closed, the libevent will enter an infinite loop.
If we do not remove a bad fd from the select list we keep getting the same error from select, and we stop doing any progress on the communication side. Thus, we forcefully disable all bad fd as soon
as select fails, and we are back in track, progress ensure and everything seems to work as expected (no leftover events in the event base).
Signed-off-by: George Bosilca bosilca@icl.utk.edu