-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Parallelize sync_end to remove async hang mechanism #39007
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
@StefanKarpinski , @JeffBezanson - bumping. I know you guys are busy, so if you want to push this one out or just kill it, I'm okay. |
t = take!(c) | ||
if t isa Exception # Exception from monitor. Collect | ||
c_ex = CompositeException([t]) # any other exceptions and throw | ||
while isready(c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand this logic.
Don't you run the risk that the channel is empty, despite their being other tasks running which may fail at a later time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That there are running Tasks that fail at a later time is a very real possibility - once we throw the first Exception(s) in the Composite, we stop looking at the rest. But the only alternative is to wait for all Tasks to complete, which can cause the same hang we are trying to avoid in the first place. (eg if one of the running Tasks is waiting for Channel input from a dead one.) This PR doesn't get us true structured concurrency - it just guarantees an Exception is thrown to help with debugging the system.
@JeffBezanson - bumping. If you are too busy, no worries, but I have a window of about 11 days to respond to major feedback quickly. Just letting you know. |
Gents - I have some time the next few weeks to turn in this direction, if there was something you all were still looking for. |
Targets #32677, following up from #38916. This takes the parallel
Experimental.sync_end
and adds error handling (withCompositeException
) to match the existingBase.sync_end
. Adds lockup tests to boththreads_exec
andDistributed_exec
test suites and pointsExperimental.@sync
back at the newsync_end
.@StefanKarpinski , @JeffBezanson , @vtjnash - this is just submitting the results from those previous conversations. If you guys want to toss this, no issue - just wanted to give you the option. Feedback welcome.