fix #500 to avoid potential hang and event loss #501
Merged
- IfUsed(): lets the worker check whether it is in use.
- Get(): called by the routine that generates events, before it uses the worker.
- Put(): called by that routine after it has finished using the worker.

Get() and Put() should not be called concurrently (there is only one producer), so they panic if unexpected behavior happens.

The reasoning is as follows.
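The contract described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the type name usageFlag, the use of atomic.Bool, and the panic messages are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// usageFlag is a hypothetical stand-in for the worker's "in use" marker.
// It is not the type used in the PR.
type usageFlag struct {
	used atomic.Bool // requires Go 1.19 or later
}

// Get marks the worker as in use. With a single producer the flag must be
// clear here, so a failed compare-and-swap indicates misuse and panics.
func (f *usageFlag) Get() {
	if !f.used.CompareAndSwap(false, true) {
		panic("Get called while the worker is already in use")
	}
}

// Put marks the worker as no longer in use. It panics if the worker was
// not marked as in use.
func (f *usageFlag) Put() {
	if !f.used.CompareAndSwap(true, false) {
		panic("Put called while the worker is not in use")
	}
}

// IfUsed reports whether a producer currently holds the worker.
func (f *usageFlag) IfUsed() bool {
	return f.used.Load()
}

func main() {
	var f usageFlag
	fmt.Println(f.IfUsed()) // false
	f.Get()
	fmt.Println(f.IfUsed()) // true
	f.Put()
	fmt.Println(f.IfUsed()) // false
}
```

Panicking on a failed compare-and-swap turns a violated single-producer assumption into an immediate, visible failure instead of a silent hang.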
When ew.Close() returns, there are two possibilities:
happen before ew.Close()
When no routine can touch it (i.e., ew.IfUsed == false),
we just drain ew.incoming and return.
When one routine can touch it (i.e., ew.IfUsed == true), we ensure
that we only return after the routine can no longer touch it
(i.e., ew.IfUsed == false). At that point, no other routine will
touch it or send events through ew.incoming, so we return.
Because the eworker has been deleted from the workqueue after ew.Close()
(ordered by a workqueue lock), at this point we can ensure that
the ew will not be touched again, even in the future. So the return is
safe.
Finally, to verify it, in ruitianzhong@e5bce57, run
go test -v ./pkg/event_processor/ -run TestHang
and no hang happens. Because the PoC injects a delay to reliably reproduce the hang, I did not add it as a unit test in the bug-fix branch.