Recover from Service-Factory deadlock #87
I am unclear how waiting will solve the deadlock. The previous fix can detect the deadlock and then throw an exception to alert that there was a deadlock (versus hanging). But just catching the deadlock exception and then blocking the thread for some random time before retrying does not seem to be able to "solve" the deadlock. If A->B->C->A is the deadlock, then no amount of time waiting for A will fix anything.
IIRC the idea is this: Thread 1 starts at C with C->A->B->C and Thread 2 starts at A with A->B->C->A. If we detect a deadlock and are lucky enough that Thread 1 is at C or Thread 2 is at A (their starting points), then we force that thread to wait without holding the lock on its starting point, which allows the other thread to continue. But it seems like a narrow gain that can still allow a deadlock.
This can't work since neither C nor A can be available to another thread until it is completely constructed (that is, the service factory returns an object), and neither can be completely constructed because of the cycle.
That is the intention, yes. Of course it is not a perfect solution that works for all cases, but it works for cases like the test case, where the cycle contains optional links. A cycle of required links can never be resolved, but IIRC Felix warns/errors if one models that. For optional links this is a possible way, as you can see in the adjusted test cases.
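To make the required-versus-optional distinction concrete, here is a minimal sketch that checks whether a reference graph contains a cycle made only of required links. The class `RequiredCycleCheck` and the map-based graph representation are hypothetical and only for illustration; this is not how Felix or Equinox models references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration: finds cycles that consist of required references only. */
final class RequiredCycleCheck {

    /** Returns true if following only required references leads back to a node already on the path. */
    static boolean hasRequiredCycle(Map<String, List<String>> requiredRefs) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : new ArrayList<>(requiredRefs.keySet())) {
            if (visit(node, requiredRefs, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> refs,
            Set<String> visited, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true; // reached a node that is still being resolved: cycle
        }
        if (!visited.add(node)) {
            return false; // fully explored earlier, no cycle through this node
        }
        onPath.add(node);
        for (String next : refs.getOrDefault(node, Collections.emptyList())) {
            if (visit(next, refs, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> allRequired = new HashMap<>();
        allRequired.put("A", Arrays.asList("B"));
        allRequired.put("B", Arrays.asList("C"));
        allRequired.put("C", Arrays.asList("A"));
        System.out.println(hasRequiredCycle(allRequired)); // prints "true"

        // Same components, but C -> A is optional and therefore not part of the required graph.
        Map<String, List<String>> withOptionalLink = new HashMap<>(allRequired);
        withOptionalLink.put("C", Collections.<String>emptyList());
        System.out.println(hasRequiredCycle(withOptionalLink)); // prints "false"
    }
}
```

Only the second graph leaves room for a retry to succeed, because one component can be constructed without waiting for the rest of the cycle.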
This assumes that factories are stateless. Signed-off-by: Hannes Wellmann <wellmann.hannes1@gmx.net>
Do you have further remarks on this one?
@HannesWell as we have not seen any deadlock problems in the days passed, and it seems there is no full agreement on how to handle this, maybe just postpone this?
Yes, I'll close this for now but will leave the branch intact, just in case this becomes relevant again and somebody remembers this.
Follow-up of #68, with a first attempt to implement a recovery strategy that resolves a deadlock encountered during the use of service-factories.
If a deadlock between Service-Factories is encountered, this approach simply checks whether the current service-factory invocation is the first one of this thread and, if so, makes a new attempt to call the factory.
To make a new deadlock less likely, the thread sleeps for a random period that grows with each further attempt, hoping that the other threads involved in the deadlock sleep for a sufficiently different time so that the race condition does not occur again.
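The retry idea can be sketched roughly as follows. This is a simplified illustration and not the code of this PR: `BackoffRetry`, `DeadlockDetectedException`, `maxDelayMillis` and the linear growth of the delay bound are all assumptions made for the example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Equinox: retries a call after a random, growing pause. */
final class BackoffRetry {

    /** Hypothetical stand-in for the exception that signals a detected deadlock. */
    static final class DeadlockDetectedException extends RuntimeException {
        DeadlockDetectedException(String message) {
            super(message);
        }
    }

    /** Largest pause allowed before retry number {@code attempt} (1-based); grows linearly (an assumption). */
    static long maxDelayMillis(int attempt, long baseMillis) {
        return baseMillis * attempt;
    }

    /** Calls the factory and retries after a random pause whenever a deadlock is reported. */
    static <T> T callWithRetry(Supplier<T> factoryCall, int maxAttempts, long baseMillis) {
        for (int attempt = 1;; attempt++) {
            try {
                return factoryCall.get();
            } catch (DeadlockDetectedException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and report the deadlock as before
                }
                // Random pause in [1, maxDelay]; a different pause per thread makes a repeat of the race less likely.
                long delay = ThreadLocalRandom.current().nextLong(1, maxDelayMillis(attempt, baseMillis) + 1);
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Supplier<String> flakyFactory = () -> {
            if (++calls[0] < 3) {
                throw new DeadlockDetectedException("simulated deadlock");
            }
            return "service";
        };
        String service = callWithRetry(flakyFactory, 5, 10);
        System.out.println(service + " after " + calls[0] + " attempts"); // prints "service after 3 attempts"
    }
}
```

Note that the sketch calls the factory again from scratch on every attempt, which is only safe if the factory keeps no state from the failed call.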
This PR is a first draft with a relatively naive approach, but at least it worked in the test-cases added for #68:
It currently uses StackWalker, but this can be reworked to use means available in Java-8.
Alternatively, creators of Service-factories could be adjusted to handle encountered deadlocks, but this would likely require changes outside of Equinox and maybe some spec-work in OSGi.
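One Java-8-compatible way to find out whether a factory invocation is the first one on the current thread is a per-thread depth counter. The following is only a sketch of that idea; `FactoryCallDepth`, `track` and `isOutermostCall` are hypothetical names and not existing Equinox API.

```java
import java.util.function.Supplier;

/** Hypothetical Java 8 alternative to walking the stack: counts nested factory calls per thread. */
final class FactoryCallDepth {

    // One counter per thread; an int[] avoids boxing on every increment.
    private static final ThreadLocal<int[]> DEPTH = ThreadLocal.withInitial(() -> new int[1]);

    /** True while exactly one tracked factory call is active on the current thread. */
    static boolean isOutermostCall() {
        return DEPTH.get()[0] == 1;
    }

    /** Runs the factory call and keeps the per-thread nesting depth up to date. */
    static <T> T track(Supplier<T> factoryCall) {
        int[] depth = DEPTH.get();
        depth[0]++;
        try {
            return factoryCall.get();
        } finally {
            depth[0]--;
        }
    }

    public static void main(String[] args) {
        Supplier<Boolean> probe = FactoryCallDepth::isOutermostCall;
        Supplier<Boolean> nestedProbe = () -> track(probe);

        boolean outer = track(probe);       // the probe runs at depth 1
        boolean inner = track(nestedProbe); // the probe runs at depth 2
        System.out.println(outer + " " + inner); // prints "true false"
    }
}
```

A retry would then only be attempted when `isOutermostCall()` holds at the point where the deadlock is detected, so nested invocations propagate the failure to the outermost one.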
I'm on vacation for the next two weeks, but wanted to share this current state/the basic idea already.