
85 - Exception instead of actor stop on cluster failure? #96

Closed
michielboekhoff wants to merge 2 commits into EventStore:master from michielboekhoff:exception-retries

Conversation

@michielboekhoff
Contributor

commented Jun 19, 2017

Throwing exceptions after maxRetries connection attempts.

This should fix #85.
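
For illustration, the shape of the change is roughly the following (a sketch only, not the actual diff in this PR; the actor, message and exception names here are made up):

import akka.actor.Actor

// Sketch with hypothetical names: once the number of failed connection
// attempts exceeds the configured maximum, throw instead of silently
// stopping the actor, so supervision and logs can see why it died.
class MaxRetriesExceededException(message: String) extends RuntimeException(message)

class ExampleConnectionActor(maxRetries: Int) extends Actor {
  private var attempts = 0

  def receive: Receive = {
    case "connection-failed" =>
      attempts += 1
      if (attempts > maxRetries)
        throw new MaxRetriesExceededException(s"could not connect after $maxRetries attempts")
      // otherwise a real implementation would schedule another connection attempt here
  }
}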

@ahjohannessen
Contributor

By default the connection actor is restarted, right?

@michielboekhoff
Contributor Author

@ahjohannessen By default, the connection actor is stopped after max retries.

@ahjohannessen
Contributor

I see, and if you set a config value to -1 it keeps trying indefinitely?

@michielboekhoff
Contributor Author

@ahjohannessen Not too sure about that right now. I can check, but it'd have to be a bit later.

@evan108108

🤘 This would make life much easier!

@michielboekhoff
Contributor Author

I'll fix the tests momentarily - I'm currently not able to run ES.

@michielboekhoff
Contributor Author

Closing really quickly to trigger another build - not able to replicate any of these errors locally.

@michielboekhoff
Contributor Author

@t3hnar Any idea what could be going on here? Locally everything is passing and I've got no local changes.

@t3hnar
Contributor

commented Jun 23, 2017

@michielboekhoff are you sure it does pass locally?

@michielboekhoff
Contributor Author

@t3hnar I'm reasonably certain, based on the results of sbt clean test && git branch.

@michielboekhoff
Contributor Author

Finally managed to replicate one of the failing tests. It's platform-related: I was initially running the tests under Windows, and they error on Ubuntu Precise but not on Windows.

@michielboekhoff
Contributor Author

... In fact, now it's failing on master. @t3hnar, any suggestions?

@ahjohannessen
Contributor

@michielboekhoff Perhaps it is because of a slow server. It might help to set this in a reference.conf under test resources:

akka.test.timefactor = 1.0
akka.test.timefactor = ${?AKKA_TEST_TIMEFACTOR}

and then set an environment variable, e.g. AKKA_TEST_TIMEFACTOR=2.0, in .travis.yml.
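
For example, something like this in .travis.yml (a sketch; adjust the factor as needed):

# give Akka test timeouts more headroom on the CI machines
env:
  global:
    - AKKA_TEST_TIMEFACTOR=2.0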

@michielboekhoff
Contributor Author

@ahjohannessen I'm starting to suspect this as well. I'll try that.

@coveralls

Coverage Status

Coverage decreased (-0.06%) to 81.746% when pulling d228e8e on michielboekhoff:exception-retries into 1d9aff9 on EventStore:master.

@michielboekhoff
Contributor Author

Right, so now the tests partially pass and partially don't. I'm really confused.

@ahjohannessen
Contributor

commented Apr 12, 2018

@michielboekhoff

Not sure I see the point in throwing an exception. If the purpose of restarting the connection actor is to keep subscribers going, then this does not solve that in the current code: you get a fresh connection actor but still have to deal with all the stopped subscribers, because most of the code that relies on the connection actor uses death watch, e.g. the stage actor in SourceStageLogic:

case (_, Terminated(`connection`)) => completeStage()

and PersistentSubscriptionActor:

case Event(Terminated(_), _) => stop()

Setting the retry config value to -1 keeps the connection actor from shutting down and makes it retry indefinitely, if that is what you want.
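
For example (a sketch; this assumes the setting the library reads is eventstore.max-reconnections, with -1 meaning retry forever):

# keep the connection actor retrying indefinitely instead of stopping
eventstore {
  max-reconnections = -1
}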

@michielboekhoff
Contributor Author

@ahjohannessen I have to be honest, it's been a while since I had a look at this.

The linked issue #85 argues that the current behaviour is not in line with Akka's philosophy. I believe the point is also for the end user to get more information about the nature of the exception that occurred. FWIW, this should already be covered by the library anyway, so it's just a matter of taste I suppose. If we don't want to support this behaviour, that's fine.

@ahjohannessen
Contributor

I believe the point is also for the end user to get more information as to the nature of the exception that occurred

Yep, I understand that.

@michielboekhoff
Contributor Author

Due to the lack of activity on this, I'm gonna close this PR. Feel free to reopen if it's of any interest.
