[BUG] Panic on reconnect #192
Comments
I can't advise an easy workaround. At first blush it looks like a bug in the req code, which is calling Dup() on a nil message. We could have Dup() check for nil, which might be one approach, but it isn't a real fix. I'll try to get to this soon, but it might be a day or two, as I'm fairly buried in work on other projects for $dayjob.
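To illustrate the "have Dup() check for nil" idea mentioned above, here is a minimal sketch. The `Message` type below is a simplified, hypothetical stand-in and not the actual mangos type; the real Dup() may work differently (for example by reference counting). As noted in the comment, a guard like this only avoids the panic and does not explain why the message is nil in the first place.

```go
package main

import "fmt"

// Message is a simplified, hypothetical stand-in for a mangos message.
// It is NOT the library's real type; it exists only for this sketch.
type Message struct {
	Header []byte
	Body   []byte
}

// Dup returns a deep copy of m. Go permits calling a pointer-receiver
// method on a nil pointer, so the guard turns what would be a nil
// dereference panic into a nil result.
func (m *Message) Dup() *Message {
	if m == nil {
		return nil
	}
	return &Message{
		Header: append([]byte(nil), m.Header...),
		Body:   append([]byte(nil), m.Body...),
	}
}

func main() {
	// A nil pending request no longer panics when duplicated.
	var pending *Message
	fmt.Println(pending.Dup() == nil)

	// A non-nil message is copied, so changing the copy leaves the
	// original untouched.
	msg := &Message{Body: []byte("job")}
	cp := msg.Dup()
	cp.Body[0] = 'J'
	fmt.Println(string(msg.Body), string(cp.Body))
}
```

The caller still has to decide what a nil result means (for instance, skipping a resend), which is why this is a stopgap rather than a real fix.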
OK @gdamore, let me test the fix when it's done. And thank you for your fast response!
I see you are using 1.3.0. However, I've made changes since then which may address this. Stay tuned.
Yeah, I don't think that would have fixed it. I'm still trying to determine how the circumstance arises that the reqMsg is nil at this time.
Please have a look at branch ged/bug192. I'm not sure if it will solve it for you, but I think it might. I'd at least like to know if it doesn't.
Sorry @gdamore, something went wrong when I tried to get this specific branch version.
How can I add this branch version to my Go modules?
When trying to get it from GitHub, I get a similar error.
With the commit ID it seems to work.
I will rebuild and test with this version and give you feedback.
Sorry, the patch does not seem to be the solution. The master panics again when I press Ctrl+C on the agent that is connected to the master and waiting for new jobs.
Is there any other test we can do?
Hello @gdamore, after working a lot with this panic, we could see that it only happens when the master has sent a message and is waiting for a response from the agents, and one of them is killed (Ctrl+C). If no message is awaiting a response, the panic doesn't happen. The agent which is provoking the panic could not execute the
Ok, thanks. I should have time this weekend to work it out.
I think I've pushed an update to that branch that should fix it. Please test again (commit c4b7a01) and let me know.
Hello @gdamore. Good news! It seems to be working fine with one agent running and restarting. Let me keep this open for a few days so I can test with more than one agent at once. Thank you very much!
I'm going to go ahead and merge this now.
Hello, and first of all thank you for this great project.
I'm working with mangos using the Req (master)/Rep (agents) protocol, with 1 master listening for connections and distributing jobs, and N agents connecting to the master and doing distributed work.
I did a PoC, as I commented in #189, and in the PoC everything worked fine.
But with real data we get panics in the master after agents have been restarted and try to reconnect.
Is there any workaround to avoid this error?