I'm not sure about this... Please give more information about it! We should probably revert the changes.
I just added a check so that we only stop the environments if the session_pid is defined. If it is not defined, the session was already closed and no environments should be stopped. The problem occurred while the session was stopping: it tried to count the children of a bad session_pid, which caused an error and crashed the server. I don't think this should be reverted, as it only adds a simple check; the session is still properly stopped, and so are the environments.
Ok, I understand, but this PR could create some issues... If the session_pid is not a pid, it means there is an error much earlier in the execution flow. This is why the critical error says that there is an issue that should not happen. I'll try to explain why I think this should be reverted.

# Here we take the Session.DynamicSupervisor pid.
# This is the supervisor that handles every session for this environment.
session_pid = Swarm.whereis_name(Session.DynamicSupervisor.get_name(env_id))
# You check whether session_pid is actually a pid.
# If session_pid is NOT a pid, we have another issue, because this should NEVER be the case.
# If the pid is nil, for example, it means the dynamic supervisor crashed for whatever reason OR Swarm cannot find the pid in the registry.
# Since the supervisor does not handle any business logic, it should NEVER crash on its own.
# And if it IS crashing for whatever reason, the environment supervisor should restart it on the fly anyway.
# So this fix is like putting a rubber band on a leak instead of properly fixing the leak.
if is_pid(session_pid) do
  ...
end

I can think of 2 possible issues:

This needs some more investigation, but this fix is not the proper one IMO.
You're right, this still needs some investigation, and we still need to add some sort of error handling to stop the server from crashing. I don't think this should be reverted, because reverting it would cause more problems than keeping it.
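To make the "some sort of error handling" idea concrete, here is a minimal sketch, not the code from this PR. The module name `SessionStopSketch`, the function `stop_session/2`, the `lookup` argument, and the `:session_supervisor_not_found` atom are all hypothetical; `lookup` stands in for the `Swarm.whereis_name(Session.DynamicSupervisor.get_name(env_id))` call so the example runs without Swarm. The point is that the non-pid case is logged and returned as an explicit error instead of being skipped silently, so the server does not crash but the underlying problem stays visible.

```elixir
defmodule SessionStopSketch do
  require Logger

  # `lookup` is a hypothetical stand-in for the registry lookup done in the PR.
  # It is expected to return a pid, or some non-pid value (for example
  # `:undefined` or `nil`) when the supervisor cannot be found.
  def stop_session(env_id, lookup) when is_function(lookup, 1) do
    case lookup.(env_id) do
      pid when is_pid(pid) ->
        # Normal path: the supervisor exists, so counting its children is safe.
        %{active: active} = DynamicSupervisor.count_children(pid)
        {:ok, active}

      other ->
        # Abnormal path: do not crash, but do not hide the problem either.
        Logger.error(
          "No session supervisor for env #{inspect(env_id)}, lookup returned #{inspect(other)}"
        )

        {:error, :session_supervisor_not_found}
    end
  end
end
```

The caller can then decide how to answer the request (for instance with an error response rather than a 500 crash), while the log entry keeps the root cause available for investigation.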
I'll try to be as clear as possible; we can discuss it if I'm not :) The `if` was added only to avoid a 500 error when we want to stop the

For the first case:

For the second:
Sure, but the real solution is to find out why dynamic supervisors crash and don't try to restart the supervisor. It's not worth spending time on a simple `if`...
Ok, but it's not about this `if`; it's about fixing the real issue instead of hiding the error.
Closes #
Description of the changes
Checklist
I included unit tests that cover my changes
I added/updated the documentation about my changes
Technical highlight/advice