exit() after doing an rmprocs() takes almost 60 seconds to exit. #3685

Closed
amitmurthy opened this issue Jul 11, 2013 · 11 comments

@amitmurthy
Contributor

The following:

julia> addprocs(1)
1-element Any Array:
 2

julia> rmprocs(workers())

julia> Worker 2 terminated.


julia> exit()

results in exit() taking 60 seconds to return.

Running strace -p on the julia process shows it calling epoll_wait in a loop for the full 60 seconds:

...
epoll_wait(5, {}, 1024, 100)            = 0
epoll_wait(5, {}, 1024, 0)              = 0
epoll_wait(5, {}, 1024, 101)            = 0
epoll_wait(5, {}, 1024, 0)              = 0
epoll_wait(5, {}, 1024, 101)            = 0
...
@amitmurthy
Contributor Author

Without a call to rmprocs(), exit() returns instantly.

@Keno
Member

Keno commented Jul 11, 2013

Did you terminate all the sockets correctly?

@amitmurthy
Contributor Author

No, the sockets are not being closed due to #3495

I'll leave this open till that is fixed and then retest with a stop_reading.

But shouldn't exit() just clean up all resources and exit? What is it waiting on?

@Keno
Member

Keno commented Jul 11, 2013

It's trying to cleanly shutdown everything, so you don't lose data. Probably still a bug that it takes so long though. Try using close to close the sockets before.

@amitmurthy
Contributor Author

For this particular socket, the other end has already closed the connection - the worker process has exited.

@vtjnash
Member

vtjnash commented Jul 11, 2013

This is the fault of the finalizer for a RemoteRef, which calls send_del_client, which in turn calls worker_from_id. The worker has already been deleted, so the lookup sleeps for 60 seconds before the process can exit:

function worker_from_id(pg::ProcessGroup, i)
    # Processes with pids > ours have to connect to us. May not have happened. Wait for some time.
    start = time()
    while (!haskey(map_pid_wrkr, i) && ((time() - start) < 60.0))
        sleep(0.1)
        yield()
    end
    map_pid_wrkr[i]
end
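
To illustrate the failure mode above, here is a minimal sketch of a lookup that returns immediately for a worker that has been explicitly removed, instead of waiting out the full timeout. It is not the fix that was applied to Base; `known_workers`, `removed_pids`, and `lookup_worker` are hypothetical names invented for this example.

```julia
# Hypothetical stand-ins for the cluster bookkeeping; these are NOT Base internals.
const known_workers = Dict{Int,String}()  # pid => worker handle (a String here, for illustration)
const removed_pids  = Set{Int}()          # pids that have been explicitly removed

# Look up a worker, waiting up to `timeout` seconds for it to connect.
# Returns `nothing` if the worker was removed or never shows up.
function lookup_worker(pid::Integer; timeout::Real=60.0, poll::Real=0.1)
    # A removed worker will never reconnect, so waiting for it is pointless.
    pid in removed_pids && return nothing
    start = time()
    while !haskey(known_workers, pid) && (time() - start) < timeout
        sleep(poll)
    end
    return get(known_workers, pid, nothing)
end
```

With this shape, a lookup triggered during shutdown for an already-deleted pid does not block, while a worker that is still expected to connect is given the usual grace period.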

@amitmurthy
Contributor Author

While the above should be fixed, I think it makes sense to have

  1. exit() wait for a maximum of 10 seconds (arbitrary, I know) for a clean shutdown and, if it has not happened by then, just go ahead and ccall the system exit function.

  2. A variant, exit(code::Integer, now=false), that just does a ccall to the system exit function if now = true.

@StefanKarpinski
Member

Why not make the time to wait the option? And zero means exit right away, defaults to ten.
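
A user-level sketch of the "wait time as the option" idea, under stated assumptions: `cleanup_with_deadline` is a hypothetical helper, not part of Base, and it runs before exit() is called rather than inside the runtime's shutdown path. It relies on `timedwait` from Base and only works if the cleanup code yields (for example through I/O or sleep).

```julia
# Hypothetical helper: run `cleanup` with a deadline and report whether it finished.
# `wait` is the number of seconds to allow; zero (or less) means "do not wait at all".
function cleanup_with_deadline(cleanup::Function; wait::Real=10.0)
    wait <= 0 && return false            # caller asked to exit right away
    task = @async cleanup()              # run the cleanup concurrently
    # Poll until the task is done (finished or failed) or the deadline passes.
    status = timedwait(() -> istaskdone(task), Float64(wait); pollint=0.05)
    return status === :ok
end
```

A caller could invoke this and then call exit() regardless of the result, so a stuck cleanup delays shutdown by at most `wait` seconds.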

@amitmurthy
Contributor Author

Good idea. I agree.

Since jl_exit has already started the shutdown process, I guess we cannot use any of the regular calls to set up a Julia timer...

Can the implementation of this use SIGALRM - set using the alarm system call - for the process to be terminated after time seconds?

Windows will need an equivalent to SIGALRM

Or can libuv provide any support for this?
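
As a rough sketch of the SIGALRM idea on POSIX systems: the C library's `alarm` function can be reached through ccall. The wrapper names `arm_watchdog` and `disarm_watchdog` are hypothetical, and the sketch assumes the runtime does not install its own SIGALRM handler, so that the default action (terminating the process) applies. It does nothing useful on Windows.

```julia
# Hypothetical wrappers around POSIX alarm(2); Unix only.
# alarm(secs) schedules SIGALRM after `secs` seconds and returns the number of
# seconds that were left on any previously scheduled alarm (0 if none).
function arm_watchdog(secs::Integer)
    Sys.isunix() || error("alarm(2) is only available on POSIX systems")
    return Int(ccall(:alarm, Cuint, (Cuint,), secs))
end

# alarm(0) cancels a pending alarm and returns the seconds that were remaining.
disarm_watchdog() = arm_watchdog(0)
```

Arming the watchdog just before starting a clean shutdown would bound how long the process can linger; disarming it is only needed if the shutdown is abandoned.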

@amitmurthy
Contributor Author

Should I file the requirement for an exit with a wait time as a separate issue? I think it is a valid requirement since the finalizers in user code can also cause delays for a clean exit.

@vtjnash
Member

vtjnash commented Jul 12, 2013

No. It is an awful idea to allow a finalizer to sleep; this could break your code in any number of ways. I will try to push a fix that turns calls to yield from finalizers into errors.
