
Large amounts of output leads to uninterruptable kernel hang #372

Closed
JobJob opened this issue Oct 11, 2015 · 4 comments · Fixed by #392
Comments

@JobJob (Contributor) commented Oct 11, 2015

Hi, love IJulia, brilliant work.

I'm currently running into a bug; I've managed to whittle the reproduction down to this:

for i=1:100000
    @show i
end

Putting that in a notebook leads to the kernel "hanging" (the notebook shows the *) and being uninterruptable via the stop button, so every time I output "too much" I need to restart the kernel. (Restarting works fine while this is happening, as does saving.)

[screenshot: ijulia bug]

That's the picture after scrolling to the bottom of the output.

With verbose = true, the last message logged before the hang is the stream output up to the number that the notebook shows:

SENDING IPython Msg [ idents stdout ] {
  header = Dict("msg_id"=>"69ba5b82-bf10-43cf-8448-a41f6543dac0","msg_type"=>"stream","username"=>"username","version"=>"5.0","session"=>"C4F03FFDAA12480188C988F545458EF8"),
  metadata = Dict{AbstractString,Any}(),
  content = Dict("name"=>"stdout","text"=>"i = 19424\ni = 19425\ni = 19426\ni = 19427\n
...
ni = 25787\ni = 25788\ni = 25789")
}
[I 00:10:12.219 NotebookApp] Saving file at /bugs.ipynb

I'm on IJulia master (and have run Pkg.build("IJulia") since getting the latest julia and Pkg.checkout("IJulia")), OSX 10.9, julia 0.4.0, jupyter 4.0.6 (installed via pip). Presumably this is related to #347, but I thought I'd post separately since the symptoms seem slightly more severe. Cheers.

@stevengj (Member)

Duplicate of #243, I think.

@JobJob (Contributor, Author) commented Oct 12, 2015

Eek, sorry about that. Btw, I failed to mention that it doesn't always get to the same point in the count; usually it makes it to somewhere in the 11000-13000 range. Sometimes it only makes it to 1; here is the terminal output for that case:

got msg part EFC38BC204B14F5883D7D4C594965ABD
got msg part <IDS|MSG>
RECEIVED IPython Msg [ idents EFC38BC204B14F5883D7D4C594965ABD ] {
  header = Dict{AbstractString,Any}("msg_id"=>"B3A1E479A43F4CBE957788604F00F1DC","msg_type"=>"execute_request","username"=>"username","version"=>"5.0","session"=>"EFC38BC204B14F5883D7D4C594965ABD"),
  metadata = Dict{AbstractString,Any}(),
  content = Dict{AbstractString,Any}("store_history"=>true,"user_expressions"=>Dict{AbstractString,Any}(),"allow_stdin"=>true,"code"=>"for i=1:100000\n    @show i\nend","silent"=>false,"stop_on_error"=>true)
}
SENDING IPython Msg [ idents status ] {
  header = Dict("msg_id"=>"cc230aee-8727-486c-9318-5e6fb6978dba","msg_type"=>"status","username"=>"jlkernel","version"=>"5.0","session"=>"EFC38BC204B14F5883D7D4C594965ABD"),
  metadata = Dict{AbstractString,Any}(),
  content = Dict("execution_state"=>"busy")
}
EXECUTING for i=1:100000
    @show i
end
SENDING IPython Msg [ idents execute_input ] {
  header = Dict("msg_id"=>"b888195e-c6d5-420b-b596-e5b30dd41123","msg_type"=>"execute_input","username"=>"username","version"=>"5.0","session"=>"EFC38BC204B14F5883D7D4C594965ABD"),
  metadata = Dict{AbstractString,Any}(),
  content = Dict{ASCIIString,Any}("code"=>"for i=1:100000\n    @show i\nend","execution_count"=>19)
}
SENDING IPython Msg [ idents stdout ] {
  header = Dict("msg_id"=>"ef01eca8-16eb-47f2-a65d-78c882ef4c66","msg_type"=>"stream","username"=>"username","version"=>"5.0","session"=>"EFC38BC204B14F5883D7D4C594965ABD"),
  metadata = Dict{AbstractString,Any}(),
  content = Dict("name"=>"stdout","text"=>"i = 1")
}
[I 13:37:38.189 N

These both seem to run fine for me:

for i=1:100000
    println("1")
end

for i=1:100000
    println("$(round(Int, rand()))")
end

Looking at the IPython messages in the terminal, in these cases it seems to break the output up into smaller messages more reliably, for some reason.

@stevengj (Member)

Hmm, weird, it might be a different issue then.

stevengj reopened this Oct 12, 2015
@JobJob (Contributor, Author) commented Oct 12, 2015

So, when I change the sleep(0.1) in watch_stream (stdio.jl:44) to sleep(0.01), my problem goes away. 1) I guess this sleep is one way to control how big the chunks read from the kernel's stdout are (when there is a high rate of output)?

I added debug lines either side of the sleep(0.1) like so:

@vprintln("sleep")
sleep(0.1) # a little delay to accumulate output
@vprintln("wake")

and when I get it to hang (e.g. with the original example), the "wake" never gets printed, so 2) maybe the sleep isn't returning for some reason? 3) Is it hitting the deadlock mentioned in #243 and JuliaLang/julia#8789 because nothing is read off the pipe before 64KB has been written to STDOUT? In which case, 4) I wonder if we should remove the sleep completely, or change it to sleep(0.001), to minimise the chance of 64KB being written to stdout before it is read.
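For context, here is a minimal sketch of the kind of sleep-based read loop being discussed. This is not the actual stdio.jl code; the loop structure, the watch_stream_sketch name and the send_stream helper are assumptions made purely for illustration.

# Hypothetical sketch of a sleep-based stream watcher (illustration only,
# not the actual IJulia stdio.jl code). `rd` is the read end of the pipe that
# the captured STDOUT is redirected to; `send_stream` stands in for whatever
# emits the Jupyter "stream" message.
function watch_stream_sketch(rd::IO, name::AbstractString, send_stream)
    while !eof(rd)                          # blocks until some output arrives
        s = String(readavailable(rd))       # drain whatever is buffered right now
        sleep(0.1)                          # "a little delay to accumulate output";
                                            # if the executing code writes more than
                                            # the ~64KB pipe buffer during this pause,
                                            # its write blocks with nothing draining
                                            # the pipe -- the suspected hang
        if bytesavailable(rd) > 0
            s *= String(readavailable(rd))  # pick up output produced during the sleep
        end
        isempty(s) || send_stream(name, s)
    end
end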

JobJob added a commit to JobJob/IJulia.jl that referenced this issue Nov 27, 2015
Instead of sleeping during the read loop to throttle the number of
stream messages sent, we now continually read the read_stdout and
read_stderr streams (whose buffers are limited to 64KB on OSX/Linux,
and possibly 4KB on Windows) and append the data read to our own
IOBuffers for STDOUT/STDERR.

The rate of sending is now controlled by a stream_interval parameter: a
stream message is sent at most once every stream_interval seconds
(currently 0.1). The exception is when the buffer reaches the size
specified by max_size (currently 10KB), in which case it is sent
immediately, to avoid overly large stream messages.

Improves flush() so that it is very likely to flush data already
written to stdout/stderr.

The above changes fix JuliaLang#372, JuliaLang#342, JuliaLang#238 and JuliaLang#347.

Adds timestamps, and the task from which the vprintln call is made, to
the debug logging.

Fixes using ?keyword (e.g. ?try)
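
For illustration, here is a rough sketch of the buffered, rate-limited approach the commit message describes. The stream_interval and max_size names come from the message above, but the loop structure, the watch_stream_buffered_sketch name and the send_stream helper are assumptions, not the actual PR #392 code.

# Illustrative sketch of the buffered, rate-limited approach described above
# (not the actual PR code). Output is drained continually into our own
# IOBuffer and forwarded at most once every `stream_interval` seconds, or
# immediately once more than `max_size` bytes have accumulated.
function watch_stream_buffered_sketch(rd::IO, name::AbstractString, send_stream;
                                      stream_interval = 0.1, max_size = 10 * 1024)
    buf = IOBuffer()
    pending = 0                           # bytes accumulated since the last send
    last_send = time()
    while !eof(rd)                        # blocks until some output arrives
        data = readavailable(rd)          # drain the pipe promptly, so the writer
                                          # never fills its ~64KB buffer and blocks
        write(buf, data)
        pending += length(data)
        if pending > 0 && (pending >= max_size || time() - last_send >= stream_interval)
            send_stream(name, String(take!(buf)))   # forward accumulated output
            pending = 0
            last_send = time()
        end
    end
    pending > 0 && send_stream(name, String(take!(buf)))   # final flush at EOF
end

A real implementation would presumably also flush on a timer (or via flush()), so that trailing output is not delayed until the next read or EOF.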