Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate StackTraces.jl into Base #14469

Closed
wants to merge 9 commits into from
Closed

Conversation

spurll
Copy link
Contributor

@spurll spurll commented Dec 22, 2015

StackTraces.jl is an external package that provides an immutable StackFrame type that simplifies interacting with a stack trace. This pull request would integrate this module into Base, as per #14212. Some refactoring has been done to avoid duplicating some functionality from the Profile module.

julia> using StackTraces

julia> stacktrace()
2-element Array{StackTraces.StackFrame,1}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some or all of these examples should maybe be doctests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a fair amount of trouble coming up with a way to make good doctests for most of these. Because the doctests aren't run from the REPL, every single stack trace generated by the doctests will be different from stack traces generated by the user. For example:

File "manual/stacktraces.rst", line 172, in default
Failed example:
    grandparent()
Expected:
    ERROR: Whoops!
    StackTrace with 5 StackFrames:
      child at none:1
      parent at none:1
      grandparent at none:3
      eval_user_input at REPL.jl:62
      [inlined code from REPL.jl:92] anonymous at task.jl:63
Got:
    ERROR: Whoops!
    StackTrace with 8 StackFrames:
      child at none:1
      parent at none:1
      grandparent at none:3
      eval at boot.jl:265
      [inlined code from sysimg.jl:14] eval_user_input at client.jl:96
      eval at boot.jl:265
      [inlined code from sysimg.jl:14] eval_user_input at client.jl:96
      _start at client.jl:338

I can change the stack trace examples in the documentation to match what will happen when make -C doc doctest is run, but I worry that this would make the documentation itself less useful to the user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's supposed to be some filtering out of those last few lines, but it may have been broken since the big inlined-line-numbers change a few months back.

@spurll
Copy link
Contributor Author

spurll commented Dec 23, 2015

Thanks for the feedback. Will make those adjustments. Looks like I also need to figure out why 32-bit Windows is behaving differently, too...

@tkelman
Copy link
Contributor

tkelman commented Jan 4, 2016

Generally rebases are also cleaner to review than merge commits, and help avoid tangling the history for future bisects or blames.

@omus
Copy link
Member

omus commented Jan 6, 2016

I've done some looking into why the Windows 32-bit build is failing and it appears that @noinline isn't be respected in the test cases. I haven't managed to reproduce this failure outside of Appveyor. Does anyone know how to update the LLVM version in Appveyor?

@tkelman
Copy link
Contributor

tkelman commented Jan 6, 2016

You'd have to build and upload an llvm binary, then modify contrib/windows/msys_build.sh to use it. I was planning on preparing this soon for switching master over to llvm 3.7.1 (+patches) so maybe the problem will go away?

@omus
Copy link
Member

omus commented Jan 6, 2016

Thanks @tkelman. I'm not sure that the problem will just go away with a newer LLVM but it is definitely worth looking into.

@spurll
Copy link
Contributor Author

spurll commented Jan 11, 2016

Got most of the doctests working using the ellipsis, but @omus and I ran into what seems to be some inconsistencies in the way backtrace (via stacktrace) reports line numbers.

While catch_backtrace always seems to return the correct line number, backtrace doesn't. (Examples below use stacktrace because it's a bit more readable, but I tested them with backtrace too.)

In this example I would expect backtrace/stacktrace to return line 5, the line where it is executed in the catch, but instead it returns line 3, the line number in the try block(which is kind of what we expect from catch_backtrace instead, although it doesn't have a frame for bad_function):

julia> @noinline bad_function() = nonexistent_var
bad_function (generic function with 1 method)

julia> @noinline function good_function()
          try
               x = bad_function()
          catch
               return stacktrace()
          end
       end
good_function (generic function with 1 method)

julia> good_function()
4-element Array{Base.StackTraces.StackFrame,1}:
 Base.StackTraces.StackFrame(:good_function,:none,3,symbol(""),-1,false,13140383748)                     
 Base.StackTraces.StackFrame(:eval,symbol("./boot.jl"),265,symbol(""),-1,false,4494236952)               
 Base.StackTraces.StackFrame(:eval_user_input,symbol("REPL.jl"),3,symbol("REPL.jl"),62,false,13140281135)
 Base.StackTraces.StackFrame(:anonymous,symbol("REPL.jl"),92,symbol("task.jl"),63,false,13140230754)     

In the next example, adding a second line to the try block seems to increment the line number returned in the stack trace as well, causing it to point at line 4 (a line that should never be executed):

julia> @noinline function good_function()
          try
               x = bad_function()
               x += 1
          catch
               return stacktrace()
          end
       end
good_function (generic function with 1 method)

julia> good_function()
4-element Array{Base.StackTraces.StackFrame,1}:
 Base.StackTraces.StackFrame(:good_function,:none,4,symbol(""),-1,false,13140385322)                     
 Base.StackTraces.StackFrame(:eval,symbol("./boot.jl"),265,symbol(""),-1,false,4494236952)               
 Base.StackTraces.StackFrame(:eval_user_input,symbol("REPL.jl"),3,symbol("REPL.jl"),62,false,13140281135)
 Base.StackTraces.StackFrame(:anonymous,symbol("REPL.jl"),92,symbol("task.jl"),63,false,13140230754)

Finally, adding another line to the catch block results in backtrace/stacktrace reporting the line number I expect:

julia> @noinline function good_function()
          try
               x = bad_function()
               x += 1
          catch
               println("Something.")
               return stacktrace()
          end
       end
good_function (generic function with 1 method)

julia> good_function()
Something.
4-element Array{Base.StackTraces.StackFrame,1}:
 Base.StackTraces.StackFrame(:good_function,:none,7,symbol(""),-1,false,13140386475)                     
 Base.StackTraces.StackFrame(:eval,symbol("./boot.jl"),265,symbol(""),-1,false,4494236952)               
 Base.StackTraces.StackFrame(:eval_user_input,symbol("REPL.jl"),3,symbol("REPL.jl"),62,false,13140281135)
 Base.StackTraces.StackFrame(:anonymous,symbol("REPL.jl"),92,symbol("task.jl"),63,false,13140230754)

In all of these examples, catch_stacktrace reports the correct line number (line 3), with an additional frame for bad_function as well, as expected.

I was hoping that you might be able to shed some light on how these line numbers are generated, @tkelman. It seems like some under-the-hood optimizations are probably interfering with correct line numbers being reported by backtrace in some cases.

@ivarne
Copy link
Member

ivarne commented Jan 12, 2016

Maybe I'm missing something obvious, but why do you have show_stacktrace instead of show(io::IO, s::StackTrace).

@spurll
Copy link
Contributor Author

spurll commented Jan 12, 2016

When the original StackTraces.jl module written, there was a separate show_backtrace function in Base and I was mirroring that. This function seems to have disappeared, and I believe that you're right: providing a show method would be better.

@spurll spurll force-pushed the master branch 2 times, most recently from be2dfff to f8c7165 Compare January 14, 2016 19:24
line_numbers = @__LINE__ .- [15, 10, 5]

# Test try...catch with stacktrace
@test try_stacktrace()[1] == StackFrame(:try_stacktrace, @__FILE__, line_numbers[2])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that only one issue remains unaddressed. As mentioned earlier, there are some inconsistencies with the way backtrace reports line numbers, which cause the test case on line 72 (and the corresponding doctest) to fail. Despite the @noinlines, some sort of optimization seems to be occurring which causes a line number in the try block to be reported for code run in the catch (or perhaps something else entirely is going on; this is just a guess).

The test case could be modified to succeed by replacing lines 54 through 69 with something like:

    @noinline good_function() = 15 / pi
    @noinline bad_function() = throw(UndefVarError(:nonexistent))
    @noinline function try_stacktrace()
        try
            good_function()
            bad_function()
        catch
            good_function()
            return stacktrace()
        end
    end
    @noinline function try_catch()
        try
            good_function()
            bad_function()
        catch
            good_function()
            return catch_stacktrace()
        end
    end
    line_numbers = @__LINE__ .- [19, 12, 6]

For reasons that I do not fully understand (see my earlier comment) this causes the line numbers to be reported correctly by backtrace—but modifying the test case in this way seems like sidestepping the issue, and the tests ought to pass as they are.

@spurll spurll force-pushed the master branch 3 times, most recently from 9275963 to 9700d61 Compare January 20, 2016 17:34
@JeffBezanson JeffBezanson added this to the 0.5.0 milestone Jan 27, 2016
@spurll
Copy link
Contributor Author

spurll commented Jan 28, 2016

The Travis/AppVeyor failures could be addressed by any of the following:

  1. Identifying (and correcting) the cause of the misreported line number in Base.backtrace.
  2. Flag the offending test case as an expected failure; this is easy to do in Python, but Julia doesn't seem to support this out of the box (I suppose I could roll a custom handler for that, if necessary).
  3. Adding additional irrelevant lines to the test case (as mentioned above), which causes Base.backtrace to report the correct line number.
  4. Removing the offending test case entirely.

My preference would be option 1, followed by option 2. But because my understanding of the underlying C code is limited (to be generous), I'm not planning to attempt a fix the line number issue myself.

@omus
Copy link
Member

omus commented Jan 28, 2016

The last issue definitely seems related to #13947. In this case the backtrace isn't missing just the line numbers are being reported incorrectly. It appears that the line that does get reported is the line of the last expression in the try.

# file example.jl
function example()
    try
        error()
        false  # Line 5
        # Ignored
    catch
        display(stacktrace())  # Line 8
    end
end

example()
$ ./julia example.jl
5-element Array{StackFrame,1}:
 example at st.jl:5                  
 include at boot.jl:262              
 include_from_node1 at loading.jl:401
 process_options at client.jl:249    
 _start at client.jl:301      

You can correct the line number by introducing an expression prior to running stacktrace:

# file example_works.jl
function example()
    try
        error()
        false  # Line 5
        # Ignored
    catch
        true
        display(stacktrace())  # Line 9
    end
end

example()
$ ./julia example_works.jl
5-element Array{StackFrame,1}:
 example at example_works.jl:9      
 include at boot.jl:262              
 include_from_node1 at loading.jl:401
 process_options at client.jl:249    
 _start at client.jl:301             

Added doctests to the manual.
Removed show_stacktrace in favour of show(x::StackFrame).
Removed format_stacktrace and format_stackframe.
Additional test cases (which currently fail due to difficult to track
down line number issues)
Reintroduced UNKNOWN constant stack frame from profile.jl.
Added test case for looking up an invalid stack frame.
Selectively removed some tests that consistently fail in Appveyor's 32-bit
Windows LVM.
spurll and others added 2 commits January 28, 2016 16:49
Currently a call to `stacktrace()` at the beginning of a catch will
result in incorrect line numbers being returned. This is a problem with
`backtrace()`.
@vtjnash
Copy link
Member

vtjnash commented Jan 28, 2016

subsumed by #14815

@vtjnash vtjnash closed this Jan 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants