-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Symbolic stack traces #23
base: master
Are you sure you want to change the base?
Conversation
* Add module, class and functions comments. * Rename StackTraceInformation.Pretty_print() to prettify() since it doesn't print anything.
This happens notably in the stack trace during process creation. rename local variable to a more meaningful name.
Hi @neitsa, really impressive work! this is a great addition to the library. The bug in procmon that they give the wrong function name with a large offset in is very interesting, I always thought it was because they use a wrong PDB, good to know we do something better than procmon itself here I went through the code and I have a few comments before I dive deeper (ping me when you're ready after cleaning):
overall looks good, thank you for contributing! |
Hey Ely ! Thanks a lot for your feedback!
Thanks a lot!
Yep strange... I got the basics by looking at ProcMon statically and a bit of debugging, but still don't get why it got it kind of wrong...
Great! I also looked at retrieving the two DLLs from a Visual studio installation but it proved more complicated than I thought since VS installation has changed over time and the various VS SKUs - Enterprise, Community, etc. - have different installation scheme. I'll put some indications on how to retrieve them manually in the documentation.
Yes! Basically,
Damn :p Thank you! The check you proposed looks definitely fine - I guess we could go with just "System" as an executable called "system" would have a full path and would be called "system.exe" (with the extension) - but, better be safe than sorry ^^. That's the kind of stuff that should be picked by thorough testing, which I haven't really put in motion yet for this PR...
Of course, I totally understand! It's just I'm so used to what Python 3 offers, that I'm not proficient with Python 2 anymore. For the type annotations, I think most of tools understand them if they are placed as comments right below the function declaration as proposed in PEP 484. I can do that for most of the PR code. Although, I don't know what I can use in replacement of After further debugging, I think I still have a thing to change right now. The last call to SymGetSourceFileW (which is meant to return the fully qualified path of the source file, if any) seems to be superfluous in most conditions since the previous call to SymGetLineFromAddrW64 returns a fully qualified path in my case... Going to add a check about that right now! |
Add a few debug lines to help possible future problems resolution.
…h with SymGetLineFromAddrW64.
Thanks for the clarifications! let me know when I can start reviewing the code |
* Remove most of f-strings format. * Add a few missing comments. * Rewrite some typing comments.
@eronnen : the PR code is "finished" in its current state (by that I mean it's working as intended with all the manual tests I could do). A few stuff I couldn't really do:
Currently missing tests. I'll wait for your feedback to start adding them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a few changes requests :)
by the way I'm not sure if this feature is really testable due to the fact that Procmon is wrong about most of the symbols, maybe only sanity to check that the module names are matching... also small PMLs are needed because the exported XMLs that contain the symbols tend to be huge 😅
|
||
if sys.version_info >= (3, 5, 0): | ||
import typing | ||
import pathlib # TODO: to be converted to py 2.7 equivalent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add pathlib to setup.py
requirements (I think it has a package for 2.7 too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me. Do you have a stance concerning adding new packages? I find pathlib
to be so much easier to use than its os
counterpart, but that would require a new dependency, which you might frown upon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can keep it, it's supported in python 2 so I don't see anything wrong here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just keeping this one open until everything works correctly on py 2.7 :)
* Fix callback documentation in constructor. * Add a more meaningful error message if DLLs can't be found.
…nit and clean up.
@eronnen Working on fixing the last bits to support py 2.7. Problem found around supporting the enumeration (in dbghelp.py):
Haven't found a way to make |
@neitsa maybe instead of |
Hi All. Love this tool for configuration file editing and noticed this PR. I am guessing part of your symbols off by one issue is related to just my code markers https://github.com/mitchcapper/ProductionStackTrace/blob/enhancements/ProductionStackTrace.Analyze/SymbolLoader.cs#L177-L181 between that and some auto generated state machines getting to a proper looking stack trace can take a bit of unwinding. |
Hi Ely 👋
This is a work in progress PR for symbolic stack trace (should close #4).
First of all, the whole PR is in python 3 as I haven't touched Python 2 since 15 years ago... (I'm also not used to
six
). Please let me know if it's a showstopper.Design Decisions
Symbolic resolution relies on
dbghelp.dll
andsymsrv.dll
so it's only usable on Windows systems. Thedbghelp.dll
that ships with Windows is often an old version which may not be really useful as it lacks some functionalities. Windows doesn't shipsymsrv.dll
- which only comes with Debugging Tools For Windows (part of the Windows SDK) - and which is used to download symbol files (*.pdb
or binaries) from the Microsoft Symbol Store.The stack trace resolution (symbolic solving) goes like this:
The
SymbolResolver()
constructor does it best to find the two aforementioned DLLs by looking for the Debugging Tools for Windows or Windbg Preview installations (seeDbgHelpUtils
insymbol_resolver.py
class which contains a few static functions).The problem happens when it cannot find the two DLLs, in which case the end user will have to provide the path to a directory containing both of them.
Which brings me to the main design decision of why the resolution doesn't happen within
Event.stacktrace
(it would have been nice to haveEvent.stacktrace.resolve()
). Since passing the two DLLs down the classes instances (fromProcmonLogsReader
toEvent.stacktrace
) was too complicated, I chose to have an external resolver, to which you pass the stack trace to resolve.Output
Here's a stack Trace from ProcMon:
Here's a stack trace from the PR:
ProcMon's stack trace is kinda wrong
(and I don't know exactly why)
Here's the output - from the PR code - from the first frame (Frame 0) in the stack trace:
Here's the output from ProcMon (visible in the above screenshot):
Kernel module bases (from the PML trace):
With a kernel debugger attached to the machine that performed the trace:
So the address to resolve is
0xfffff80513d2648c
:and the correct output is
FLTMGR!FltpPerformPreCallbacksWorker+0x36c
; technically the output from ProcMon is kinda right, but not really right:🤷
ProcMon's Symbolic Source Information is kinda wrong
I did a dummy test with an executable that performs a
CreateFileMappingW
:(Notice the call happens at line 44 in the source).
Here's the output from ProcMon (visible in the above screenshot):
ProcMon got it wrong because (for a reason that escapes me) uses the previous code line from the source, rather than the actual line...
Stuff Left To Do
The code is pretty much complete at the moment, there are a few TODOs left in the code, but I wanted to get your feedback before cleaning up the code. I'm planning to add a bit of documentation (in markdown) on how to get it working.
I guess there will also be some corner cases about which I didn't think of, but it's working quite OK from what I can tell.
Let me know what you think (and sorry for the wall of text 😄 ).
Just to keep track of things:
__main__
fromsymbol_resolver.py
.symbol_resolver.py
.__post_init__
inStackTraceFrameInformation
(it was used just for debugging a problem).System
pseudo process.SymGetLineFromAddrW64
: there's no point in callingSymGetSourceFileW
in this case.SYMOPT_DEBUG
handling code to debug problems with symbol handling at dbghelp internal level.