dynamic: Time Travel Debugging (TTD) integration #1649
Comments
hey @atxr! I am very keen on integrating TTD with capa. Like you said, the technology might make it easier to analyze samples that are packed. It's an interesting idea to specify a collection of hooks/events at which point to analyze the state of the process and find capabilities. This makes me think of using capa against a memory dump (a good idea, but something we don't have today). So, I think this is feasible, and would enable some complementary enhancements.
I'm not yet convinced of the proposal to extend the rule format to specify the hook locations. This sort of thing seems orthogonal to the description of a capability. That is, the author describing how to find browser cookie stealing behavior shouldn't have to be an expert in VMProtect and decide where to hook. Instead, I'd recommend this be provided either as a CLI argument (in the case of the TTD cursor ID) or use some reasonable defaults (like ExitProcess, WriteFile, etc.).
For awareness, I had previously been thinking that we'd use TTD as a sort of sandbox that we can use to capture the API trace and feed into the dynamic analyzer that @yelhamer is working on. I think this idea can be independent of what you propose here and we should explore both. |
I think capa + TTD could be amazing! We've also discussed this before with @xusheng6, so tagging him here. |
Thank you for your feedback! I totally agree with your proposal! To summarize this TTD integration:
In the end, it should look like:
capa sample.exe sample.run # analyze TTD trace sample.run with the default hooks
capa sample.exe sample.run --ttd-hook ntdll!NtCreateUserProcess # specify a hook
capa sample.exe sample.run --ttd-cursor 100:1A # specify a cursor position
I have a few questions, though, regarding the implementation:
I'm still familiarizing myself with the project to figure out how I'll integrate TTD, so I might come back with more questions soon 🙂 |
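For illustration, the command lines proposed above could be parsed roughly as follows. This is a minimal sketch, not capa's actual CLI: the `--ttd-hook` and `--ttd-cursor` flag names come from the proposal, while `build_parser`, `resolve_hooks`, and `DEFAULT_HOOKS` are hypothetical names, and the default hook list simply reuses the ExitProcess/WriteFile examples mentioned earlier in the thread.

```python
import argparse
from typing import List

# Assumption: default hooks taken from the examples suggested in this thread.
DEFAULT_HOOKS = ["ExitProcess", "WriteFile"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed (hypothetical) TTD command line."""
    parser = argparse.ArgumentParser(prog="capa")
    parser.add_argument("sample", help="path to the sample, e.g. sample.exe")
    parser.add_argument(
        "trace", nargs="?", default=None,
        help="optional path to a TTD trace, e.g. sample.run",
    )
    # Both options may be repeated to request several scan points.
    parser.add_argument(
        "--ttd-hook", action="append", default=None, metavar="MODULE!FUNCTION",
        help="function at which to analyze the process state",
    )
    parser.add_argument(
        "--ttd-cursor", action="append", default=None, metavar="SEQ:STEP",
        help="TTD position at which to analyze the process state",
    )
    return parser


def resolve_hooks(args: argparse.Namespace) -> List[str]:
    """Use the default hooks only when the user gave no hook and no cursor."""
    if args.ttd_hook or args.ttd_cursor:
        return list(args.ttd_hook or [])
    return list(DEFAULT_HOOKS)


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print("hooks:", resolve_hooks(parsed))
    print("cursors:", parsed.ttd_cursor or [])
```

Falling back to the defaults only when neither option is given keeps the first proposed command line (no flags) useful, while an explicit cursor is not silently combined with hooks the user did not ask for.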
Both of these sound great, and so do the proposed command lines.
I think it should look more like the Binary Ninja feature extractor that @xusheng6 added in #1343. That's because I'd recommend that you focus on static analysis of memory snapshots, not dynamic analysis of API traces. Conceptually, capa static analysis is things like functions/basic blocks/instructions while its (proposed) dynamic analysis is things like API calls found in threads and processes.
In this issue, let's focus on static analysis of memory snapshots derived from TTD traces. I'll open another issue (#1655) to track the use of TTD for dynamic analysis. If you'd prefer to work on that feature, no problem! (Though, I'd suggest we wait until the CAPE implementation is done and lessons are learned.)
Given that the idea is to have capa analyze snapshots of memory at specific points of time in TTD traces, I wonder if we can start by building:
These could be built in parallel as temporarily separate utilities. Then we can wire 1 and 2 within capa and add the CLI arguments, etc. The benefit is that we might also be able to provide memory snapshots from other systems, like sandboxes, which would be neat. I also suspect the TTD memory snapshot exporter might be generally useful for other things like dumping unpacked executables. I'm just brainstorming here. What do you think? |
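To make the split between a snapshot exporter and a snapshot feature extractor concrete, here is a minimal sketch of a shared data shape that both sides could agree on. Everything in it is an assumption for illustration: `MemoryRegion`, `MemorySnapshot`, and `extract_strings` are hypothetical names, not part of capa or of any TTD binding, and the string scan stands in for whatever features a real extractor would emit.

```python
import re
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class MemoryRegion:
    """A contiguous range of process memory captured at one point in time."""
    base: int          # virtual address of the first byte
    data: bytes        # raw contents of the region
    label: str = ""    # hypothetical tag such as "heap", "stack", or a module name


@dataclass
class MemorySnapshot:
    """What an exporter (TTD, sandbox, ...) would hand to the extractor."""
    source: str                                   # free-form origin, e.g. a trace position
    regions: List[MemoryRegion] = field(default_factory=list)


# Runs of at least four printable ASCII bytes.
ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(snapshot: MemorySnapshot) -> Iterator[Tuple[int, str]]:
    """Yield (virtual address, string) pairs for every region in the snapshot."""
    for region in snapshot.regions:
        for match in ASCII_RE.finditer(region.data):
            yield region.base + match.start(), match.group().decode("ascii")


if __name__ == "__main__":
    snapshot = MemorySnapshot(
        source="ttd:100:1A",
        regions=[MemoryRegion(base=0x1000, data=b"\x00\x01cmd.exe\x00ab\x00", label="heap")],
    )
    for address, text in extract_strings(snapshot):
        print(hex(address), text)
```

Because the extractor only sees `MemorySnapshot`, the same code path could serve snapshots produced by a TTD exporter or by another system such as a sandbox.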
added #1654 to track static analysis of memory snapshots |
added #1655 to track dynamic analysis via TTD traces. |
First of all, I think your ideas are really great!
Just to be sure, should this snapshot exporter be part of |
I think this can be up to you. If you can find other consumers for the library, then maybe it makes sense to be external. Or maybe capa is just a good central place to store and distribute this. shrug. It's also fine to start in your own repo and then merge into capa when you're happy.
Great! I'm also going to do a bit of background research on what we'd need to implement a memory snapshot feature extractor. At least so I can talk intelligently with you about it, and/or to write code alongside you :-) |
Awesome! Then I'll start with an external repo and see next if it makes sense to merge into capa! |
FYI I think you could use the programmatic api to instrument and run the capas
https://github.com/microsoft/WinDbg-Samples/tree/master/TTD/LiveRecorderApiSample |
Summary
Develop a TTD extractor and add keywords to the rules to use trace files generated by TTD to improve capa dynamic analysis and defeat packers.
Motivation
Because capa is trying to develop some dynamic analysis features, I would like to suggest using Microsoft TTD. Thanks to TTD, you can generate trace files that record the context of the binary at each instruction.
I could develop a TTD extractor that would add new features to capa from the trace file.
In the end, one could scan a binary sample with a TTD trace and use new rules to select a position in a trace where capa should work.
The TTD extractor would need time indicators to know when to scan in the timeline. These indicators can be TTD cursors (time positions) or functions that would be hooked in the trace.
Also, the extractor would require a memory range to scan. Hence, several optimizations can be developed, like scanning only the heap, the module memory, the stack...
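As a rough sketch of these two inputs, the snippet below models a time indicator (a cursor position or a hooked function name) together with the memory ranges to scan. It assumes that a TTD position is written as two hexadecimal numbers separated by a colon, as in the `100:1A` example used in this thread; `TtdPosition`, `MemoryScope`, and `ScanRequest` are hypothetical names, not an existing capa or TTD API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple, Union


class MemoryScope(Enum):
    """Memory ranges named in the proposal."""
    HEAP = "heap"
    MODULE = "module"
    STACK = "stack"


@dataclass(frozen=True, order=True)
class TtdPosition:
    """A point in the trace timeline; ordering compares sequence, then step."""
    sequence: int
    step: int

    @classmethod
    def parse(cls, text: str) -> "TtdPosition":
        # Assumption: "SEQ:STEP" with both parts in hexadecimal.
        sequence, separator, step = text.partition(":")
        if not separator:
            raise ValueError(f"expected SEQ:STEP, got {text!r}")
        return cls(int(sequence, 16), int(step, 16))

    def __str__(self) -> str:
        return f"{self.sequence:X}:{self.step:X}"


@dataclass(frozen=True)
class ScanRequest:
    """When to scan (a position or a hooked function name) and what to scan."""
    trigger: Union[TtdPosition, str]
    scopes: Tuple[MemoryScope, ...] = (MemoryScope.HEAP,)


if __name__ == "__main__":
    requests = [
        ScanRequest(TtdPosition.parse("100:1A")),
        ScanRequest("ntdll!NtCreateUserProcess", (MemoryScope.HEAP, MemoryScope.STACK)),
    ]
    for request in requests:
        print(request.trigger, [scope.value for scope in request.scopes])
```

Keeping the trigger and the scopes in one small value object means the same request could come from a rule, a CLI flag, or a default list without changing the extractor.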
Here is a quick look at what a rule could look like:
This rule tries to detect thread and process creation, and scans the heap at these time positions to search for shellcode that could be loaded by a packer, or for useful strings loaded dynamically.
Alternative projects
I'm currently working on https://github.com/airbus-cert/yara-ttd which aims to apply yara rules on TTD trace files thanks to these TTD bindings.
The tool is currently working and has many use cases when applying yara rules to packed binaries.
I also read the dynamic-feature-extraction branch you are working on to integrate CAPE.
The TTD integration could work alongside this project to provide a more precise analysis and an in-depth dynamic memory scan.
I haven't started implementing the extractor yet because I wanted some validation from the capa team first. Of course, I will welcome any kind of advice!