-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Mach-O parsing in Mach-O module #1263
Conversation
Make sure "number_of_segments" is set before it is used.
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
@googlebot I signed it! |
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
Looks good to me. As far as I remember, the command order is not guaranteed in Mach-O so I would keep two loops. |
@PeterMatula Do you know if this problem also exists in my branch here? I've re-organized the Mach-O code in that branch and I'm curious if the issue still exists. Update: Just re-read your description more closely and it looks like I would probably want to make this same change in my branch as well. |
Btw, I don't think that that |
@PeterMatula I updated your bug fix to work on the updated mach-o parsing code that was just merged. I also included a test case in that PR. |
@PeterMatula , yes, you are correct. Your bug fix should be committed now. Good find and fix! |
Great, thanks for getting it to |
The problem
In
MACHO_PARSE_FILE()
there is a command parsing loop. When segment commands are parsed,seg_count
is incremented. At the end,number_of_segments
integer is set. But also theLC_UNIXTHREAD
andLC_MAIN
commands are parsed in the loop. Their parsing functions use uninitializednumber_of_segments
value - inmacho_rva_to_offset()
andmacho_offset_to_rva()
. This is clearly not safe, but it does not cause problems most of the times. If the binaries are not malformed, rva <-> offset conversion succeeds even whennumber_of_segments
is some (random) huge value, because it hits in the first few good segments. However, it the binary is not so correct, this can cause iteration over up touint64_t
max value.The example
Run YARA (compiled with
--enable-macho
) on the following file test.zip with the following ruleset:If you are not (un)lucky, this will take a very long time because there is some huge number in the uninitialized
number_of_segments
. E.g.18445260673521474303
in my case.The solution
A 2-step commands processing:
Cons: two steps.
Pros: value is initialized, and is a general solution.
Less general solution would be to keep the single loop and set
number_of_segments
after every segment command is processed. This would however rely on specific ordering of commands - i.e. commands using segment N go only after command for segment N itself. Otherwise it would not work. I don't know if this can be expected, so its better to be safe than sorry and loop twice.@mbandzi can you please take a look at it? @metthal said you are the original author.