-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix imphash issue with empty import names. #1944
Conversation
If an import has an empty name skip processing it. This is consistent with the behavior of pefile (https://github.com/erocarrera/pefile/blob/593d094e35198dad92aaf040bef17eb800c8a373/pefile.py#L5871-L5872). Add a test case which is just the tiny test file with the first import name set to all NULL bytes. I tested that pefile calculates the imphash of it, which matches what YARA now calculates too: >>> import pefile >>> pe = pefile.PE('/Users/wxs/src/yara/tests/data/tiny_empty_import_name') >>> pe.get_imphash() '0eff3a0eb037af8c1ef0bada984d6af5' >>> Fixes VirusTotal#1943
The file |
Oops, I forgot to add it! Will do so as soon as I'm able. |
Please don't merge this yet. There are other problems raised in the linked issue that are being looked into. |
If an imported function does not contain ONLY a-zA-Z0-9 and a small subset of special characters it will now be ignored. This aligns us better with pefile, which checks for valid import names and skips them if they are invalid. I've also updated the test file to check that these special characters are handled properly.
Turns out that the tiny-idata-5200 file is corrupted to the point that pefile doesn't parse it. For example, it finds no imports to hash: ``` >>> pe = pefile.PE('tests/data/tiny-idata-5200') >>> pe.get_imphash() '' >>> ``` We were parsing imports from this file before these changes that were incorrect, so fix the tests to reflect the fact that we parse no imports from this file anymore. As part of this I've split the checks for number of parsed imports and successfully parsed imports into two different counters, which now means we are accurately reflecting when we are able to parse the import table but not the descriptors in it while still making sure we don't parse too many as we have seen before.
Move these declaractions so they are with the rest of the constants. This makes the output of -D easier to read.
If we have an invalid import name we need to free the name and continue to the next thunk and function. While fixing this I noticed that if we fail to alloc an IMPORT_FUNCTION* we would end up looping endlessly because we were never incrementing the thunk pointer or function index. Fix it by ALWAYS incrementing those at the end of the loop and conditionally populating the newly allocated node.
Actually, the memory leak was because of my changes. I was not calling I also discovered that if we failed to allocate a new I think this can be reviewed and merged now. |
If an import has an empty name skip processing it. This is consistent with the behavior of pefile (https://github.com/erocarrera/pefile/blob/593d094e35198dad92aaf040bef17eb800c8a373/pefile.py#L5871-L5872).
Add a test case which is just the tiny test file with the first import name set to all NULL bytes. I tested that pefile calculates the imphash of it, which matches what YARA now calculates too:
Fixes #1943