-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Signature recognition #769
base: Signature-recognition
Are you sure you want to change the base?
Conversation
…near matching sigs which then produce the same method name.
in ELF spliter the loader into 3 modules, loader, loader32 and loader64
@Cairn23: first of all, thanks for the contribution, especially of such an ambitious new feature. It will be a great addition to Reko. It is the nature of code reviews that they are critical in tone. This may be discouraging to newcomers to the project, who aren't familiar with the coding style and design of the Reko project. Please take the code review in the spirit it is offered. I encourage you to be curious about the review feedback and participate in discussion to clarify coding and design choices. Ultimately, we're all working to improve the code base. This is a large PR, with many file modifications. This makes the reviewing of the code changes more challenging. It also means that the review process will take longer, perhaps a couple of days. Please have patience as we work through the files. When submitted, a PR must be mergeable with the target branch (in this case, When submitted, a PR must furthermore pass all unit tests and pass all the tests in the regression suite (under One specific note: a big refactoring of the Thanks again, and stay tuned as I work through this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cairn23: I've added comments to this very large PR. Because of the sheer size of it, it's unlikely that I can merge it into master
immediately. Instead, please go through the changes requested and address them. If you have questions or concerns, you can reach me on gitter. Perhaps we should start a private chat there to avoid spamming the general Reko channel.
My greatest concern is that there are no tests to validate anything related to the byte signatures. Before I can merge this to master, I want to see unit tests testing:
- loading byte signatures from a (simulated) file
- saving byte signatures to a file
- generate byte signatures from an (ELF|COFF|what have you) file with symbolic information in it.
- after loading + scanning a file, apply signatures and validate that procedure names are changed.
I also need to have a discussion with you regarding how a user would generate and use these signature files. Clearly, there should be a way to use "shared" signature files -- such files would be located near the Reko executales. But I also think that there should be a way for a user to generate "custom" files and place them alongside the Reko project file.
There are other issues to consider, like performance, user configuration etc. But those will be address in other PR's.
This PR is the fruit of a lot of work, congratulations. Now it's time to polish it up!
@@ -0,0 +1,59 @@ | |||
============================ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming this file to the md
extension and use Markdown. For instance, this could be changed to:
# What are signature files?
@@ -0,0 +1,398 @@ | |||
using Reko.ImageLoaders.Coff; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also looks like duplicated code of the ar
loader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not apart of the project
string sourceFile; | ||
string destFile; | ||
|
||
Console.WriteLine("Reko static library signature generator"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is chatty. Prefer quiet tools that do their jobs inobtrusively.
@@ -0,0 +1,56 @@ | |||
#region License | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this looks like duplicated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not apart of the project
I've created a new branch |
There were 90 comments in my preliminary pass over the PR. The ball is in your court now, @Cairn23 :) |
I do not like idea to change interface of |
The conflict is noting to do with the work I have done. |
You refactored |
Merge of of Uxmal master with this branch carried out. resolved issues in ELFLoader.cs |
Pavel can you explain what you don't like about the change. Perhaps there is another way to deal with this? |
I have not looked at the PR in detail, but I think using a tree structure for matching the signatures is faster and more flexible. See for example how IDA does it (conceptually): |
@ceeac: indeed, using a trie or a suffix array to perform the pattern matching is likely to be performant. However, the scope of this PR is larger than the algorithm used to perform the actual pattern matching. It involves incorporation pattern recognition into the Reko flow, user interface changes, new file formats, etc. Before this PR can be merged there are some issues left to be resolved. There are some unaddressed review comments (see above). The file format for the signature files needs to be properly documented More seriously, there are no unit tests or regression tests covering the new code. If I wanted to refactor the code to use efficient pattern matching structures like tries or suffix trees, I wouldn't be aware that I was breaking anything. @Cairn23: what can I do to assist in completing the PR? |
You need to understand the concepts before even thing about changing the pattern matching. I have compared this to IDA's Flirt and it is the same speed. Given that that doing a full decomplication of one of my test aps, the signature process is only taking less than a 20th of the overall time, I would not worry. if you attempt to use a trie how are you going to account for wild cards, In the pre work I did with this using an array of signatures I was able to create conditions that would mean other then the first couple of bytes you could completely miss match as every byte would have a wild card option. I did start to look at the unit tests, but found that they did not add any value, the same as all the other loader unit test which were just checking the that Image loader code was working placing the data in to the header, there are nor REAL tests within any of the loaders. If you want the tests, please set the correct standard first, before going at people. |
The best thing to do would be to get it added to the master with a beta warning whilst people use it. It is easy to prevent this working by ensuring the signatures are removed. |
No description provided.