Primus Lisp enhancements #798
Merged
ivg
merged 21 commits into
BinaryAnalysisPlatform:master
from
ivg:primus-lisp-enhancements
Mar 19, 2018
The jumping observation is made just before a jump is taken and has two parameters: the jump condition and the jump destination. This is a much more precise and useful observation than the `{enter,leave}-term` one, as it doesn't lose the origin of the values that define the jump (i.e., we can track (taint) both the condition and the destination). We also added three functions to the Linker interface, mostly for convenience. These functions resolve code names into addresses, symbolic names, and tids.
Until now the plugin loaded some stuff at link time, some after the configuration became available, and the rest during the analysis phase, when it is invoked as a pass. It could even be invoked twice. Now this is rectified. Everything that doesn't require access to the project data structure is loaded automatically just after the configuration phase has finished. The rest (e.g., linking with the program) is done during the analysis.
If the parameter is not set, then the allocated memory chunk is not filled (i.e., it is filled with random values); if it is set, then we fill the chunk with the specified value. The motivation is to make calloc more efficient.
and applies the reindexer to all runtime definitions.
previously only definitions were printed
To be in line with Common Lisp we've decided to put docstring and declarations after the value of defparameter and defconstant. This indeed looks more natural.
1. Provides an efficient stack that is now equipped with a map data structure that allows us to efficiently query for locally bound variables.
   In Primus Lisp variables live in two scopes:
   1. the Primus global scope, such as CPU registers, parameters, etc.;
   2. the Primus Lisp local scope that is populated by lexically scoped local variables, such as function parameters and let-bound variables.
   Any Lisp operation on a variable involves checking whether it is a local or a global variable. Since we're using lexical scoping we should be able to distinguish between them at compile time. And that's what we will eventually do in the future. But so far it requires too many changes to the interpreter.
   Previously, we were using an assoc list that was traversed every time we read or set a variable. The new implementation uses a balanced tree that maps variables to the total number of their occurrences on the stack. That makes the check logarithmic in the number of unique variable names on the stack.
   A few other small optimizations:
   a) a faster function frame constructor that is not monadic (as it shouldn't be), that does not preserve the order of arguments (so that we can append them in the reversed order, using fold left), and that counts the total number of arguments, so that we don't call List.length when we're popping the frame off;
   b) a faster frame pushing mechanism - instead of just appending arguments, we are prepending them in the reverse order. This is not only faster, but since the frame is in the reverse order already, it preserves the existing order (so no changes in semantics).
2. Fixes a nasty bug in the signal dispatcher. When a signal was dispatched to several methods, each method ignored the effects of the other methods, as it reset the state to what it was before the dispatcher was entered.
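The stack described in item 1 can be sketched in plain OCaml as follows. This is only an illustration under assumptions: the names `stack`, `push`, `pop`, `is_local` and `lookup` are hypothetical and are not the identifiers used in the Primus Lisp interpreter. It shows how a balanced map from variable names to occurrence counts answers the "local or global?" question in logarithmic time.

```ocaml
(* Hypothetical sketch: names and types are illustrative, not BAP's. *)
module Vars = Map.Make (String)

type 'a stack = {
  bindings : (string * 'a) list;  (* innermost binding first *)
  counts : int Vars.t;            (* name -> number of live bindings *)
}

let empty = { bindings = []; counts = Vars.empty }

(* Push one binding and bump the occurrence counter of its name. *)
let push name value s =
  let n = Option.value (Vars.find_opt name s.counts) ~default:0 in
  { bindings = (name, value) :: s.bindings;
    counts = Vars.add name (n + 1) s.counts }

(* Pop the innermost binding; forget the name once its counter drops to zero. *)
let pop s =
  match s.bindings with
  | [] -> s
  | (name, _) :: rest ->
    let counts =
      match Vars.find_opt name s.counts with
      | Some n when n > 1 -> Vars.add name (n - 1) s.counts
      | _ -> Vars.remove name s.counts
    in
    { bindings = rest; counts }

(* Logarithmic test that decides between the local and the global scope. *)
let is_local name s = Vars.mem name s.counts

(* Walk the binding list only when the name is known to be bound locally. *)
let lookup name s =
  if is_local name s then List.assoc_opt name s.bindings else None
```

With this layout a read of a global variable never traverses the binding list, which is the case the assoc list handled worst.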
1. We do not want primitives that were registered by other components to be dropped when a new program is linked. In fact, primitives are a completely different beast than other definitions, as they are defined by OCaml modules (i.e., plugins) rather than by the Lisp code, so probably they shouldn't be a part of the program data structure at all. Though having them in the program makes things much easier. Besides, when we drop primitives during the program linking time, we are discarding all primitives that were registered prior to the call to program_link. So far, just by accident, the link_program function was called before any other components (since it was registered last - after all other lisp components). This is so fragile that I consider it a bug. A nasty one.
2. The `init_env` function in the Primus.Lisp module was adding special variables of the form `%<id>` and `@<id>` that hold addresses of the correspondingly named program terms. It was using the local variables stack. Neither should it be using the local stack for storing these variables (as they are globals by their nature), nor should this be done during the linking time of the program. This function is now moved to the loader (that is, the component that's responsible for setting up the environment).
Parameters are lazy: the default form is not evaluated unless the parameter is used and has not been set before.
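The laziness rule can be modelled with OCaml's `Lazy` module. This is a minimal sketch under assumptions, not the actual implementation: the `parameter` record and the `defparameter`, `set` and `get` functions are hypothetical OCaml stand-ins for the Lisp-level form, chosen only to show when the default form gets evaluated.

```ocaml
(* Hypothetical model of a parameter with a lazily evaluated default form. *)
type 'a parameter = {
  default : 'a Lazy.t;        (* forced at most once, on the first use *)
  mutable user : 'a option;   (* a value set explicitly by the user *)
}

(* Declare a parameter; the default form is suspended, not evaluated. *)
let defparameter default = { default; user = None }

(* A user-provided value shadows the default form. *)
let set p v = p.user <- Some v

(* Force the default form only if the user never set the parameter. *)
let get p =
  match p.user with
  | Some v -> v
  | None -> Lazy.force p.default
```

For example, `get (defparameter (lazy (expensive ())))` runs the hypothetical `expensive` function once, while calling `set` before the first `get` means it is never run.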
This implementation adds the following features that are useful during the analysis:
- the upper limit to the maximum size of the allocated memory chunk
- the upper limit to the maximum size of the malloc heap
- malloc guards with optional coloring
- efficient calloc that doesn't a) take memory b) take time
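One way to get a calloc that takes neither memory nor time is to record the fill value in the chunk instead of writing every byte. The sketch below illustrates that idea together with the chunk size limit. It is an assumption about the technique, not the code of the malloc model: `chunk`, `alloc`, `calloc`, `store`, `load` and `max_chunk_size` are all hypothetical names.

```ocaml
module Mem = Map.Make (Int)

(* Hypothetical chunk model: bytes are materialized only when stored. *)
type chunk = {
  base : int;
  size : int;
  fill : int option;     (* None: unspecified contents, Some b: reads as b *)
  written : int Mem.t;   (* only the bytes that were explicitly stored *)
}

(* Hypothetical limit, standing in for a parameter of the malloc model. *)
let max_chunk_size = 1 lsl 20

(* Constant-time allocation; requests above the limit are refused. *)
let alloc ?fill ~base size =
  if size > max_chunk_size then None
  else Some { base; size; fill; written = Mem.empty }

(* calloc only remembers that the chunk reads as zero. *)
let calloc ~base size = alloc ~fill:0 ~base size

let store chunk addr byte =
  { chunk with written = Mem.add addr byte chunk.written }

(* A stored byte wins; otherwise fall back to the fill value, if any. *)
let load chunk addr =
  match Mem.find_opt addr chunk.written with
  | Some byte -> Some byte
  | None -> chunk.fill
```

In this model the cost of `calloc` does not depend on the requested size, and the memory footprint grows only with the number of bytes actually stored.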
There are two entities in Primus Lisp that can be evaluated - functions and methods. Both are now runnable directly from OCaml via the `eval_fun` and `eval_method` functions.
So far we can generate only the index, but that is still not that bad.
- merges different definitions of the same entity
- removes quotations
- removes duplicated whitespace
An address can have multiple names, but a name shall have only one address, not vice versa.
gitoleg
approved these changes
Mar 19, 2018
ivg
added a commit
to ivg/bap
that referenced
this pull request
Apr 4, 2018
One of the previous commits [1] in PR BinaryAnalysisPlatform#798 claimed that we do not need to preserve the order of the arguments, so we could provide a more efficient frame allocation. Apparently, we do need to preserve the order, at least for the proper signaling of the `call` and `call-return` messages. This commit reverses the order of the arguments back, since they were reversed. [1]: BinaryAnalysisPlatform@6d5afcd
ivg
added a commit
that referenced
this pull request
Apr 9, 2018
* skips evaluations of the mem variable in the store operation
  since this variable doesn't have any meaningful value
* fixes the cleanup procedure after term evaluation
  In case of abnormal termination the cleanup procedure wasn't called.
* ignores return statements
  This is a big change actually, and we will return to it later, to make it more robust. If we don't ignore them, then we will have double returns: the first one when a function returns based on the IR return statement (if such is present and is well defined), and the second one when the function exec finishes and we finally call the return destination. That was a disaster, and this is the fast solution. I will evolve it later.
* adds machine-switch and machine-fork observations
  Also simplifies the switch and fork operators. For some reason we were holding a machine identifier in the stored continuation; probably this is a remnant of the debugging stage. In any case, we don't need it anymore.
* halts machine after switch in the greedy scheduler
  It shouldn't have any observable effect, though it is better to make it explicit.
* disables call bypassing
  When we make a call we add a failsafe machine that will resume the computation in case the call doesn't return. We shouldn't use this failsafe machine if everything went all right.
* handles empty blocks with branches correctly
* few cosmetic changes
* fixes the parameter order in linked lisp stubs
  One of the previous commits [1] in PR #798 claimed that we do not need to preserve the order of the arguments, so we could provide a more efficient frame allocation. Apparently, we do need to preserve the order, at least for the proper signaling of the `call` and `call-return` messages. This commit reverses the order of the arguments back, since they were reversed. [1]: 6d5afcd
* fixes strncpy, strncmp and adds strpbrk summaries
* parametrizes memory allocator with the zero-sentinel
  That's a value that is returned when malloc is called with the zero argument.
* implements a buffer overflow check
  This check is a part of the memcheck check suite and it verifies that functions from the string.h API are called correctly. The correctness property is the following: if the pointer to the beginning of a string that is passed to the function belongs to some heap region, then the pointer to the end of this string must belong to the same heap region.
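The correctness property of the buffer overflow check can be written down as a small predicate. The following is a sketch under assumptions: the `region` type and the `contains`, `region_of` and `same_region` functions are hypothetical, and heap regions are modelled as a plain list of half-open address intervals rather than the data structure the memcheck suite actually uses.

```ocaml
(* Hypothetical heap region covering addresses [base, base + size). *)
type region = { base : int; size : int }

let contains r p = p >= r.base && p < r.base + r.size

(* Find the heap region, if any, that a pointer belongs to. *)
let region_of heap p = List.find_opt (fun r -> contains r p) heap

(* The property holds when [first] is not a heap pointer at all, or when
   [last] falls into the same region as [first]. *)
let same_region heap ~first ~last =
  match region_of heap first with
  | None -> true
  | Some r -> contains r last
```

A string that starts in one chunk and ends past its last byte, or inside a neighbouring chunk, violates the property, while pointers outside the heap are not judged by this check.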
This PR provides various enhancements to the Primus Lisp subsystem that aim to make it faster and easier to use.
Documentation generator
This PR provides a crude, quickly written autodoc generator that dumps docstrings in the dot format (that I manually translate to html using emacs); an example can be found here. This is very preliminary, and only the API index is generated (no documentation for modules, aka features). Anyway, I believe it is much better than nothing.
Makes the Primus Lisp interface public
Basically, the whole interface is just two functions: `eval_fun` and `eval_method`. Now it is possible to run Lisp programs without actually having the IR.
Adds runtime parameters
This is a new feature in Primus Lisp. Now a module can specify parameters using the `defparameter` form. Parameters are like any other global variables, except that they can be documented and have a default form. The default form is not evaluated until the parameter is used. It is also not evaluated if the parameter is set by a user. Probably the best place to initialize the parameters is the `init` method.
Enhanced `malloc` Model
Using the new parameters mechanism we enhanced the malloc model with several new features:
The new implementation is also using a more efficient calloc model with the O(1) memory and time complexity (unlike previous O(N)).
More precise control flow observations
A new `jumping` observation is much more precise and easier to use than the existing `enter/leave-jmp`, as it operates on a lower level and doesn't lose the information about the value origins. This event also made it possible to remove the TCF precondition from the Primus Lisp or Run modules/passes. (We still leave the TCF pass as a dependency though: although it is not necessary, it still makes analysis easier.)
Bug fixes
Makes Primus Lisp easier to bootstrap
Previously some passes had to be run to initialize the Primus Lisp subsystem. It was quite unclear how all the machinery interacts and what the task of each pass is. Now this is rectified, as all Primus Lisp plugins perform their effects during the load time, so it is only necessary to do
Plugins.run ~provides:["primus"] ()
to get the Primus subsystem up and running... err, not running fortunately, but ready to be run.
Implementation optimization
This PR proposes a more optimal implementation of the Primus local variables stack, making it O(log N) instead of O(N), and removes unnecessary stuff from it to keep it lighter.
For further details please consider reading the commit messages, and thanks for reading and reviewing.