-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revamps primus lisp type checker #1047
Merged
ivg
merged 36 commits into
BinaryAnalysisPlatform:master
from
ivg:revamps-primus-lisp-type-checker
Feb 10, 2020
Merged
Revamps primus lisp type checker #1047
ivg
merged 36 commits into
BinaryAnalysisPlatform:master
from
ivg:revamps-primus-lisp-type-checker
Feb 10, 2020
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Tasks left to be done: 1) still an open question how to deal with user type variables 2) unresolved variables are not reported 3) unresolved function calls are not reported 4) error message really sucks (at best we have location)
A type variable is quantified over the scope of its defining expression, i.e., for each expression that has a non-ground signature we instantiate a new type variable of the form `x%m` where `x` is the name of the type variable and `m` is the id of the expression.
Well, at least they are more sane than were before :) 1) we track all types that were used during unification and dump them to users in hope that this will clarify them the issue 2) we disallow unresolved functions and variables and immediately lower the type to bottom if we see one. 3) the error messages are now more explonatory and include source code with the relevenat context and highlighting.
also, instead of opening a file, we just keep all the read data in the source repository.
it looks like that having disjoint types is not really useful right now in Primus Lisp, so we decided to switch to the classical inference with conjunctive types (i.e., with only meet operator, no join). It is still the gradual type as we allow the `any` type which relaxes the typing discipline and allows ill-typed expressions. For example, the `concat` primitives returns `any` as it is dependetly typed. In the same vain, we relax the typing of the `if` form and do not unify the branches, so it returns `any` (so it types depends on the condition value). The switch to the simple scheme enables much more readable and useful error messages, and we can still type all our lisp programs (modulo real type errors).
E.g., we should mips init, when we're checking for x86 (we will get lots of unbound errors). ``` (defun init (main argc argv auxv) "GNU libc initialization stub" (declare (external "__libc_start_main") (context (arch "mips"))) (set T9 main) (set RA @__libc_csu_fini) (exit-with (invoke-subroutine main argc argv))) ```
It was allowed to set an undefined variable, which is probably an error. why, probably? The Primus.Env semantics let's us create variables on the fly. Whether it is used or not, good or not is a second question, so far let's try to disable this on the typing level and later we will consider whether we need this in runtime.
(whether a method has a corresponding signal is not yet checked, comming soon)
It is now called automatically after components but before init and type errors are reported as observations. We don't want right now to stop processing in case of type errors, just in case if we still have some false positives.
the primitives shall not be registered conditionally
They were so far referring to unresolved primitives, now the type checker says no, so I've reimplemented them not as a primitive, but as a big overload.
1) there is no such function as coerce 2) implement atoi using the new language specific size functions.
and fixes the documentation for `when` and `until`.
Previously it was an ill-typed piece of code as we didn't pass the size. We employ parameters to set the size, and default to one word. Parameters are dynamically bound, so you can do as ```lisp (let ((*indirect-taint-size* 100)) (taint-introduce 'indirectly k v)) ```
this signal is defined in ddtbd, which is not yet integrated into BAP.
the `strchr` function takes an integer so we have to explicitly case chars to it.
It will typecheck Primus Lisp programs without the neccesity to run them (so far we need to run a binary to trigger the typechecker, which was quite unintuitive). All error messages will be printed with a conclusive message about the quality of the program :) Note, that when a program is run it will be automatically typechecked and the errors will be printed, so there is no need to use this option, every time you need to run a program. Note, for those who prefer to type check, there is also the `--primus-lisp-type-check` alias. But as usual, the least unambiguous prefix will work, e.g., `--primus-lisp-type` would suffice.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR fixes the long-broken Primus Lisp typechecker and fixes errors that the type checker was able to find. It also adds the
--primus-lisp-type-check
so that you can check the types without running the program.Historical Context
Primus Lisp type checker was an experiment for building a soft type inference with intersecting types. It didn't work well and the error messages were useless, so it was more or less disabled. However, we have recently discovered, that (a) not all our Lisp codes are well-typed (no surprises) and (b) that ill-typed Lisp code not always evaluates to a type error, but instead exhibits weird behavior, e.g., when a lexically scoped suddenly becomes dynamically scoped. In other words, we couldn't trust our lisps anymore.
To reestablish the trust we decided to quickly fix the type checker. This (having to do this quickly) forced us to drop intersecting types and implement more or less common HM style inference (with a diverging type 'a.'a for gradual typing). The type checker still a flow-based and runs on the callgraph. We also added a couple of checks:
The error messages are now much more readable and point to the origin of the problem, not miles away. It also features nice highlighting.
To introduce more type sinks into the system (and find more type errors) we add type signatures to signals (methods are now also type-checked) and a translation of subroutine arguments into Lisp signatures (ideally, we should be able to parse header files, but so far we just rely on whatever is in program.
The fixed type-checker found a few errors, which we fixed :)