overhauls the target/architecture abstraction (3/n) #1227
Merged
ivg
merged 1 commit into
BinaryAnalysisPlatform:master
from
ivg:overhauls-targets-part-3
Oct 5, 2020
As a small teaser, I will finish it on Monday (need a bit more documentation and polishing, probably will add a couple more instructions to the systemz lifter).
ivg
force-pushed
the
overhauls-targets-part-3
branch
from
October 5, 2020 13:51
80ace08
to
fb48bf6
Compare
In this episode, we liberate `bap mc` and `bap objdump` from the bonds of the `Arch.t` representation. We also add the systemz lifter for demonstration purposes. Of course, the lifter is minimal and far from being usable, but it serves its didactic purpose well.

The interface of the `bap mc` command is preserved but is extended with a few more command-line options that provide a great deal of flexibility. Not only is it now possible to specify the target and encoding, but it is also possible to pass options directly to the backend, which is useful for disassembling targets that are not yet known to BAP. Below is an excerpt from the bap-mc man page (see `bap mc --help`):

```
SETTING ARCHITECTURE

The target architecture is controlled by several groups of options
that can not be used together:

- arch;
- target and encoding;
- triple, backend, cpu, bits, and order.

The arch option provides the least control but is easiest to use. It
relies on the dependency-injection mechanism and lets the target
support packages (plugins that implement support for the given
architecture) do their best to guess the target and encoding that
matches the provided name. Use the common names for the architecture
and it should work. You can use the bits and order options to give
more hints to the target support packages. They default to 32 and
little correspondingly.

The target and encoding provides precise control over the selection
of the target and the encoding that is used to represent machine
instructions. The encoding field can be omitted and will be deduced
from the target. Use bap list targets and bap list encodings to get
the list of supported targets and encodings respectively.

Finally, the triple, backend, cpu,... group of options provides the
full control over the disassembler backend and bypasses the
dependency-injection mechanism to pass the specified options directly
to the corresponding backends. This enables disassembling of targets
and encodings that are not yet supported by BAP. The meanings of the
options totally depend on the selected backend and they are passed as
is to the corresponding arguments of the Disasm_expert.Basic.create
function. The bits and order default to 32 and little correspondingly
and are used to specify the number of bits in the target's addresses
and the order of bytes in the word. This group of options is useful
during the implementation and debugging of new targets and thus is
reserved for experts. Note, when this group is used the semantics of
the instructions will not be provided as it commonly requires the
target specification.
```
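To make the rule about the three option groups concrete, here is a minimal sketch that models them as a sum type and rejects mixed use. It is not how `bap mc` is implemented: the `options` and `selection` types, the `resolve` function, and the error strings are all hypothetical, and the handling of the cases the man page does not cover (no option given, `encoding` without `target`) is an assumption.

```ocaml
(* A toy model of the mutually exclusive option groups from the man page
   excerpt. Every name below is hypothetical and not part of BAP. *)

type order = Little | Big

(* Raw command-line options, each possibly absent. *)
type options = {
  arch : string option;
  target : string option;
  encoding : string option;
  triple : string option;
  backend : string option;
  cpu : string option;
  bits : int option;
  order : order option;
}

(* The resolved selection: exactly one group is in effect. *)
type selection =
  | By_arch of { name : string; bits : int; order : order }
  | By_target of { target : string; encoding : string option }
  | By_backend of {
      triple : string option;
      backend : string option;
      cpu : string option;
      bits : int;
      order : order;
    }

let empty = {
  arch = None; target = None; encoding = None; triple = None;
  backend = None; cpu = None; bits = None; order = None;
}

let resolve (o : options) : (selection, string) result =
  (* bits and order default to 32 and little, as the man page states *)
  let bits = Option.value o.bits ~default:32 in
  let order = Option.value o.order ~default:Little in
  let uses_target = o.target <> None || o.encoding <> None in
  let uses_backend =
    o.triple <> None || o.backend <> None || o.cpu <> None in
  match o.arch, uses_target, uses_backend with
  | Some name, false, false -> Ok (By_arch { name; bits; order })
  | None, true, false ->
    begin match o.target with
      | Some target -> Ok (By_target { target; encoding = o.encoding })
      (* assumption: an encoding alone is not enough *)
      | None -> Error "encoding requires target"
    end
  | None, false, true ->
    Ok (By_backend {
        triple = o.triple; backend = o.backend; cpu = o.cpu; bits; order;
      })
  (* assumption: the real tool may fall back to a default instead *)
  | None, false, false -> Error "no architecture specified"
  | _ -> Error "option groups can not be combined"
```

For instance, `resolve { empty with arch = Some "x86" }` yields the `By_arch` case with the default `bits` and `order`, while setting both `arch` and `target` yields an error. The sketch uses `Option.value`, which needs OCaml 4.08 or later.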
ivg
force-pushed
the
overhauls-targets-part-3
branch
from
October 5, 2020 19:33
fb48bf6
to
7912729
Compare
ivg
changed the title
[WIP] overhauls the target/architecture abstraction (3/n)
overhauls the target/architecture abstraction (3/n)
Oct 5, 2020
ivg
added a commit
to ivg/bap
that referenced
this pull request
Feb 22, 2021
This PR is the continuation of the BinaryAnalysisPlatform#1225, BinaryAnalysisPlatform#1226, and BinaryAnalysisPlatform#1227 series of changes that were focused on substituting the old and inextensible `Arch.t` abstraction with the new `Theory.Target.t` representation.

This episode is instigated by the upcoming implementation of the RISCV target. Since RISCV is a target that is not supported by `Arch.t`, it became a good test of the new `Theory.Target.t` abstraction. As the RISCV work showed, we still have lots of code that depends on `Arch.t`, most importantly Primus, which was fully dependent on `Arch.t`.

The main issue was that `Theory.Target.t` doesn't provide any means to encode register classes, which prevented us from using it everywhere in Primus, e.g., we need to know which register is the stack pointer in order to set up the stack. To implement this, we introduce a new abstraction called _role_. A _role_ could be generally applied to any entity but so far we are only talking about the roles of registers in various targets. The target definition now accepts the `regs` parameter that takes the register file specification with each register assigned one or more roles, e.g., here is the register file specification for 8086,

```ocaml
Theory.Role.Register.[
  (* each entry pairs a list of roles with the registers that play them *)
  [general; integer], main @< index @< segment;
  [stack_pointer], untyped [reg r16 "SP"];
  [frame_pointer], untyped [reg r16 "BP"];
  [Role.index], untyped index;
  [Role.segment], untyped segment;
  [status], untyped flags;
  [integer], untyped [
    reg bool "CF";
    reg bool "PF";
    reg bool "AF";
    reg bool "ZF";
    reg bool "SF";
    reg bool "OF";
  ]
]
```

I.e., we assign a set of roles to a set of registers. We also now have two new functions, `Theory.Target.regs` and `Theory.Target.reg`, that enable querying the register file of the target for registers that fulfill one or more roles. Whilst we publish a limited number of well-known (blessed) roles in the `Theory.Role.Register` module, more roles can be added as users need them. For example, in the code snippet above we have two non-standard roles that are specific to the x86 architectures, `Role.index` and `Role.segment`.

With roles we can drop the dependency on Target in most of the places where it makes sense (I still left it in x86 and other target-specific plugins, which obviously are independent of the newly added architectures).
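The idea of querying a register file by role can be illustrated with a small standalone model. This is only a sketch: it does not use BAP, the `role` type, the `toy_8086` table, and the `regs` and `reg` functions below are hypothetical stand-ins, and the real `Theory.Target.regs` and `Theory.Target.reg` have richer types and options. In particular, returning `None` when several registers share a role is an assumption of this model.

```ocaml
(* A toy model of a role-annotated register file. All names are
   hypothetical; registers are plain strings to keep the sketch short. *)

type role =
  | General | Integer | Stack_pointer | Frame_pointer
  | Status | Index

(* Each entry assigns a set of roles to a set of registers. *)
type regfile = (role list * string list) list

let flags = ["CF"; "PF"; "AF"; "ZF"; "SF"; "OF"]

(* An illustrative table, not BAP's actual 8086 definition. *)
let toy_8086 : regfile = [
  [General; Integer], ["AX"; "BX"; "CX"; "DX"; "SI"; "DI"];
  [Stack_pointer], ["SP"];
  [Frame_pointer], ["BP"];
  [Index], ["SI"; "DI"];
  [Status], flags;
  [Integer], flags;
]

(* Every role under which [reg] is listed, across all entries. *)
let roles_of (file : regfile) reg =
  List.fold_left (fun acc (roles, regs) ->
      if List.mem reg regs then roles @ acc else acc)
    [] file

(* All registers of the file, sorted and without duplicates. *)
let all_regs (file : regfile) =
  List.sort_uniq compare (List.concat (List.map snd file))

(* The registers that fulfill every role in [roles]. *)
let regs ~roles file =
  List.filter (fun reg ->
      let has = roles_of file reg in
      List.for_all (fun role -> List.mem role has) roles)
    (all_regs file)

(* The register that fulfills [role], provided exactly one does. *)
let reg file role =
  match regs ~roles:[role] file with
  | [r] -> Some r
  | _ -> None
```

With this model, `reg toy_8086 Stack_pointer` finds `"SP"`, which is the kind of lookup needed to set up the stack without knowing the architecture, and `regs ~roles:[Status; Integer] toy_8086` selects the flag registers because they appear under both roles.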
ivg
added a commit
that referenced
this pull request
Feb 22, 2021