enables writing program semantics (lifting) in Primus Lisp #1277

Merged
merged 1 commit into from
Feb 24, 2021

@ivg ivg commented Feb 23, 2021

Introduction
============

It is now possible to specify program semantics in Primus Lisp,
including the semantics of individual instructions (i.e., lifters)
and even the semantics of whole programs.

Essentially, Primus Lisp programs can now be compiled into program
values, i.e., core theory terms. The main motivation for this work was
writing lifters and stubs in Primus Lisp, which takes significantly
less effort than writing a lifter directly in OCaml. For example, it
took me less than a day to write a RISCV lifter in Primus Lisp (the PR
is upcoming).

The feature is seamlessly integrated with the knowledge base, which
makes it available not only for new architectures but also for
extending the lifters of existing architectures. Whenever the semantics
of a basic instruction is requested, a corresponding Lisp definition
(whose name matches the opcode name prefixed with the opcode
encoding, and which takes the corresponding number of arguments) is
sought and, if found, it is reified into a Core Theory term.
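
To make the lookup step concrete, here is a toy model in Python. It is only a sketch of the idea, not how BAP implements it: the registry, the `defsem` decorator, the `lift` function, the `toy` encoding, and the `:` separator between encoding and opcode are all hypothetical names chosen for illustration.

```python
# Toy model of "find a definition by encoding-prefixed opcode name and arity".
# Every name here is hypothetical; this is not BAP's API.
SEMANTICS = {}


def defsem(encoding, opcode, arity):
    """Register a semantics function under (prefixed name, arity)."""
    def register(fn):
        SEMANTICS[(f"{encoding}:{opcode}", arity)] = fn
        return fn
    return register


@defsem("toy", "ADD", 3)
def add_rrr(dst, x, y):
    # The "term" is just a string in this sketch.
    return f"{dst} := {x} + {y}"


def lift(encoding, opcode, *operands):
    """Return the term for an instruction, or None if no definition matches."""
    fn = SEMANTICS.get((f"{encoding}:{opcode}", len(operands)))
    return fn(*operands) if fn else None


print(lift("toy", "ADD", "R0", "R1", "R2"))
print(lift("toy", "SUB", "R0", "R1"))
```

In this sketch a missing definition simply yields `None`, so a caller could fall back to some other source of semantics.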

The Primus Lisp language itself underwent a few deep changes and is now a
meta-language for writing programs that write programs. Its Lisp
nature makes it an ideal choice for writing lifters thanks to the
mono-iconic representation of the host and target languages. The
Primus Lisp meta-compiler is an interpreter that partially evaluates
Primus Lisp programs until no further refinements are possible. This
results in highly specialized, minimal code. To illustrate this, let's
consider the following example (assuming that it is stored in the
`demo.lisp` file that is located in the current folder),

```lisp
(defun example1 ()
  (set R0 (/= 1 2 3)))
```

Since the value of `(/= 1 2 3)` is statically known, we reify it into
an efficient representation,

```
$ bap show example1 --primus-lisp-load=demo -tarmv5+le -obap:bil
example1:
"{
   R0 := 1
 }"
```

We still produce an assignment because the target is `armv5+le`, which
tells us that `R0` is a register. Let's consider an example where we
can't compute the input statically,
```lisp
(defun example2 ()
  (set R0 (/= R1 1 2 3)))
```

which is reified into,
```
$ bap show example2 --primus-lisp-load=demo -tarmv5+le -obap:bil
example2:
"{
   R0 := R1 <> 1 & R1 <> 2 & R1 <> 3
 }"
```
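
The two results above can be reproduced with a toy partial evaluator, sketched here in Python. It is not the Primus Lisp meta-compiler: the function name `distinct` is hypothetical, terms are plain strings, and it assumes that `/=` means "all operands are pairwise distinct", which is consistent with both outputs shown but is not confirmed by this description.

```python
from itertools import combinations


def distinct(*args):
    """Partially evaluate a Lisp-style (/= a b c ...) form.

    Integers are statically known values; strings stand for registers
    whose values are unknown until run time.
    """
    clauses = []
    for a, b in combinations(args, 2):
        if isinstance(a, int) and isinstance(b, int):
            if a == b:
                return 0      # two known operands are equal: statically false
            continue          # known and different: nothing left to check
        clauses.append(f"{a} <> {b}")
    if not clauses:
        return 1              # everything was decided statically
    return " & ".join(clauses)


print(distinct(1, 2, 3))
print(distinct("R1", 1, 2, 3))
```

Comparisons between two known values are decided immediately and disappear from the result; only comparisons that involve an unknown register are left as residual code.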

The meta-compiler supports partial computations for all forms, so it
is possible to use `let` forms (with lexical and dynamic scoping) and
define helper functions and macros, e.g.,

```lisp
(defun example3 ()
  (let ((x (average 1 2 3)))
    (set R0 (+ (average 10 R1 30) x))))

(defun average (x y z)
  (/ (+ x y z) 3))
```

will be reified to
```
$ bap show example3 --primus-lisp-load=demo -tarmv5+le -obap:bil
example3:
"{
   R0 := (R1 + 0x28) / 3 + 2
 }"
```
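
The same effect can be imitated in a few lines of Python, where calling a helper with known arguments computes a number and calling it with an unknown register builds residual code. This is a toy sketch under stated assumptions, not the actual meta-compiler: terms are strings, `lit` only mimics the literal formatting seen in the output above, and integer division stands in for the division on machine words.

```python
def lit(n):
    # Assumption: imitate the printed output (small literals in decimal, others in hex).
    return str(n) if n < 10 else hex(n)


def add(*args):
    """Fold the known integer summands; keep unknown ones symbolic."""
    known = sum(a for a in args if isinstance(a, int))
    unknown = [a for a in args if not isinstance(a, int)]
    if not unknown:
        return known
    return " + ".join(unknown + ([lit(known)] if known else []))


def div(x, y):
    """Divide now if both operands are known, otherwise emit a residual term."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    num = lit(x) if isinstance(x, int) else f"({x})"
    den = lit(y) if isinstance(y, int) else f"({y})"
    return f"{num} / {den}"


def average(x, y, z):
    return div(add(x, y, z), 3)


def example3():
    x = average(1, 2, 3)  # fully known, so this is just the number 2
    return "R0 := " + str(add(average(10, "R1", 30), x))


print(example3())
```

The helper `average` leaves no trace in the result: its body is evaluated at each call site, and only the operations that depend on `R1` remain.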

Such a mono-iconic representation is possible because the value of a
Primus Lisp expression is no longer a word but a program term, which
is represented as a knowledge base value of class
`Theory.Effect.cls`. This value is also known in `Bap.Std` as
`Insn.t`, which is exactly the type of value that lifters are
expected to provide.

@ivg ivg force-pushed the primus-lisp-lifter branch from 05c0cef to 200c025 Compare February 24, 2021 13:24
@ivg ivg merged commit 446c17b into BinaryAnalysisPlatform:master Feb 24, 2021
@ivg ivg mentioned this pull request Apr 2, 2021
@ivg ivg deleted the primus-lisp-lifter branch December 1, 2021 19:14