Skip to content

Commit

Permalink
Update README and documentation
Browse files Browse the repository at this point in the history
Ref. #230
  • Loading branch information
treiher committed Jun 2, 2020
1 parent 16e63da commit db0a14b
Show file tree
Hide file tree
Showing 4 changed files with 134 additions and 50 deletions.
134 changes: 97 additions & 37 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,39 +1,44 @@
# RecordFlux

[![Build Status](https://github.com/Componolit/RecordFlux/workflows/tests/badge.svg)](https://github.com/Componolit/RecordFlux/actions)
[![Code Coverage](https://codecov.io/github/Componolit/RecordFlux/coverage.svg?branch=master)](https://codecov.io/github/Componolit/RecordFlux)
[![Python Versions](https://img.shields.io/badge/python-3.6%20%7C%203.7-blue.svg)](https://python.org/)
[![PyPI](https://img.shields.io/pypi/v/RecordFlux?color=blue)](https://pypi.org/project/RecordFlux/)
[![Python Versions](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8-blue.svg)](https://python.org/)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![Tests](https://github.com/Componolit/RecordFlux/workflows/tests/badge.svg)](https://github.com/Componolit/RecordFlux/actions)
[![Code Coverage](https://codecov.io/github/Componolit/RecordFlux/coverage.svg?branch=master)](https://codecov.io/github/Componolit/RecordFlux)

RecordFlux is a toolset for the formal specification of messages and the generation of verifiable binary parsers and message generators.

## Message Specification Language
## Message Specification

The RecordFlux Message Specification Language is a domain-specific language to formally specify message formats of existing real-world binary protocols. Its syntax is inspired by [Ada](https://www.adacore.com/about-ada). A detailed description of the language elements can be found in the [Language Reference](/doc/Language-Reference.md).

## Model Verification
The RecordFlux specification language is a domain-specific language to formally specify message formats of existing real-world binary protocols. Its syntax is inspired by [Ada](https://www.adacore.com/about-ada). A detailed description of the language elements can be found in the [Language Reference](/doc/Language-Reference.md).

Message specifications are automatically verified using the [Z3 theorem prover](https://github.com/Z3Prover/z3). The following invariants are proven at the specification level:

* Field conditions are mutually exclusive
* Field conditions do not contradict each other
* Field conditions are not statically false
* Each field is reachable on at least one path from the initial node
* Each field has at least one path to the final node
* Message fields are always located after the first message bit
* Field length is never negative
* Message fields cover all bits of a message on all paths
* Overlaid fields are congruent with exactly one other field

## Code Generation
## SPARK Code Generation

The code generator generates message parsers and generators based on message specifications. The generated parser allows to validate and dissect messages and thereby respects all specified restrictions between message fields and related messages. The generated message generator enables the creation of messages according to the message specification. By using [SPARK](https://www.adacore.com/about-spark) we are able to prove the absence of runtime errors and prevent the incorrect usage of the generated code (e.g., enforce that a field of a message is validated before accessed).
Message parsers and generators are generated based on message specifications. The generated parser allows to validate and dissect messages and thereby respects all specified restrictions between message fields and related messages. The generated message generator enables the creation of messages according to the message specification. By using [SPARK](https://www.adacore.com/about-spark) we are able to prove the absence of runtime errors and prevent the incorrect usage of the generated code (e.g., enforce that a field of a message is validated before accessed).

The code generator creates a number of packages for a specification. All basic types like integers, enumerations and arrays are collectively declared in one package. For each message a child package is generated which contains validation, accessor and setter functions for every field of the message.
Multiple packages are generated for a specification. All basic types like integers, enumerations and arrays are collectively declared in one package. For each message a child package is generated which contains validation, accessor and setter functions for every field of the message.

A user of the generated code has to validate a message field or the whole message before accessing the data of a particular message field. The SPARK verification tools in combination with the generated contracts make it possible to ensure this property, and so prevent incorrect usage.

## Usage
The `rflx` tool is used to verify a specification and generate SPARK code based on it. It offers the sub-commands `check` and `generate` for this purpose. The sub-command `graph` allows to generate images of the graph representations of messages in a specification.

## Python Library

The `rflx` tool is used to verify a specification and generate code based on it. It offers the two sub-commands `check` and `generate` for this purpose.
PyRFLX is a Python library for rapid-prototyping and validation. It uses RecordFlux specifications for parsing and generation of messages and validates the formal specification at runtime. It can be used by importing `rflx.pyrflx`.

By default assertions and contracts are executed to ensure correct functionality. For improved performance these additional checks can be disabled by running Python with the [`-O`](https://docs.python.org/3/using/cmdline.html#cmdoption-o) switch.

## Example

Expand Down Expand Up @@ -79,31 +84,35 @@ With the sub-command `check` the correctness of the given specification file can

```Console
$ rflx check specs/tlv.rflx
Parsing specs/tlv.rflx... OK
Parsing specs/tlv.rflx
Processing TLV
```

The sub-command `generate` is used to generate the code based on the specification. The target directory and the specification files have to be given.

```Console
$ rflx generate -d generated specs/tlv.rflx
Parsing specs/tlv.rflx... OK
Generating... OK
Created generated/rflx-tlv.ads
Created generated/rflx-tlv-generic_message.ads
Created generated/rflx-tlv-generic_message.adb
Created generated/rflx-tlv-message.ads
Created generated/rflx.ads
Created generated/rflx-lemmas.ads
Created generated/rflx-lemmas.adb
Created generated/rflx-types.ads
Created generated/rflx-types.adb
Created generated/rflx-message_sequence.ads
Created generated/rflx-message_sequence.adb
Created generated/rflx-scalar_sequence.ads
Created generated/rflx-scalar_sequence.adb
Parsing specs/tlv.rflx
Processing TLV
Creating generated/rflx-tlv.ads
Creating generated/rflx-tlv-generic_message.ads
Creating generated/rflx-tlv-generic_message.adb
Creating generated/rflx-tlv-message.ads
Creating generated/rflx-rflx_builtin_types-conversions.ads
Creating generated/rflx-rflx_builtin_types.ads
Creating generated/rflx-rflx_generic_types.ads
Creating generated/rflx-rflx_lemmas.ads
Creating generated/rflx-rflx_message_sequence.ads
Creating generated/rflx-rflx_scalar_sequence.ads
Creating generated/rflx-rflx_types.ads
Creating generated/rflx-rflx_generic_types.adb
Creating generated/rflx-rflx_lemmas.adb
Creating generated/rflx-rflx_message_sequence.adb
Creating generated/rflx-rflx_scalar_sequence.adb
Creating generated/rflx.ads
```

### Use of Generated Code
### Using the Generated Code

All scalar types defined in the specification are represented by a similar Ada type in the generated code. For `TLV` the following types are defined in the package `RFLX.TLV`:

Expand All @@ -115,11 +124,11 @@ All types and subprograms related to `Message` can be found in the package `RFLX

- `type Context`
- Stores buffer and internal state
- `procedure Initialize (Ctx : out Context; Buffer : in out RFLX.Types.Bytes_Ptr)`
- `procedure Initialize (Ctx : out Context; Buffer : in out Types.Bytes_Ptr)`
- Initialize context with buffer
- `procedure Initialize (Ctx : out Context; Buffer : in out RFLX.Types.Bytes_Ptr; First, Last : RFLX.Types.Bit_Index_Type)`
- `procedure Initialize (Ctx : out Context; Buffer : in out Types.Bytes_Ptr; First, Last : Types.Bit_Index)`
- Initialize context with buffer and explicit bounds
- `procedure Take_Buffer (Ctx : in out Context; Buffer : out RFLX.Types.Bytes_Ptr)`
- `procedure Take_Buffer (Ctx : in out Context; Buffer : out Types.Bytes_Ptr)`
- Get buffer and remove it from context (note: buffer cannot put back into context, thus further verification of message is not possible after this action)
- `function Has_Buffer (Ctx : Context) return Boolean`
- Check if context contains buffer (i.e. non-null pointer)
Expand All @@ -145,15 +154,15 @@ All types and subprograms related to `Message` can be found in the package `RFLX
- Get value of `Tag` field
- `function Get_Length (Ctx : Context) return Length_Type`
- Get value of `Length` field
- `generic with procedure Process_Value (Value : RFLX.Types.Bytes); procedure Get_Value (Ctx : Context)`
- `generic with procedure Process_Value (Value : Types.Bytes); procedure Get_Value (Ctx : Context)`
- Access content of `Value` field
- `function Valid_Next (Ctx : Context; Fld : Field) return Boolean`
- Check if field is potential next field
- `procedure Set_Tag (Ctx : in out Context; Value : Tag)`
- Set value of `Tag` field
- `procedure Set_Length (Ctx : in out Context; Value : Length)`
- Set value of `Length` field
- `generic with procedure Process_Value (Value : out RFLX.Types.Bytes); procedure Set_Value (Ctx : in out Context)`
- `generic with procedure Process_Value (Value : out Types.Bytes); procedure Set_Value (Ctx : in out Context)`
- Set content of `Value` field
- `procedure Initialize_Value (Ctx : in out Context)`
- Initialize `Value` field (precondition to switch context for generating contained message)
Expand Down Expand Up @@ -228,13 +237,64 @@ begin
end Main;
```

## Dependencies
### Using the Python Library

The following code shows how PyRFLX can be used to parse and generate messages in Python:


```Python
import sys

from rflx.pyrflx import MessageValue, PyRFLX

PYRFLX = PyRFLX(["specs/tlv.rflx"])
TLV = PYRFLX["TLV"]


def parse_message(input_bytes: bytes) -> MessageValue:
msg = TLV["Message"]
msg.parse(input_bytes)
return msg


def create_message() -> MessageValue:
msg = TLV["Message"]
msg.set("Tag", "Msg_Data")
msg.set("Length", 4)
msg.set("Value", b"\x01\x02\x03\x04")
return msg


- [Python >=3.6](https://www.python.org)
if parse_message(b"\x40\x04\x01\x02\x03\x04") != create_message():
sys.exit("Error")
```

## Installation

RecordFlux can be installed from PyPI:

```Console
$ pip3 install RecordFlux
```

By default the following dependencies are installed:

- [icontract](https://github.com/Parquery/icontract)
- [PyParsing](https://github.com/pyparsing/pyparsing/)
- [PyDotPlus](https://github.com/carlos-jenkins/pydotplus)
- [Z3](https://github.com/Z3Prover/z3)
- [GNAT Community 2020](https://www.adacore.com/download)

Additionally, [GNAT Community 2020](https://www.adacore.com/download) is needed for compiling and verifying generated SPARK code.

## Limitations

A list of known limitations for version 0.4.0 can be found [here](https://github.com/Componolit/RecordFlux/issues?q=is%3Aissue+label%3Alimitation+label%3Av0.4.0).

## Background

More information about the theoretical background can be found in our paper:

Reiher T., Senier A., Castrillon J., Strufe T. (2020) RecordFlux: Formal Message Specification and Generation of Verifiable Binary Parsers. In: Arbab F., Jongmans SS. (eds) Formal Aspects of Component Software. FACS 2019. Lecture Notes in Computer Science, vol 12018. Springer, Cham ([paper](https://doi.org/10.1007/978-3-030-40914-2_9), [preprint](https://arxiv.org/abs/1910.02146))

## Licence

Expand Down
2 changes: 1 addition & 1 deletion doc/Language-Reference.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Language Reference

The RecordFlux DSL describes protocol message formats based on types. For each type of the specification language a description of its syntax and semantics and an example is given. A simple variant of Backus-Naur Form is used to describe the syntax. Reserved keywords and literals are marked in bold. The following basic elements are used to describe the syntax of the language:
The specification language describes protocol message formats based on types. For each type of the specification language a description of its syntax and semantics and an example is given. A simple variant of Backus-Naur Form is used to describe the syntax. Reserved keywords and literals are marked in bold. The following basic elements are used to describe the syntax of the language:

*name*: A name consists of alphanumeric characters and underscores. By convention a name starts with a capital and after each underscore follows a capital as well (e.g., Mixed_Case_With_Underscores).

Expand Down
7 changes: 0 additions & 7 deletions doc/Limitations.md

This file was deleted.

41 changes: 36 additions & 5 deletions tools/check_doc.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,10 @@


class CodeBlockType(enum.Enum):
UNKNOWN = 0
RFLX = 1
ADA = 2
UNKNOWN = 3
PYTHON = 3


def check_code_blocks() -> bool:
Expand All @@ -32,6 +33,8 @@ def check_code_blocks() -> bool:
block_type = CodeBlockType.RFLX
elif l.endswith("Ada\n"):
block_type = CodeBlockType.ADA
elif l.endswith("Python\n"):
block_type = CodeBlockType.PYTHON
else:
block_type = CodeBlockType.UNKNOWN
continue
Expand All @@ -45,17 +48,27 @@ def check_code_blocks() -> bool:
block += l

pathlib.Path("build").mkdir(exist_ok=True)
os.symlink("../specs", "build/specs")
os.chdir("build")

for block_type, block in code_blocks:
if block_type is CodeBlockType.RFLX:
valid = check_rflx_code(block) and valid
if block_type is CodeBlockType.ADA:
valid = check_ada_code(block) and valid
valid = check_code(block, block_type) and valid

os.unlink("specs")

return valid


def check_code(block: str, block_type: CodeBlockType) -> bool:
if block_type is CodeBlockType.RFLX:
return check_rflx_code(block)
if block_type is CodeBlockType.ADA:
return check_ada_code(block)
if block_type is CodeBlockType.PYTHON:
return check_python_code(block)
return True


def check_rflx_code(block: str) -> bool:
valid = True

Expand Down Expand Up @@ -89,6 +102,24 @@ def check_ada_code(block: str) -> bool:
return valid


def check_python_code(block: str) -> bool:
valid = True
filename = "test.py"

with open(filename, "w") as f:
f.write(block)

try:
subprocess.run(["python3", filename], check=True)
except subprocess.CalledProcessError:
valid = False
print(f"\naffected code block:\n\n{block}")

os.unlink(filename)

return valid


if __name__ == "__main__":
if not check_code_blocks():
sys.exit("incorrect code block")

0 comments on commit db0a14b

Please sign in to comment.