-
-
Notifications
You must be signed in to change notification settings - Fork 164
Compact AST Representation
andychu edited this page Jan 25, 2020
·
54 revisions
- Reddit Thread: Representing ASTs as byte strings with with small integers rather than pointers
- sajson: Last week, I described a JSON parse tree data structure that, worst case, requires N words for N characters of JSON text. Single allocation JSON?
- paper on on totally recompiling the code to make it a linear pass.
-
Compiling tree transforms to operate on packed representations at ECOOPLDI 2017
- Conference Presentation on YouTube
- This is much more ambitious, but also more limited. Not only are there no pointers, but the tree traversal is equivalent to a linear scan through memory. It compiles a custom programming language and rearranges the operations so this is true.
- See limitations at the end of the paper.
- Good references in that papers:
- The Lattner/Adve paper below
- Cache-Conscious Data Structure Layout
- Compact Normal Form in Glasgow Haskell Compiler -- reduces both space and garbage collection pressure.
- Ken Thompson's postfix encoding for regexes (no pointers) is mentioned here: https://swtch.com/~rsc/regexp/regexp1.html
This is related: more general, but more complex.
- Reddit: Transparent Pointer Compression for Linked Data Structures by Lattner, Adve -- divide address in to 4 GiB pools so that internal pointers are 32-bit.
- Jai Demo: Relative Pointers
- Object-Relative Addressing: Compressed Pointers in 64-bit Java Virtual Machines
- Reddit: Idea for Tiny Relative Pointers. Using the LSB for external vs. internal might be a good idea. I also wanted a bit for const vs. non-const.
-
Vertical object layout and compression for fixed heaps
- Implemented in the Virgil Language by Ben Titzer
- Vertical Object Layout basically means transposing objects and fields. Analogous to arrays of structs -> struct of arrays. Except the "array" is every instance of a given type on the heap.
-
CppCon 2019: Steven Pigeon “Small is beautiful: Techniques to minimise memory footprint” -- a lot of C++ metaprogramming tricks with
constexpr
, etc. Goes to the extreme on size but says little about speed. No benchmarks and no concrete applications, just small pieces of C++. -
Why Sorbet Is Fast -- Fast static type checker for Ruby written in nice C++. 32-bit IDs for interned strings for fast comparison.
GlobalState
andRef
.
- Oil Blog Post: What is oheap?
- Oheap V1 was a read-only format for transporting an ASDL tree across the Python/C++ boundary. Though it turned out there wasn't a single clean boundary in the shell's architecture.
-
OHeap2 was a read-write format, for OVM2.
- At first I used it to represent .pyc files (Python's
CodeObject
) - Meant to replace marshal, pickle, and zipimport.
- 4-16-N model: 4 byte refs, 16 byte cells, and N byte slabs.
- See
ovm2/oheap2.py
- OHeap2 might still be a good idea, but I judged OVM2 not to be a good idea
- At first I used it to represent .pyc files (Python's