[WIP / RFC] Cranelift: Basic support for EGraph roundtripping. #4249
Commits on Jun 9, 2022
-
[WIP / RFC] Cranelift: Basic support for EGraph roundtripping.
This is a work-in-progress, and meant to sketch the direction I've been thinking in for a mid-end framework. A proper BA RFC will come soon.
This PR builds a phase in the optimization pipeline that converts a CLIF CFG into an egraph representing the function body. Each node represents an original instruction or operator. The "skeleton" of side-effecting instructions is retained, but non-side-effecting (pure) operators are allowed to "float": the egraph will naturally deduplicate them during build, and we will determine their proper place when we convert back to a CFG representation.
The conversion from the egraph back to the CFG is done via a new algorithm I call "scoped elaboration". The basic idea is to do a preorder traversal of the domtree, and at each level, evaluate the values of the eclasses called upon by the side-effect skeleton, memoizing in an eclass-to-SSA-value map. This map is a scoped hashmap, with scopes at each domtree level. In this way, (i) when a value is computed in a location that dominates another instance of that value, the first replaces the second; but (ii) we never produce "partially dead" computations, i.e. we never hoist to a level in the domtree where a node is not "anticipated" (always eventually computed). This exactly matches what GVN does today. With a small tweak, it can also subsume LICM: we need to be loop-nest-aware in our recursive eclass elaboration, and potentially place nodes higher up the domtree (and higher up in the scoped hashmap).
Unlike what I had been thinking in Monday's meeting, this produces CLIF out of the egraph and then allows that to be lowered. It's overall simpler and a better starting point (thanks @abrown for tipping me over the edge on this). The way it produces CLIF now could be made more efficient: it could reuse instructions already in the DFG for nodes that are *not* duplicated (likely most of them) rather than clearing all and repopulating.
This PR does *not* do anything to actually rewrite in the egraph. That's the next step! I need to work out exactly how to integrate ISLE with some sort of rewrite machinery. I have some ideas about efficient dispatch with an "operand-tree discriminants shape analysis" on the egraph and indexing rules by their matched shape; more to come.
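To make the scoped-elaboration idea concrete, here is a minimal sketch, not the code in this PR. It assumes a toy egraph with exactly one node per eclass, a domtree given as per-block child lists, and a per-block list of eclasses demanded by the side-effect skeleton. `Node`, `ScopedMap`, `Elaborator` and `run` are hypothetical names made up for illustration.

```rust
use std::collections::HashMap;

/// Toy pure node: an operator name plus the eclass ids of its arguments.
/// (Assumption: one node per eclass, so no extraction choice is modeled.)
struct Node {
    op: &'static str,
    args: Vec<usize>,
}

/// Eclass-to-SSA-value map with one scope per domtree level.
struct ScopedMap {
    scopes: Vec<HashMap<usize, String>>,
}

impl ScopedMap {
    fn get(&self, eclass: usize) -> Option<&String> {
        // Any hit lives in this block or in a block that dominates it.
        self.scopes.iter().rev().find_map(|s| s.get(&eclass))
    }
    fn insert(&mut self, eclass: usize, value: String) {
        self.scopes.last_mut().expect("no open scope").insert(eclass, value);
    }
}

struct Elaborator<'a> {
    nodes: &'a [Node],                  // eclass id -> node
    domtree_children: &'a [Vec<usize>], // block -> blocks it immediately dominates
    roots: &'a [Vec<usize>],            // block -> eclasses the skeleton demands
    map: ScopedMap,
    next_value: usize,
    out: Vec<(usize, String)>,          // (block, emitted instruction text)
}

impl<'a> Elaborator<'a> {
    /// Compute `eclass` in `block`, reusing a dominating computation if any.
    fn elaborate(&mut self, block: usize, eclass: usize) -> String {
        if let Some(v) = self.map.get(eclass) {
            return v.clone();
        }
        let nodes = self.nodes;
        let node = &nodes[eclass];
        let args: Vec<String> =
            node.args.iter().map(|&a| self.elaborate(block, a)).collect();
        let value = format!("v{}", self.next_value);
        self.next_value += 1;
        self.out
            .push((block, format!("{} = {} {}", value, node.op, args.join(", "))));
        self.map.insert(eclass, value.clone());
        value
    }

    /// Preorder domtree walk: open a scope, elaborate, recurse, close the scope.
    fn visit(&mut self, block: usize) {
        self.map.scopes.push(HashMap::new());
        let roots = self.roots;
        for &r in &roots[block] {
            self.elaborate(block, r);
        }
        let children = self.domtree_children;
        for &c in &children[block] {
            self.visit(c);
        }
        self.map.scopes.pop();
    }
}

/// Elaborate a whole function whose entry block is block 0.
fn run(
    nodes: &[Node],
    domtree_children: &[Vec<usize>],
    roots: &[Vec<usize>],
) -> Vec<(usize, String)> {
    let mut e = Elaborator {
        nodes,
        domtree_children,
        roots,
        map: ScopedMap { scopes: Vec::new() },
        next_value: 0,
        out: Vec::new(),
    };
    e.visit(0);
    e.out
}
```

With an entry block dominating two sibling blocks that both demand the same `add`, this sketch emits the `add` once per sibling and does not hoist it into the entry block. If the entry block itself demands the `add`, the dominated blocks reuse that one value.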
Commit 7768e74
-
Commit 6e6d674
-
Commit fb37175
Commits on Jul 8, 2022
-
Commit 80b6e65
Commits on Jul 18, 2022
-
Rework `cranelift-egraph` to use more arenas.
This moves to a strategy based on `hashbrown::raw::RawTable` and "eq-with-context" / "hash-with-context" traits, used to allow nodes to be stored once in a `BumpVec` (which encapsulates a range in a shared `Vec`) and then used in an eclass, with keys in the deduplication hashtable referring to an eclass and enode index in that eclass.
This moves back to the enodes-without-IDs strategy used in `egg`, which makes the graph rebuild simpler, but retains the single-storage / no-cloning property.
Along the way, carrying through the eq-with-context / hash-with-context to the `Node` itself, we can remove the `&mut [Id]` and `&mut [Type]` slices and replace with `BumpVec`s. Actually they could be made `NonGrowingBumpVec`s for 8 bytes instead of 12; that is future work.
This should allow equivalent algorithmic approaches to `egg` but without the separate allocations; all allocations for nodes are now within the entity-component-system-style large `Vec`s.
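As an illustration of the "eq-with-context" / "hash-with-context" idea, the following sketch uses only the standard library. A bucketed `HashMap` stands in for `hashbrown::raw::RawTable`, and every name here (`NodeKey`, `Arena`, `CtxHash`, `CtxEq`, `CtxMap`) is a hypothetical stand-in rather than the PR's actual types. The point it shows is that a key is just a small range into shared storage, so hashing and equality must be given the storage as context.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key: an opcode plus a range into a shared operand arena.
/// It deliberately has no `Hash`/`Eq` of its own.
#[derive(Clone, Copy, Debug)]
struct NodeKey {
    op: u8,
    base: u32,
    len: u32,
}

/// Hashing and equality that need an external context to see the operands.
trait CtxHash<K> {
    fn ctx_hash(&self, k: &K) -> u64;
}
trait CtxEq<K> {
    fn ctx_eq(&self, a: &K, b: &K) -> bool;
}

/// Shared storage: all operands of all nodes live in one `Vec`.
struct Arena {
    operands: Vec<u32>,
}

impl Arena {
    fn alloc(&mut self, op: u8, args: &[u32]) -> NodeKey {
        let base = self.operands.len() as u32;
        self.operands.extend_from_slice(args);
        NodeKey { op, base, len: args.len() as u32 }
    }
    fn slice(&self, k: &NodeKey) -> &[u32] {
        &self.operands[k.base as usize..(k.base + k.len) as usize]
    }
}

impl CtxHash<NodeKey> for Arena {
    fn ctx_hash(&self, k: &NodeKey) -> u64 {
        let mut h = DefaultHasher::new();
        k.op.hash(&mut h);
        self.slice(k).hash(&mut h);
        h.finish()
    }
}

impl CtxEq<NodeKey> for Arena {
    fn ctx_eq(&self, a: &NodeKey, b: &NodeKey) -> bool {
        a.op == b.op && self.slice(a) == self.slice(b)
    }
}

/// Stand-in for a raw table: buckets indexed by the context-computed hash.
struct CtxMap<V> {
    buckets: HashMap<u64, Vec<(NodeKey, V)>>,
}

impl<V: Copy> CtxMap<V> {
    fn new() -> Self {
        CtxMap { buckets: HashMap::new() }
    }
    fn insert<C>(&mut self, ctx: &C, k: NodeKey, v: V)
    where
        C: CtxHash<NodeKey> + CtxEq<NodeKey>,
    {
        self.buckets.entry(ctx.ctx_hash(&k)).or_default().push((k, v));
    }
    fn get<C>(&self, ctx: &C, k: &NodeKey) -> Option<V>
    where
        C: CtxHash<NodeKey> + CtxEq<NodeKey>,
    {
        let bucket = self.buckets.get(&ctx.ctx_hash(k))?;
        bucket
            .iter()
            .find(|(other, _)| ctx.ctx_eq(other, k))
            .map(|&(_, v)| v)
    }
}
```

Two keys that point at different arena offsets but hold the same opcode and operands compare equal through the context, which is what deduplication needs, while the map itself never clones operand data.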
Commit ff48dcd
Commits on Jul 19, 2022
-
egraph rebuilding; and get some more perf by pre-allocating with an estimated node count
Commit b064c4d
-
Make `Node` 4 bytes smaller by removing `cap` in the `BumpVec` (`BumpSlice` instead).
With this change, `wasmtime compile spidermonkey.wasm` with `use_egraphs=true` and `opt_level=none` is 4.3% slower than with `opt_level=speed`. (This is an interesting comparison because the egraph build/extract roundtrip subsumes GVN, the main time-sink in today's opt pass.)
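The size saving can be checked with a small sketch. The field names and the use of `u32` below are assumptions for illustration, not the real `BumpVec` definition; the sketch only shows that dropping a 32-bit capacity field from a (base, len, cap) range descriptor takes it from 12 bytes to 8.

```rust
use std::mem::size_of;

/// Hypothetical growable range into a shared arena: needs a capacity.
#[allow(dead_code)]
#[repr(C)]
struct GrowableRange {
    base: u32,
    len: u32,
    cap: u32,
}

/// Hypothetical fixed-size range: capacity is implied by `len`.
#[allow(dead_code)]
#[repr(C)]
struct FixedRange {
    base: u32,
    len: u32,
}

/// Bytes saved per descriptor by dropping the capacity field.
fn saved_bytes() -> usize {
    size_of::<GrowableRange>() - size_of::<FixedRange>()
}
```

Because a node embeds such a descriptor, the saving applies to every node in the graph.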
Commit cd8078c
Commits on Jul 20, 2022
-
Commit b6cb9e2
Commits on Jul 22, 2022
-
Generate ISLE prelude for mid-end opts, and update build infra for separate ISLE environment.
Commit 1676bca
Commits on Jul 23, 2022
-
Lots of TODOs still. The main one is how to deal with *multiple* rewrites arising from a single rule invocation on an eclass. I think what needs to happen is that we have multi-ctors as well as multi-etors; these return a `SmallVec<[T; 8]>` or something of the sort. So then the `simplify` toplevel returns a list of eclass IDs it has built that are equivalent to the original. This means that we then need to have two different ABIs for multi-terms: "eager" (internal multi-ctor) and "lazy" (external multi-etor). But this is workable I think. The egraph implementation needs much more stress-testing as well!
Commit f562972
Commits on Jul 27, 2022
-
Support multi-constructors as well as multi-extractors.
This is necessary to allow the `simplify` rule in the mid-end to return multiple Ids of new e-classes that are equivalent to the original. In general, multiplicity is a property of an extractor or constructor (namely, a single match returns multiple results/tuples), not just an external extractor.
Commit 784bc87
-
Commit 3d5abd0
-
Commit b2721ce
Commits on Jul 29, 2022
-
Commit 8721c34
Commits on Jul 30, 2022
-
Some basic optimizations (const prop, algebraic identities), and LICM-in-egraph-extraction.
Commit d988b7e
-
Commit 37734f7
Commits on Aug 3, 2022
-
Commit 63cee8f
-
Rework egraph to use a new acyclic scheme with one rewrite pass.
More details coming in a separate writeup! This is more-or-less as fast as baseline Cranelift, when compared in the following way: compiling SpiderMonkey, acyclic-egraph plus rewrite rules (cprop and some algebraic identities) with handwritten GVN and LICM disabled (subsumed by egraph elaboration), vs. baseline Cranelift with those opts enabled. The egraph variant is ~1% slower.
Commit 6180b6a
-
Rework CtxHashMap to have Entry interface and use it when building egraph to avoid double-lookups.
Commit e8112d7
Commits on Aug 6, 2022
-
Commit 1da6c9b
-
Commit a1b4ecb
-
Optimizations: cached hashcode in nodes; avoid resizing scoped hashmap in elaboration
Commit 8c231e4
-
Commit b4e8701
-
Commit adf1739
-
Commit a0d293a
-
Commit 0156e83
-
Commit 633e110
-
Commit 092723a
Commits on Aug 7, 2022
-
optimizations: better hashcode caching; constant-time ScopedHashMap pops via generation numbers
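One way constant-time scope pops via generation numbers can work is sketched below. This is an assumption about the general technique, not a copy of the commit's code, and the type and field names are hypothetical. Each entry records the depth and generation of the scope that inserted it. Popping a scope only shortens a small vector, and stale entries are recognized lazily on lookup because their generation no longer matches.

```rust
use std::collections::HashMap;
use std::hash::Hash;

struct Entry<V> {
    value: V,
    depth: usize,
    generation: u32,
}

struct ScopedHashMap<K, V> {
    map: HashMap<K, Entry<V>>,
    next_generation: u32,
    /// Generation number of the currently open scope at each depth.
    generation_by_depth: Vec<u32>,
}

impl<K: Hash + Eq, V> ScopedHashMap<K, V> {
    fn new() -> Self {
        ScopedHashMap {
            map: HashMap::new(),
            next_generation: 1,
            generation_by_depth: vec![0], // root scope
        }
    }

    /// Open a scope: it gets a generation number never used before.
    fn push_scope(&mut self) {
        self.generation_by_depth.push(self.next_generation);
        self.next_generation += 1;
    }

    /// Close a scope in O(1): no entries are touched or removed.
    fn pop_scope(&mut self) {
        assert!(self.generation_by_depth.len() > 1, "cannot pop the root scope");
        self.generation_by_depth.pop();
    }

    /// Insert into the innermost scope. This sketch assumes callers only
    /// insert keys that are not currently visible, so there is no shadowing.
    fn insert(&mut self, key: K, value: V) {
        let depth = self.generation_by_depth.len() - 1;
        let generation = self.generation_by_depth[depth];
        self.map.insert(key, Entry { value, depth, generation });
    }

    /// An entry is live only if the scope that created it is still open.
    fn get(&self, key: &K) -> Option<&V> {
        let e = self.map.get(key)?;
        (self.generation_by_depth.get(e.depth) == Some(&e.generation))
            .then(|| &e.value)
    }
}
```

The trade-off in this sketch is that dead entries stay in the table until overwritten, in exchange for not walking an undo list on every pop.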
Commit e7ae6fb
-
Commit a8f71c5
-
Commit 9ec041e
-
Commit 9be9705
-
Commit 39625f1
Commits on Aug 8, 2022
-
Add some algebraic rules to reassociate adds to make LICM more effective; working on improving amode lowering.
Commit e9f709e
-
Commit 54ce81f
-
Commit f8287d7
-
Commit a15ddaf
Commits on Aug 9, 2022
-
ISLE: support more flexible integer constants. (bytecodealliance#4559)
The ISLE language's lexer previously used a very primitive `i64::from_str_radix` call to parse integer constants, allowing values in the range -2^63..2^63 only. Also, underscores to separate digits (as is allowed in Rust) were not supported. Finally, 128-bit constants were not supported at all.
This PR addresses all issues above:
- Integer constants are internally stored as 128-bit values.
- Parsing supports either signed (-2^127..2^127) or unsigned (0..2^128) range. Negation works independently of that, so one can write `-0xffff..ffff` (128 bits wide, i.e., -(2^128-1)) to get a `1`.
- Underscores are supported to separate groups of digits, so one can write `0xffff_ffff`.
- A minor oversight was fixed: hex constants can start with `0X` (uppercase) as well as `0x`, for consistency with Rust and C.
This PR also adds a new kind of ISLE test that actually runs a driver linked to compiled ISLE code; we previously didn't have any such tests, but it is now quite useful to assert correct interpretation of constant values.
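The parsing rules listed above can be sketched in a few lines. This is not the ISLE lexer itself: `parse_int` is a hypothetical helper, and it handles only decimal and hex. It parses the magnitude as an unsigned 128-bit value and applies negation as a separate wrapping step, which is how the sketch reproduces the "negating 2^128-1 yields 1" behavior described in the message.

```rust
/// Parse an integer literal into 128 bits (returned as `i128`).
/// Hypothetical helper for illustration; returns `None` on malformed input.
fn parse_int(text: &str) -> Option<i128> {
    // Negation is handled independently of the magnitude.
    let (negative, body) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text),
    };
    // Accept both `0x` and `0X` as the hex prefix.
    let (radix, digits) =
        match body.strip_prefix("0x").or_else(|| body.strip_prefix("0X")) {
            Some(hex) => (16, hex),
            None => (10, body),
        };
    // Underscores are digit-group separators and carry no value.
    let digits: String = digits.chars().filter(|&c| c != '_').collect();
    // Parse as unsigned so the full 0..2^128 range is accepted.
    let magnitude = u128::from_str_radix(&digits, radix).ok()?;
    let bits = if negative { magnitude.wrapping_neg() } else { magnitude };
    // Reinterpret the same 128 bits as a signed value.
    Some(bits as i128)
}
```

For example, `parse_int("0xffff_ffff")` and `parse_int("0XFF")` both succeed, and a minus sign in front of the all-ones 128-bit hex constant wraps around to 1.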
Commit 76abed9
-
Commit fee36b9
Commits on Aug 10, 2022
-
Commit cfc1137
-
Commit 951e9ff
Commits on Aug 11, 2022
-
Commit 621d61a
-
Commit ff96ddb
-
node subsumption: cprop etc means no need to keep around more complex node in a union
Commit 61c84d0
-
Commit 9bf2719
-
Commit 6b235fc
-
Search-tree pruning in extractor; egraph compilation time penalty seems to be down to 1%-ish or so again
Commit a863cfe
-
get an edge-case right: do not terminate child recursion with zero bound remaining because some nodes cost zero points
Commit 318aa56
-
Commit 732ba4e
-
Commit 26a52cb
-
Commit c41f206
Commits on Aug 17, 2022
-
Commit 79929c0