[WIP / RFC] Cranelift: Basic support for EGraph roundtripping. #4249

Closed
wants to merge 52 commits into from
7768e74
[WIP / RFC] Cranelift: Basic support for EGraph roundtripping.
cfallin Jun 9, 2022
6e6d674
fix tests
cfallin Jun 9, 2022
fb37175
Bugfixes: successfully roundtrips/runs SpiderMonkey and all Wasmtime …
cfallin Jun 9, 2022
80b6e65
Use custom egraph implementation for speed.
cfallin Jul 8, 2022
ff48dcd
Rework `cranelift-egraph` to use more arenas.
cfallin Jul 18, 2022
b064c4d
egraph rebuilding; and get some more perf by pre-allocating with an e…
cfallin Jul 19, 2022
cd8078c
Make `Node` 4 bytes smaller by removing `cap` in the `BumpVec` (`Bump…
cfallin Jul 19, 2022
b6cb9e2
ISLE: implement multi-extractors, necessary for egraph/enode matching.
cfallin Jul 20, 2022
1676bca
Generate ISLE prelude for mid-end opts, and update build infra for se…
cfallin Jul 22, 2022
f562972
Rewrites working!
cfallin Jul 23, 2022
784bc87
Support multi-constructors as well as multi-extractors.
cfallin Jul 27, 2022
3d5abd0
Updates to use multi-constructors.
cfallin Jul 27, 2022
b2721ce
Bugfix in ISLE trie ordering: fallible before infallible etors.
cfallin Jul 27, 2022
8721c34
Implement LICM in egraph lowering.
cfallin Jul 29, 2022
d988b7e
Some basic optimizations (const prop, algebraic identities), and LICM…
cfallin Jul 30, 2022
37734f7
Very basic eq-sat limit: 10 fixpoint iters. Better fuel TODO.
cfallin Jul 30, 2022
63cee8f
Support multi-extractors using an iterator scheme instead.
cfallin Aug 3, 2022
6180b6a
Rework egraph to use a new acyclic scheme with one rewrite pass.
cfallin Aug 3, 2022
e8112d7
Rework CtxHashMap to have Entry interface and use it when building eg…
cfallin Aug 3, 2022
1da6c9b
Alias analysis integration into egraph opts.
cfallin Aug 6, 2022
a1b4ecb
Fix unioning in auxiliary unionfind use by store-to-load addr canonic…
cfallin Aug 6, 2022
8c231e4
Optimizations: cached hashcode in nodes; avoid resizing scoped hashma…
cfallin Aug 6, 2022
b4e8701
egraph: avoid dedup hashcons map for non-pure nodes (they never merge)
cfallin Aug 6, 2022
adf1739
optimization
cfallin Aug 6, 2022
a0d293a
optimize node hashing
cfallin Aug 6, 2022
0156e83
properly optimize readonly/notrap loads
cfallin Aug 6, 2022
633e110
Add back some passes prior to egraph build. DCE is important after le…
cfallin Aug 6, 2022
092723a
avoid visit_block_succs in alias analysis; use precomputed CFG instead
cfallin Aug 6, 2022
e7ae6fb
optimizations: better hashcode caching; constant-time ScopedHashMap p…
cfallin Aug 7, 2022
a8f71c5
Hash/Eq by canonical ids
cfallin Aug 7, 2022
9ec041e
canonicalize in store-to-load match check
cfallin Aug 7, 2022
9be9705
instrument egraph stages with stats
cfallin Aug 7, 2022
39625f1
some more stats
cfallin Aug 7, 2022
e9f709e
Add some algebraic rules to reassociate adds to make LICM more effect…
cfallin Aug 8, 2022
54ce81f
fix licm-reassociate: do not use a no-op bitcast to force hoist
cfallin Aug 8, 2022
f8287d7
Enough from simple_preopt to now exactly match baseline on bz2 on x64
cfallin Aug 8, 2022
a15ddaf
Updated TODO.
cfallin Aug 8, 2022
76abed9
ISLE: support more flexible integer constants. (#4559)
cfallin Jul 29, 2022
fee36b9
optimizations
cfallin Aug 9, 2022
cfc1137
Rematerialization of op-imm and immediates in each basic block where …
cfallin Aug 10, 2022
951e9ff
fix remat!
cfallin Aug 10, 2022
621d61a
fix aarch64
cfallin Aug 11, 2022
ff96ddb
Optimize alias analysis somewhat.
cfallin Aug 11, 2022
61c84d0
node subsumption: cprop etc means no need to keep around more complex…
cfallin Aug 11, 2022
9bf2719
slightly cheaper extraction by memoizing on canonical, not latest, id
cfallin Aug 11, 2022
6b235fc
elaboration/extraction: shortcut for side-effectful nodes
cfallin Aug 11, 2022
a863cfe
Search-tree pruning in extractor; egraph compilation time penalty see…
cfallin Aug 11, 2022
318aa56
get an edge-case right: do not terminate child recursion with zero bo…
cfallin Aug 11, 2022
732ba4e
optimizations to elaboration
cfallin Aug 11, 2022
26a52cb
Inline All The Things
cfallin Aug 11, 2022
c41f206
Minor opt: do not re-match on node children
cfallin Aug 11, 2022
79929c0
Enable use_egraphs by default, to make benchmarking infra easier to use
cfallin Aug 17, 2022
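
Several commits above touch on hashconsing: hashing and comparing nodes by canonical ids (a8f71c5) and skipping the dedup map for non-pure nodes because they never merge (b4e8701). The following is a minimal Rust sketch of that general idea, not the code from this PR. All names here (`UnionFind`, `Node`, `EGraph`, `add`) are hypothetical and chosen for illustration only.

```rust
use std::collections::HashMap;

/// Hypothetical eclass id; not the PR's actual id type.
type Id = u32;

/// A tiny union-find over ids, using path halving in `find`.
#[derive(Default)]
struct UnionFind {
    parent: Vec<Id>,
}

impl UnionFind {
    /// Allocate a new id that is its own representative.
    fn fresh(&mut self) -> Id {
        let id = self.parent.len() as Id;
        self.parent.push(id);
        id
    }

    /// Return the canonical representative of `x`.
    fn find(&mut self, mut x: Id) -> Id {
        while self.parent[x as usize] != x {
            let grandparent = self.parent[self.parent[x as usize] as usize];
            self.parent[x as usize] = grandparent;
            x = grandparent;
        }
        x
    }

    /// Merge two classes; the lower id stays canonical (an arbitrary choice).
    fn union(&mut self, a: Id, b: Id) -> Id {
        let (a, b) = (self.find(a), self.find(b));
        let (keep, drop) = if a <= b { (a, b) } else { (b, a) };
        self.parent[drop as usize] = keep;
        keep
    }
}

#[derive(Clone, PartialEq, Eq, Hash)]
enum Node {
    /// Pure: the same opcode and arguments always produce the same value.
    Pure { op: &'static str, args: Vec<Id> },
    /// Side-effecting: every occurrence is distinct and never merges.
    Effect { op: &'static str, args: Vec<Id> },
}

#[derive(Default)]
struct EGraph {
    uf: UnionFind,
    /// Dedup (hashcons) map, consulted for pure nodes only.
    dedup: HashMap<Node, Id>,
}

impl EGraph {
    fn add(&mut self, node: Node) -> Id {
        match node {
            Node::Pure { op, args } => {
                // Canonicalize arguments so nodes that became equal after a
                // union hash and compare identically.
                let args: Vec<Id> = args.into_iter().map(|a| self.uf.find(a)).collect();
                let key = Node::Pure { op, args };
                if let Some(&id) = self.dedup.get(&key) {
                    return self.uf.find(id);
                }
                let id = self.uf.fresh();
                self.dedup.insert(key, id);
                id
            }
            // Non-pure nodes bypass the map entirely.
            Node::Effect { .. } => self.uf.fresh(),
        }
    }
}
```

In this sketch, adding the same pure node twice returns the same id, while two identical side-effecting nodes always get distinct ids. Keying the map on canonicalized arguments is what lets a later `union` of two arguments cause previously distinct pure nodes to be recognized as equal on the next `add`.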
42 changes: 32 additions & 10 deletions Cargo.lock

Some generated files are not rendered by default.

3 changes: 3 additions & 0 deletions Cargo.toml
@@ -140,3 +140,6 @@ harness = false
 [[bench]]
 name = "call"
 harness = false
+
+[profile.release]
+debug = true
37 changes: 37 additions & 0 deletions TODO
@@ -0,0 +1,37 @@
+- all simple_preopt
+
+- cprop for floating-point
+- strength reduction
+- branch folding
+
+- legalization
+- replace `op_imm` ops with `op` (legalization)
+- heap, table, ...
+- show dedup of parts of heap addr computation
+- i128 -> i64 narrowing?
+
+- perf ideas
+- keep Insts (side-effecting ops) in DFG, and just rewrite args and
+attach new results when elaborating.
+
+- code quality
+- strength reduction for mul/div/rem
+- analyses: demanded bits, defined bits
+- more legalizations of pseudoinsts in mid-end, to allow opts
+- handle amodes this way too?
+- in lowering, don't generate new adds; just take pieces of toplevel add
+- then rewrite in mid-end to reassociate as needed
+- amodes
+- aarch64: better to do reg+imm with add-with-sext for heap base +
+wasm ptr
+- aarch64 and x64: don't gen new adds during lowering; instead
+nudge things in right direction in mid-end and just match up the
+tree during lowering
+- redundant flags uses: reconsider iflags?
+
+- features
+- support non-pure nodes in rewrite as well?
+- optimize once before adding to side_effects list
+- handle varargs for calls, branches
+- abstract the notion of an analysis; use for loop-depth
+- use computed loop-depth when elaborating, rather than recomputing
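
The TODO above lists constant propagation, algebraic identities, and strength reduction for multiplies as mid-end work. As a rough illustration of what such rewrites do, here is a small Rust sketch over a made-up expression type. It is not Cranelift's IR and not how the PR expresses its rules (those are written in ISLE); `Expr` and `simplify` are hypothetical names, and only integer multiply is handled.

```rust
/// Hypothetical mini-IR for illustration; not Cranelift's data structures.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Expr {
    Const(u64),
    Var(&'static str),
    Mul(Box<Expr>, Box<Expr>),
    Shl(Box<Expr>, Box<Expr>),
}

/// Simplify bottom-up: children first, then the node itself.
fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Mul(a, b) => {
            let (a, b) = (simplify(*a), simplify(*b));
            match (a, b) {
                // Constant propagation, wrapping like fixed-width integer ops.
                (Expr::Const(x), Expr::Const(y)) => Expr::Const(x.wrapping_mul(y)),
                // Algebraic identity: x * 1 == x (either operand order).
                (x, Expr::Const(1)) | (Expr::Const(1), x) => x,
                // Strength reduction: x * 2^k becomes x << k.
                (x, Expr::Const(c)) | (Expr::Const(c), x) if c.is_power_of_two() => {
                    let shift = Expr::Const(c.trailing_zeros() as u64);
                    Expr::Shl(Box::new(x), Box::new(shift))
                }
                // Nothing applies: rebuild the multiply from simplified children.
                (a, b) => Expr::Mul(Box::new(a), Box::new(b)),
            }
        }
        Expr::Shl(a, b) => Expr::Shl(Box::new(simplify(*a)), Box::new(simplify(*b))),
        other => other,
    }
}
```

Note that this sketch rewrites destructively, replacing each node with one simplified form. An egraph-based optimizer instead keeps the alternatives side by side and picks among them during extraction, which is the distinction the PR is exploring.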
1 change: 1 addition & 0 deletions cranelift/codegen/Cargo.toml
@@ -16,6 +16,7 @@ edition = "2021"
 cranelift-codegen-shared = { path = "./shared", version = "0.85.0" }
 cranelift-entity = { path = "../entity", version = "0.85.0" }
 cranelift-bforest = { path = "../bforest", version = "0.85.0" }
+cranelift-egraph = { path = "../egraph", version = "0.85.0" }
 hashbrown = { version = "0.11", optional = true }
 target-lexicon = "0.12"
 log = { version = "0.4.6", default-features = false }
32 changes: 28 additions & 4 deletions cranelift/codegen/build.rs
@@ -177,9 +177,19 @@ fn get_isle_compilations(
 ) -> Result<IsleCompilations, std::io::Error> {
     let cur_dir = std::env::current_dir()?;
 
-    let clif_isle = out_dir.join("clif.isle");
+    // Preludes.
+    let clif_lower_isle = out_dir.join("clif_lower.isle");
+    let clif_opt_isle = out_dir.join("clif_opt.isle");
     let prelude_isle =
         make_isle_source_path_relative(&cur_dir, crate_dir.join("src").join("prelude.isle"));
+    let prelude_opt_isle =
+        make_isle_source_path_relative(&cur_dir, crate_dir.join("src").join("prelude_opt.isle"));
+    let prelude_lower_isle =
+        make_isle_source_path_relative(&cur_dir, crate_dir.join("src").join("prelude_lower.isle"));
+
+    // Directory for mid-end optimizations.
+    let src_opts = make_isle_source_path_relative(&cur_dir, crate_dir.join("src").join("opts"));
+    // Directories for lowering backends.
     let src_isa_x64 =
         make_isle_source_path_relative(&cur_dir, crate_dir.join("src").join("isa").join("x64"));
     let src_isa_aarch64 =
@@ -202,35 +212,49 @@ fn get_isle_compilations(
     // `cranelift/codegen/src/isa/*/lower/isle/generated_code.rs`!
     Ok(IsleCompilations {
         items: vec![
+            // The mid-end optimization rules.
+            IsleCompilation {
+                output: out_dir.join("isle_opt.rs"),
+                inputs: vec![
+                    prelude_isle.clone(),
+                    prelude_opt_isle.clone(),
+                    src_opts.join("algebraic.isle"),
+                    src_opts.join("cprop.isle"),
+                ],
+                untracked_inputs: vec![clif_opt_isle.clone()],
+            },
             // The x86-64 instruction selector.
             IsleCompilation {
                 output: out_dir.join("isle_x64.rs"),
                 inputs: vec![
                     prelude_isle.clone(),
+                    prelude_lower_isle.clone(),
                     src_isa_x64.join("inst.isle"),
                     src_isa_x64.join("lower.isle"),
                 ],
-                untracked_inputs: vec![clif_isle.clone()],
+                untracked_inputs: vec![clif_lower_isle.clone()],
             },
             // The aarch64 instruction selector.
             IsleCompilation {
                 output: out_dir.join("isle_aarch64.rs"),
                 inputs: vec![
                     prelude_isle.clone(),
+                    prelude_lower_isle.clone(),
                     src_isa_aarch64.join("inst.isle"),
                     src_isa_aarch64.join("lower.isle"),
                 ],
-                untracked_inputs: vec![clif_isle.clone()],
+                untracked_inputs: vec![clif_lower_isle.clone()],
             },
             // The s390x instruction selector.
             IsleCompilation {
                 output: out_dir.join("isle_s390x.rs"),
                 inputs: vec![
                     prelude_isle.clone(),
+                    prelude_lower_isle.clone(),
                     src_isa_s390x.join("inst.isle"),
                     src_isa_s390x.join("lower.isle"),
                 ],
-                untracked_inputs: vec![clif_isle.clone()],
+                untracked_inputs: vec![clif_lower_isle.clone()],
             },
         ],
     })
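
The diff above gives each `IsleCompilation` both `inputs` (checked-in `.isle` sources) and `untracked_inputs` (files generated into `out_dir`, such as `clif_opt.isle`). One plausible reason for the split, sketched below in Rust, is that a build script only wants Cargo to watch the checked-in files while still passing every file to the ISLE compiler. This is an assumption about intent, not a copy of the real definitions: the struct only mirrors the field names visible in the diff, and `rerun_directives` and `all_inputs` are hypothetical helpers.

```rust
use std::path::PathBuf;

/// Mirrors the field names seen in the diff; the real struct may differ.
struct IsleCompilation {
    output: PathBuf,
    inputs: Vec<PathBuf>,
    untracked_inputs: Vec<PathBuf>,
}

impl IsleCompilation {
    /// Cargo build-script directives for the checked-in sources only.
    /// Generated files are skipped (assumed: they are rewritten on every
    /// build, so watching them would trigger needless rebuilds).
    fn rerun_directives(&self) -> Vec<String> {
        self.inputs
            .iter()
            .map(|p| format!("cargo:rerun-if-changed={}", p.display()))
            .collect()
    }

    /// Every file handed to the ISLE compiler: tracked first, then untracked.
    /// The ordering is an assumption made for this sketch.
    fn all_inputs(&self) -> Vec<PathBuf> {
        self.inputs
            .iter()
            .chain(self.untracked_inputs.iter())
            .cloned()
            .collect()
    }
}
```

A build script using this shape would print each string from `rerun_directives()` to stdout, then compile `all_inputs()` into the file named by `output`.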