feat(mem2reg): Remove trivial stores #5865

vezenovm · 2024-08-29T21:06:27Z

Description

Problem*

Partially resolves #4535. We can still do some more cleanup but will handle that in a follow-up. References the PR discussion comments for more context.

Summary*

Just marking the result of a load known was causing some failures, specifically for arrays. So for now I just look for trivial stores that are immediately storing the same value that was just loaded.
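As a rough illustration of the trivial case described above, here is a minimal sketch over a hypothetical, simplified instruction type. `Inst` and `remove_trivial_stores` are made-up names for this example and are not the real `noirc_evaluator` IR or API. It only handles a store that directly follows the matching load within one block.

```rust
/// Hypothetical, simplified instruction set (an assumption for this sketch,
/// not the real noirc_evaluator IR). Values and addresses are plain ids.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// `result = load address`
    Load { result: u32, address: u32 },
    /// `store value at address`
    Store { value: u32, address: u32 },
    /// Any other instruction, kept opaque.
    Other,
}

/// Drops a store when the previously kept instruction is a load of the same
/// address and the store writes back exactly the loaded value.
fn remove_trivial_stores(block: &[Inst]) -> Vec<Inst> {
    let mut out: Vec<Inst> = Vec::with_capacity(block.len());
    for inst in block {
        if let Inst::Store { value, address } = inst {
            if let Some(Inst::Load { result, address: loaded_from }) = out.last() {
                if result == value && loaded_from == address {
                    // Memory at `address` already holds `value`; skip the store.
                    continue;
                }
            }
        }
        out.push(inst.clone());
    }
    out
}
```

Any instruction between the load and the store (including `Inst::Other`) prevents removal here, which is the conservative choice for this draft's "immediately storing" case.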

Also testing on CI as I keep getting rayon errors with cargo test for some reason. Making this check mark more values known was also difficult due to this issue with testing, so I am just pushing the more trivial case in this draft.

Additional Context

Documentation*

Check one:

No documentation needed.
Documentation included in this PR.
[For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

I have tested the changes locally.
I have formatted the changes with Prettier and/or cargo fmt on default settings.

github-actions · 2024-08-29T21:09:50Z

Changes to Brillig bytecode sizes

Generated at commit: 26f990809064817b8d19adc57e83c38ee80620cd, compared to commit: 1737b656c861706c38b59bd5ef6cd095687a2898

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| no_predicates_numeric_generic_poseidon | -196 ✅ | -14.79% |
| fold_numeric_generic_poseidon | -196 ✅ | -14.79% |
| poseidon2 | -92 ✅ | -21.70% |

Full diff report 👇

| Program | Brillig opcodes (+/-) | % |
| --- | --- | --- |
| nested_array_dynamic | 4,206 (-4) | -0.10% |
| eddsa | 67,414 (-104) | -0.15% |
| fold_complex_outputs | 1,020 (-2) | -0.20% |
| nested_array_in_slice | 1,676 (-4) | -0.24% |
| regression_5252 | 36,277 (-104) | -0.29% |
| slice_regex | 7,283 (-25) | -0.34% |
| regression | 728 (-9) | -1.22% |
| uhashmap | 24,307 (-2,647) | -9.82% |
| brillig_loop_size_regression | 55 (-7) | -11.29% |
| hashmap | 36,942 (-5,846) | -13.66% |
| no_predicates_numeric_generic_poseidon | 1,129 (-196) | -14.79% |
| fold_numeric_generic_poseidon | 1,129 (-196) | -14.79% |
| poseidon2 | 332 (-92) | -21.70% |

github-actions · 2024-08-29T21:13:03Z

Changes to circuit sizes

Generated at commit: 26f990809064817b8d19adc57e83c38ee80620cd, compared to commit: 1737b656c861706c38b59bd5ef6cd095687a2898

🧾 Summary (10% most significant diffs)

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
| --- | --- | --- | --- | --- |
| nested_array_in_slice | -36 ✅ | -3.29% | -36 ✅ | -0.64% |
| hashmap | -3,109 ✅ | -3.18% | -3,646 ✅ | -2.32% |

Full diff report 👇

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
| --- | --- | --- | --- | --- |
| nested_array_in_slice | 1,057 (-36) | -3.29% | 5,596 (-36) | -0.64% |
| hashmap | 94,642 (-3,109) | -3.18% | 153,248 (-3,646) | -2.32% |

jfecher

Some potential aliasing issues

compiler/noirc_evaluator/src/ssa/opt/mem2reg/block.rs

compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs

vezenovm · 2024-08-30T16:38:05Z

I'm kind of surprised by the brillig_cow_assign failure looking at the SSA:

brillig fn main f0 {
  b0():
    inc_rc [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0]
    v2 = allocate
    store [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0] at v2
    v3 = load v2
    inc_rc v3
    v4 = load v2
    inc_rc v4
    store v4 at v2
    v5 = allocate
    store v3 at v5
    jmp b1(u32 0)
  b1(v6: u32):
    v9 = lt v6, u32 10
    jmpif v9 then: b2, else: b3
  b2():
    v21 = eq v6, u32 5
    jmpif v21 then: b4, else: b5
  b4():
    v27 = load v2
    inc_rc v27
    v28 = load v2
    inc_rc v28
    store v28 at v2
    store v27 at v5
    jmp b5()
  b5():
    v22 = load v2
    v23 = array_set v22, index v6, value Field 27
    v25 = add v6, u32 1
    store v23 at v2
    v26 = add v6, u32 1
    jmp b1(v26)
  b3():
    v10 = load v2
    v12 = array_get v10, index u32 6
    v14 = eq v12, Field 27
    constrain v12 == Field 27
    v15 = load v5
    v16 = array_get v15, index u32 6
    v17 = eq v16, Field 27
    v18 = not v17
    constrain v17 == u1 0
    return 
}

After Mem2Reg:
brillig fn main f0 {
  b0():
    inc_rc [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0]
    v30 = allocate
    inc_rc [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0]
    inc_rc [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0]
    store [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0] at v30
    v33 = allocate
    store [Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0, Field 0] at v33
    jmp b1(u32 0)
  b1(v6: u32):
    v34 = lt v6, u32 10
    jmpif v34 then: b2, else: b3
  b2():
    v42 = eq v6, u32 5
    jmpif v42 then: b4, else: b5
  b4():
    v43 = load v30
    inc_rc v43
    v44 = load v30
    inc_rc v44
    store v43 at v33
    jmp b5()
  b5():
    v45 = load v30
    v46 = array_set v45, index v6, value Field 27
    v47 = add v6, u32 1
    store v46 at v30
    v48 = add v6, u32 1
    jmp b1(v48)
  b3():
    v35 = load v30
    v36 = array_get v35, index u32 6
    v37 = eq v36, Field 27
    constrain v36 == Field 27
    v38 = load v33
    v39 = array_get v38, index u32 6
    v40 = eq v39, Field 27
    v41 = not v40
    constrain v40 == u1 0
    return 
}

When comparing to the SSA without this optimization, the only difference is that there is an additional `store v44 at v30` before `store v43 at v33` in b4. I assume this is due to Brillig's CoW optimization and how inc_rc is processed, although the SSA looks like it should execute correctly.

Edit: Found this hack (noir/compiler/noirc_evaluator/src/ssa/function_builder/mod.rs, line 476 in 5c4f19f):

    // We can't re-use `value` in case the original address was stored

This PR is removing that re-store.

vezenovm · 2024-08-30T19:29:24Z

Accounting for the extra re-load and re-store for inc_rc instructions fixes the brillig_cow_assign bug.

Interestingly, uhashmap is the last failure then and it only occurs with the additional mem2reg and DIE passes. Those passes were added to clean up leftover unused stores after the loads referencing that store are removed during DIE.
E.g., without those two passes brillig_loop_size_regression gives the following SSA:

After Array Set Optimizations:
brillig fn main f0 {
  b0():
    v50 = allocate
    store u1 1 at v50, from_rc false
    v51 = allocate
    store Field 1 at v51, from_rc false
    v52 = allocate
    store u1 0 at v52, from_rc false
    v53 = allocate
    store Field 0 at v53, from_rc false
    v54 = allocate
    store u1 0 at v54, from_rc false
    v55 = allocate
    store Field 0 at v55, from_rc false
    jmp b1(u32 0)
  b1(v10: u32):
    v56 = eq v10, u32 0
    jmpif v56 then: b2, else: b3
  b2():
    v63 = load v50
    v64 = load v51
    constrain v63 == u1 1
    constrain v64 == Field 1
    v70 = add v10, u32 1
    jmp b1(v70)
  b3():
    store u1 1 at v50, from_rc false
    store Field 2 at v51, from_rc false
    return Field 2
}

Then with those passes we get the following:

After Array Set Optimizations:
brillig fn main f0 {
  b0():
    v71 = allocate
    store u1 1 at v71, from_rc false
    v72 = allocate
    store Field 1 at v72, from_rc false
    jmp b1(u32 0)
  b1(v10: u32):
    v77 = eq v10, u32 0
    jmpif v77 then: b2, else: b3
  b2():
    v78 = load v71
    v79 = load v72
    constrain v78 == u1 1
    constrain v79 == Field 1
    v80 = add v10, u32 1
    jmp b1(v80)
  b3():
    store u1 1 at v71, from_rc false
    store Field 2 at v72, from_rc false
    return Field 2
}

brillig_loop_size_regression 65 (-8) -10.96%

We still get a reduction without these extra mem2reg and DIE passes, but we should try to remove these instructions as we know we can. We just have to figure out why it is causing issues for uhashmap. We could merge these changes as they still provide a nice reduction in poseidon2, hashmap, and uhashmap. Then in a follow-up we can address getting rid of these leftover stores.

jfecher · 2024-08-30T19:57:27Z

> Interestingly, uhashmap is the last failure then and it only occurs with the additional mem2reg and DIE passes. Those passes were added to clean up leftover unused stores after the loads referencing that store are removed during DIE.

Ideally we get rid of those extra passes anyway. It's a fairly unsatisfactory tradeoff with compilation time IMO to add two extra passes to get rid of a couple extra instructions in some brillig functions. More concerning is why running these again (although the culprit is presumably mem2reg) produces an issue in the first place.

vezenovm · 2024-09-03T13:46:24Z

> Ideally we get rid of those extra passes anyway. It's a fairly unsatisfactory tradeoff with compilation time IMO to add two extra passes to get rid of a couple extra instructions in some brillig functions.

Yeah, agreed. I'm going to mark this PR ready for review without those extra passes, as this PR still shows a good improvement.

> More concerning is why running these again (although the culprit is presumably mem2reg) produces an issue in the first place.

Then in a follow-up we can investigate the cause of uhashmap failing.

jfecher · 2024-09-03T17:53:52Z

compiler/noirc_evaluator/src/ssa/ir/printer.rs

+        Instruction::Store { address, value, from_rc } => {
+            writeln!(f, "store {} at {}, from_rc {}", show(*value), show(*address), *from_rc)


I'm not a big fan of including from_rc on every store instruction for a mem2reg-specific issue.

Can we add tracking to mem2reg specifically instead? E.g. if we see a load -> dec-rc -> store we mark that we can't remove the store?

vezenovm · 2024-09-03

I'll look at switching to that.

vezenovm · 2024-09-03

I switched to tracking the current rc reload per instruction with an Option<(ValueId, bool)>.

vezenovm · 2024-09-03

Ah, something looks to be failing. I had it passing before but I guess I made a bad change while cleaning up.

vezenovm · 2024-09-03

I had moved the point where I call my method to track the rc reload state, but this led to inadvertently calling the method on the wrong instruction id. This is now fixed and the PR is ready for review again.

vezenovm · 2024-09-03

It still looks to be failing on the debugger, actually, as the debugger inserts foreign calls that break up the expected block of this form:

    v72 = load v57
    inc_rc v72
    v73 = load v57
    inc_rc v73
    store v73 at v57
    store v72 at v60

In the debugger we see the following before mem2reg after the inlining pass:

    v76 = load v14
    inc_rc v76
    v77 = load v14
    inc_rc v77
    store v77 at v14
    call v80(Field 1, v76)
    inc_rc v76
    v81 = load v14
    inc_rc v81
    store v81 at v14
    store v76 at v26

jfecher · 2024-09-04T17:00:45Z

@vezenovm if it helps for this PR I was talking with @sirasistant who mentioned changing the design of how inc/dec-rc works in brillig. A side-effect of that work would be that we no longer would have to store after we load and dec-rc arrays at the end of each function. Should make things easier here I imagine?

vezenovm · 2024-09-04T17:08:49Z

> A side-effect of that work would be that we no longer would have to store after we load and dec-rc arrays at the end of each function. Should make things easier here I imagine?

Yeah, it should. The main issue here is that I have to differentiate which stores I can actually remove if the known value of a store equals the address it is storing into. Without inc_rc/dec_rc, I can safely remove any of these stores. It does look to work with the solution on this branch, although it is a bit hacked around, as I have uhashmap failures on #5905 which I believe are happening due to these stores being inadvertently deleted.

jfecher · 2024-09-04T17:39:16Z

We can pause this work for a week or so then while the brillig changes are being worked on. Hopefully with the removal of the special tracking uhashmap won't fail anymore.

vezenovm · 2024-09-04T18:15:11Z

> We can pause this work for a week or so then while the brillig changes are being worked on. Hopefully with the removal of the special tracking uhashmap won't fail anymore.

I haven't nailed down the exact cause but I think uhashmap might be failing due to a separate reason. I want to nail down exactly why it is failing. If it is due to inc_rc/dec_rc I'll wait for the brillig changes, otherwise I will try to resolve it.

vezenovm · 2024-09-04T18:38:04Z

> If it is due to inc_rc/dec_rc I'll wait for the brillig changes, otherwise I will try to resolve it.

Ok I have nailed down the cause of the failure and it is in fact unrelated to inc_rc/dec_rc.

When cleaning up stores we check a couple of things:

- That there is not a load of that store.
- That we do not have a reference param.

Now that I am cleaning up loads as well (#5905) we can remove some stores if we know all loads to that store have been removed. However, I was not checking whether the address of the store we want to remove is possibly used as a reference directly such as in the parameter of a call. uhashmap is now passing for me locally on #5905 when checking whether the last store we want to remove is used in a call.
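The checks above, plus the extra call check, can be sketched roughly as follows. This is only an illustration under stated assumptions: `Inst`, `addresses_to_keep`, and `remove_unused_stores` are hypothetical names, the instruction type is a toy stand-in for the real IR, and aliasing and control flow are ignored. Every call argument is conservatively treated as a possible reference.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified instructions (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// `result = load address`
    Load { result: u32, address: u32 },
    /// `store value at address`
    Store { value: u32, address: u32 },
    /// A call taking some values, any of which may be a reference.
    Call { args: Vec<u32> },
}

/// Collects addresses whose stores must be kept: reference parameters,
/// addresses that are still loaded from, and addresses passed to a call.
fn addresses_to_keep(block: &[Inst], reference_params: &[u32]) -> HashSet<u32> {
    let mut keep: HashSet<u32> = reference_params.iter().copied().collect();
    for inst in block {
        match inst {
            Inst::Load { address, .. } => {
                keep.insert(*address);
            }
            // The callee may read through any reference it receives.
            Inst::Call { args } => keep.extend(args.iter().copied()),
            Inst::Store { .. } => {}
        }
    }
    keep
}

/// Removes stores to addresses that nothing in the block can observe.
fn remove_unused_stores(block: &[Inst], reference_params: &[u32]) -> Vec<Inst> {
    let keep = addresses_to_keep(block, reference_params);
    block
        .iter()
        .filter(|inst| match inst {
            Inst::Store { address, .. } => keep.contains(address),
            _ => true,
        })
        .cloned()
        .collect()
}
```

Without the `Inst::Call` arm, a store to an address that is only ever handed to a call would be dropped, which is the kind of inadvertent deletion described above.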

vezenovm · 2024-09-04T18:41:32Z

@jfecher I am going to pull out the changes from #5905 as I had built it off of this PR and the uhashmap failure is unrelated to the inc_rc/dec_rc edge case. I will pause working on this PR (but leave it open) until the design of inc_rc/dec_rc is updated in brillig.

vezenovm · 2024-09-05T13:56:14Z

@jfecher I was thinking we could replace this PR with (#5935) which handles inc_rc / dec_rc in the safest manner by just always assuming when we have an inc_rc/dec_rc before a store we cannot remove that store. In this PR I was attempting to check for specifically this case but that was leading to issues. PR #5935 just accepts a smaller improvement (21% on this PR -> 7% on the new one) for safety. And then when inc_rc/dec_rc are redesigned we can remove the small check.

TomAFrench · 2024-09-05T15:15:43Z

Changing to draft for clarity.

vezenovm · 2024-09-05T15:30:10Z

Closing in favor of #5935

… mem2reg (#5935)

# Description

## Problem\*

Partially resolves #4535

Replaces #5865

## Summary\*

When we see a load we mark the address of that load as being a known value of the load result. When we reach a store instruction, if that store value has a known value which is equal to the address of the store we can remove that store.

We also check whether the last instruction was an `inc_rc` or a `dec_rc`. If it was we do not remove the store.

## Additional Context

## Documentation\*

Check one:
- [X] No documentation needed.
- [ ] Documentation included in this PR.
- [ ] **[For Experimental Features]** Documentation to be submitted in a separate PR.

# PR Checklist\*

- [X] I have tested the changes locally.
- [X] I have formatted the changes with [Prettier](https://prettier.io/) and/or `cargo fmt` on default settings.
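A minimal sketch of the approach summarized in that description, assuming a toy single-block IR with no aliasing. `Inst`, `remove_known_stores`, and the `loaded_from` map are hypothetical names for illustration and are not taken from #5935's actual code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified instructions (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// `result = load address`
    Load { result: u32, address: u32 },
    /// `store value at address`
    Store { value: u32, address: u32 },
    IncRc(u32),
    DecRc(u32),
}

/// Removes `store v at a` when `v` is known to have been loaded from `a`,
/// unless the previous instruction was an `inc_rc` or `dec_rc`.
fn remove_known_stores(block: &[Inst]) -> Vec<Inst> {
    // Maps a load result to the address it is known to come from.
    let mut loaded_from: HashMap<u32, u32> = HashMap::new();
    let mut last_was_rc = false;
    let mut out = Vec::with_capacity(block.len());

    for inst in block {
        match inst {
            Inst::Load { result, address } => {
                loaded_from.insert(*result, *address);
            }
            Inst::Store { value, address } => {
                let redundant = loaded_from.get(value) == Some(address);
                if redundant && !last_was_rc {
                    continue; // memory already holds this value
                }
                // A kept store may change what `address` holds, so earlier
                // loads from it are conservatively forgotten.
                loaded_from.retain(|_, from| *from != *address);
            }
            Inst::IncRc(_) | Inst::DecRc(_) => {}
        }
        last_was_rc = matches!(inst, Inst::IncRc(_) | Inst::DecRc(_));
        out.push(inst.clone());
    }
    out
}
```

Unlike a check that only looks at the directly preceding load, the map lets a store be removed even when unrelated loads sit between the load and the store, while the `last_was_rc` flag keeps the re-store that follows an `inc_rc`/`dec_rc`.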

vezenovm added 2 commits August 29, 2024 21:04

remove trivial stores

e0ad0a9

fmt

5bbaca8

jfecher requested changes Aug 29, 2024

compiler/noirc_evaluator/src/ssa/opt/mem2reg/block.rs

compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs

vezenovm added 2 commits August 29, 2024 21:37

don't include arrays in last store

a66009a

switch to using get_known_value and set_known_value

ce31cb0

vezenovm mentioned this pull request Aug 30, 2024

SSA optimization is ineffective when an unconstrained function has loops #4535

Closed

vezenovm added 5 commits August 30, 2024 16:52

add from_rc flag to store

45db90b

cargo fmt

96b70bd

Merge branch 'master' into mv/simplify-immediate-stores

6a0f8c3

Merge branch 'master' into mv/simplify-immediate-stores

a595349

don't call mem2reg and die again to fix uhashmap

82bc2ea

vezenovm added 5 commits August 30, 2024 19:33

cargo fmt

6ce03eb

comment out extra mem2reg and die

5c41302

remove last_loads approach and add some comments

568c083

cleanup

c1c2b61

Merge branch 'master' into mv/simplify-immediate-stores

6222963

Merge branch 'master' into mv/simplify-immediate-stores

6ead23d

vezenovm mentioned this pull request Sep 3, 2024

SSA: Extra mem2reg pass causing failures for UHashMap #5897

Closed

vezenovm added 3 commits September 3, 2024 14:34

cleanup

dae5a44

Merge remote-tracking branch 'origin/mv/simplify-immediate-stores' in…

e920768

…to mv/simplify-immediate-stores

reduce diff

4962db1

vezenovm marked this pull request as ready for review September 3, 2024 14:57

vezenovm requested a review from jfecher September 3, 2024 14:57

missed push

31224eb

vezenovm added 2 commits September 3, 2024 16:18

cleanup

adacab4

Merge branch 'master' into mv/simplify-immediate-stores

ea39ead

vezenovm requested a review from a team September 3, 2024 16:29

vezenovm mentioned this pull request Sep 3, 2024

feat(perf): Remove unused last loads in mem2reg #5905

Closed

5 tasks

jfecher reviewed Sep 3, 2024

remove from_rc from Store and track in mem2reg

20d20af

vezenovm requested a review from jfecher September 3, 2024 21:55

vezenovm added 5 commits September 3, 2024 21:57

remove unnecessary

2d8f5e6

move rc reload state tracking

90a4fbe

improve handling of rc reload state and handle call when tracking

90d5715

Merge branch 'master' into mv/simplify-immediate-stores

96ea083

master merge

0db5610

This was referenced Sep 4, 2024

feat(perf): Remove unused loads in mem2reg and last stores per function #5925

Merged

feat(perf): Remove stores with known values #5934

Closed

feat(perf): Remove known store values that equal the store address in mem2reg #5935

Merged

TomAFrench marked this pull request as draft September 5, 2024 15:15

vezenovm closed this Sep 5, 2024
