feat(perf): Track last loads per block in mem2reg and remove them if possible #6088

Merged
merged 40 commits
Nov 26, 2024
Changes from 10 commits

Commits
90712c2
handle last_loads per block and remove them if possible
vezenovm Sep 18, 2024
8bf6ff6
remove debugging ssa file
vezenovm Sep 18, 2024
12cd64e
cleanup comments
vezenovm Sep 18, 2024
8c077aa
remove test debugging things
vezenovm Sep 18, 2024
5b88acd
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 18, 2024
ccd7298
actually remove instruction do not just map value
vezenovm Sep 18, 2024
65a461f
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 19, 2024
2fccdbc
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg/block.rs
vezenovm Sep 19, 2024
26f4041
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Sep 19, 2024
18b35e5
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Sep 19, 2024
c275357
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Sep 19, 2024
171108f
fmt
vezenovm Sep 19, 2024
360654e
cleanup
vezenovm Sep 19, 2024
0bf0525
one more cleanup
vezenovm Sep 19, 2024
6ef6546
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 19, 2024
db6f54a
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Sep 19, 2024
82bb286
clippy in test
vezenovm Sep 19, 2024
3c45af5
improve keep_repeat_loads_with_alias_store test and track parameters …
vezenovm Sep 20, 2024
c85c953
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 20, 2024
8c655e2
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Sep 23, 2024
1327c1a
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 23, 2024
02ed30e
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 24, 2024
b1d8305
timeout increase
vezenovm Sep 24, 2024
7c3b927
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 26, 2024
88afd21
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 27, 2024
3d6a249
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Sep 30, 2024
2d6c0fa
Merge branch 'master' into mv/remove-last-loads-per-block
TomAFrench Oct 2, 2024
cba3738
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Oct 3, 2024
f66b2c6
merge conflicts w/ master
vezenovm Nov 22, 2024
cb0dd20
update tests to use SSA parser
vezenovm Nov 22, 2024
9ac843a
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Nov 22, 2024
65f8fea
use or_default()
vezenovm Nov 22, 2024
c8bb5cc
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg/alias_set.rs
vezenovm Nov 22, 2024
7067aff
Merge remote-tracking branch 'origin/mv/remove-last-loads-per-block' …
vezenovm Nov 22, 2024
4ef0204
cleanup comments
vezenovm Nov 22, 2024
ffafdcc
Update compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
vezenovm Nov 25, 2024
ac925cb
move last loads clearing
vezenovm Nov 25, 2024
40396f5
pr comments and cleanup
vezenovm Nov 25, 2024
e18c6a6
Merge branch 'master' into mv/remove-last-loads-per-block
vezenovm Nov 25, 2024
ce1b777
merge conflicts w/ master
vezenovm Nov 26, 2024
270 changes: 268 additions & 2 deletions compiler/noirc_evaluator/src/ssa/opt/mem2reg.rs
@@ -18,6 +18,7 @@
//! - A reference with 0 aliases means we were unable to find which reference this reference
//! refers to. If such a reference is stored to, we must conservatively invalidate every
//! reference in the current block.
//! - We also track the last load instruction to each address per block.
//!
//! From there, to figure out the value of each reference at the end of block, iterate each instruction:
//! - On `Instruction::Allocate`:
@@ -28,6 +29,13 @@
//! - Furthermore, if the result of the load is a reference, mark the result as an alias
//! of the reference it dereferences to (if known).
//! - If which reference it dereferences to is not known, this load result has no aliases.
//! - We also track the last instance of a load instruction to each address in a block.
//! If the last load instruction was from the same address as the current load instruction,
//! we replace the result of the current load with the result of the previous load.
//! This removal requires a couple of conditions:
//! - No store to that address occurs between the two loads,
//! - The address is not used as an argument to a call between the two loads.
//! This optimization helps us remove repeated loads whose values are not known.
//! - On `Instruction::Store { address, value }`:
//! - If the address of the store is known:
//! - If the address has exactly 1 alias:
@@ -40,11 +48,13 @@
//! - Conservatively mark every alias in the block to `Unknown`.
//! - Additionally, if there were no Loads to any alias of the address between this Store and
//! the previous Store to the same address, the previous store can be removed.
//! - Clear the tracked last load to the address and its aliases, so those loads are not removed
//! - On `Instruction::Call { arguments }`:
//! - If any argument of the call is a reference, set the value of each alias of that
//! reference to `Unknown`
//! - Any builtin functions that may return aliases if their input also contains a
//! reference should be tracked. Examples: `slice_push_back`, `slice_insert`, `slice_remove`, etc.
//! - Clear the tracked last load for any reference arguments and their aliases, so those loads are not removed
//!
//! On a terminator instruction:
//! - If the terminator is a `Jmp`:
@@ -56,7 +66,7 @@
//!
//! Repeating this algorithm for each block in the function in program order should result in
//! optimizing out most known loads. However, identifying all aliases correctly has been proven
//! undecidable in general (Landi, 1992). So this pass will not always optimize out all loads
//! that could theoretically be optimized out. This pass can be performed at any time in the
//! SSA optimization pipeline, although it will be more successful the simpler the program's CFG is.
//! This pass is currently performed several times to enable other passes - most notably being
@@ -144,7 +154,7 @@
/// Load and Store instructions that should be removed at the end of the pass.
///
/// We avoid removing individual instructions as we go since removing elements
/// from the middle of Vecs many times will be slower than a single call to `retain`.
instructions_to_remove: HashSet<InstructionId>,

/// Track a value's last load across all blocks.
@@ -277,6 +287,10 @@
self.remove_stores_that_do_not_alias_parameters(&references);
}

// Last loads are truly "per block". During unification we are creating a new block from the current one,
// so we must clear the last loads of the current block before returning the new block.
references.last_loads.clear();

self.blocks.insert(block, references);
}

@@ -390,6 +404,28 @@
.insert((address, block_id), instruction_index);
}
}

// Check whether the block has a repeat load from the same address (with no calls or stores in between the loads).
// If it does, we can remove the current load and map its result to the previous load's result.
if let Some(last_load) = references.last_loads.get(&address) {
let Instruction::Load { address: previous_address } =
&self.inserter.function.dfg[*last_load]
else {
panic!("Expected a Load instruction here");
};
let result = self.inserter.function.dfg.instruction_results(instruction)[0];
let previous_result =
self.inserter.function.dfg.instruction_results(*last_load)[0];
if *previous_address == address {
self.inserter.map_value(result, previous_result);
self.instructions_to_remove.insert(instruction);
}
}
// We want to set the last load for every load, even if the address has a known value
// and the previous load instruction was removed.
// It is still safe to remove a repeat load in this case: we map the current load's
// result to the previous load's result, which (if that load was removed) should already map to the known value.
references.set_last_load(address, instruction);
}
Instruction::Store { address, value } => {
let address = self.inserter.function.dfg.resolve(*address);
@@ -446,6 +482,8 @@
}

references.set_known_value(address, value);
// If we see a store to an address, the last load to that address needs to remain.
references.keep_last_load_for(address, self.inserter.function);
}
Instruction::Allocate => {
// Register the new reference
@@ -552,6 +590,9 @@
let value = self.inserter.function.dfg.resolve(*value);
references.set_unknown(value);
references.mark_value_used(value, self.inserter.function);

// If a reference is an argument to a call, the last load to that address and its aliases needs to remain.
references.keep_last_load_for(value, self.inserter.function);
}
}
}
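The last-load bookkeeping this PR adds can be modeled in isolation with a small sketch. Note that the type aliases and `BlockState` struct below are simplified stand-ins for the compiler's real types, not the actual mem2reg API, and the alias/known-value machinery is deliberately omitted:

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-ins for the compiler's real id types.
type ValueId = usize;
type InstructionId = usize;

#[derive(Default)]
struct BlockState {
    // Last `Load` instruction seen for each address in this block.
    last_loads: HashMap<ValueId, InstructionId>,
    // Loads found to be redundant; removed in bulk at the end of the pass.
    instructions_to_remove: HashSet<InstructionId>,
}

impl BlockState {
    /// On `Load { address }`: if an earlier load from the same address is
    /// still tracked, the current load is a repeat and can be removed by
    /// mapping its result to the earlier load's result. Returns the earlier
    /// load, if any. The current load always becomes the new "last load".
    fn visit_load(&mut self, address: ValueId, instruction: InstructionId) -> Option<InstructionId> {
        let previous = self.last_loads.get(&address).copied();
        if previous.is_some() {
            self.instructions_to_remove.insert(instruction);
        }
        self.last_loads.insert(address, instruction);
        previous
    }

    /// On `Store { address, .. }` (or a call taking `address` as an argument):
    /// the tracked last load must be kept, so it is dropped from the map.
    fn visit_store(&mut self, address: ValueId) {
        self.last_loads.remove(&address);
    }
}
```

With this model, `load; load` from one address removes the second load, while `load; store; load` keeps both loads, matching the conditions listed in the pass's doc comment.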
@@ -921,6 +962,7 @@

use acvm::{acir::AcirField, FieldElement};
use im::vector;
use noirc_frontend::monomorphization::ast::InlineType;

use crate::ssa::{
function_builder::FunctionBuilder,
@@ -1404,7 +1446,7 @@
// v10 = eq v9, Field 2
// constrain v9 == Field 2
// v11 = load v2
// v12 = load v10
// v12 = load v11
// v13 = eq v12, Field 2
// constrain v11 == Field 2
// return
@@ -1463,7 +1505,7 @@
let main = ssa.main();
assert_eq!(main.reachable_blocks().len(), 4);

// The store from the original SSA should remain
// The stores from the original SSA should remain
assert_eq!(count_stores(main.entry_block(), &main.dfg), 2);
assert_eq!(count_stores(b2, &main.dfg), 1);

@@ -1706,4 +1748,228 @@
assert_eq!(count_loads(b2, &main.dfg), 1);
assert_eq!(count_loads(b3, &main.dfg), 5);
}

#[test]
fn remove_repeat_loads() {
// This test starts with two loads from the same address whose value is unknown.
// Specifically, look for the two `load v2` instructions in `b3`.
// We should be able to remove the second, repeated load.
//
// acir(inline) fn main f0 {
// b0():
// v0 = allocate
// store Field 0 at v0
// v2 = allocate
// store v0 at v2
// jmp b1(Field 0)
// b1(v3: Field):
// v4 = eq v3, Field 0
// jmpif v4 then: b2, else: b3
// b2():
// v5 = load v2
// store Field 2 at v5
// v8 = add v3, Field 1
// jmp b1(v8)
// b3():
// v9 = load v0
// v10 = eq v9, Field 2
// constrain v9 == Field 2
// v11 = load v2
// v12 = load v2
// v13 = eq v12, Field 2
// constrain v11 == Field 2
// return
// }
let main_id = Id::test_new(0);
let mut builder = FunctionBuilder::new("main".into(), main_id);

let v0 = builder.insert_allocate(Type::field());
let zero = builder.numeric_constant(0u128, Type::field());
builder.insert_store(v0, zero);

let v2 = builder.insert_allocate(Type::field());
// Construct alias
builder.insert_store(v2, v0);
let v2_type = builder.current_function.dfg.type_of_value(v2);
assert!(builder.current_function.dfg.value_is_reference(v2));

let b1 = builder.insert_block();
builder.terminate_with_jmp(b1, vec![zero]);

// Loop header
builder.switch_to_block(b1);
let v3 = builder.add_block_parameter(b1, Type::field());
let is_zero = builder.insert_binary(v3, BinaryOp::Eq, zero);

let b2 = builder.insert_block();
let b3 = builder.insert_block();
builder.terminate_with_jmpif(is_zero, b2, b3);

// Loop body
builder.switch_to_block(b2);
let v5 = builder.insert_load(v2, v2_type.clone());
let two = builder.numeric_constant(2u128, Type::field());
builder.insert_store(v5, two);
let one = builder.numeric_constant(1u128, Type::field());
let v3_plus_one = builder.insert_binary(v3, BinaryOp::Add, one);
builder.terminate_with_jmp(b1, vec![v3_plus_one]);

builder.switch_to_block(b3);
let v9 = builder.insert_load(v0, Type::field());

let _ = builder.insert_binary(v9, BinaryOp::Eq, two);

builder.insert_constrain(v9, two, None);
let v11 = builder.insert_load(v2, v2_type.clone());
let v12 = builder.insert_load(v2, Type::field());
let _ = builder.insert_binary(v12, BinaryOp::Eq, two);

builder.insert_constrain(v11, two, None);
builder.terminate_with_return(vec![]);

let ssa = builder.finish();

// Expected result:
// acir(inline) fn main f0 {
// b0():
// v14 = allocate
// store Field 0 at v14
// v15 = allocate
// store v14 at v15
// jmp b1(Field 0)
// b1(v3: Field):
// v16 = eq v3, Field 0
// jmpif v16 then: b2, else: b3
// b2():
// v22 = load v15
// store Field 2 at v22
// v23 = add v3, Field 1
// jmp b1(v23)
// b3():
// v17 = load v14
// v18 = eq v17, Field 2
// constrain v17 == Field 2
// v19 = load v15
// v21 = eq v19, Field 2
// constrain v19 == Field 2
// return
// }
let ssa = ssa.mem2reg();

let main = ssa.main();
assert_eq!(main.reachable_blocks().len(), 4);

// The stores from the original SSA should remain
assert_eq!(count_stores(main.entry_block(), &main.dfg), 2);
assert_eq!(count_stores(b2, &main.dfg), 1);

assert_eq!(count_loads(b2, &main.dfg), 1);
// The repeated load to v2 should be removed
assert_eq!(count_loads(b3, &main.dfg), 2);
}

#[test]
fn keep_repeat_loads_passed_to_a_call() {
// This test is exactly the same as `remove_repeat_loads` above, except with the following `b3`:
//
// b3():
// v9 = load v0
// v10 = eq v9, Field 2
// constrain v9 == Field 2
// v11 = load v2
// call f1(v2)
// v13 = load v2
// v14 = eq v13, Field 2
// constrain v11 == Field 2
// return
//
// Where f1 is the following function with a reference parameter:
//
// acir(inline) fn foo f1 {
// b0(v0: &mut Field):
// return
// }
let main_id = Id::test_new(0);
let mut builder = FunctionBuilder::new("main".into(), main_id);

let v0 = builder.insert_allocate(Type::field());
let zero = builder.numeric_constant(0u128, Type::field());
builder.insert_store(v0, zero);

let v2 = builder.insert_allocate(Type::field());
// Construct alias
builder.insert_store(v2, v0);
let v2_type = builder.current_function.dfg.type_of_value(v2);
assert!(builder.current_function.dfg.value_is_reference(v2));

let b1 = builder.insert_block();
builder.terminate_with_jmp(b1, vec![zero]);

// Loop header
builder.switch_to_block(b1);
let v3 = builder.add_block_parameter(b1, Type::field());
let is_zero = builder.insert_binary(v3, BinaryOp::Eq, zero);

let b2 = builder.insert_block();
let b3 = builder.insert_block();
builder.terminate_with_jmpif(is_zero, b2, b3);

// Loop body
builder.switch_to_block(b2);
let v5 = builder.insert_load(v2, v2_type.clone());
let two = builder.numeric_constant(2u128, Type::field());
builder.insert_store(v5, two);
let one = builder.numeric_constant(1u128, Type::field());
let v3_plus_one = builder.insert_binary(v3, BinaryOp::Add, one);
builder.terminate_with_jmp(b1, vec![v3_plus_one]);

builder.switch_to_block(b3);
let v9 = builder.insert_load(v0, Type::field());

let _ = builder.insert_binary(v9, BinaryOp::Eq, two);

builder.insert_constrain(v9, two, None);
let v11 = builder.insert_load(v2, v2_type.clone());

let foo_id = Id::test_new(1);
let foo = builder.import_function(foo_id);
builder.insert_call(foo, vec![v2], vec![]);

let v12 = builder.insert_load(v2, Type::field());
let _ = builder.insert_binary(v12, BinaryOp::Eq, two);

builder.insert_constrain(v11, two, None);
builder.terminate_with_return(vec![]);

builder.new_function("foo".into(), foo_id, InlineType::default());
let _foo_v0 = builder.add_parameter(Type::Reference(Arc::new(Type::field())));
builder.terminate_with_return(vec![]);

let ssa = builder.finish();

// The expected result is that `b3` remains unchanged:
//
// b3():
// v18 = load v15
// v19 = eq v18, Field 2
// constrain v18 == Field 2
// v20 = load v16
// call f1(v16)
// v21 = load v16
// v22 = eq v21, Field 2
// constrain v20 == Field 2
// return
let ssa = ssa.mem2reg();

let main = ssa.main();
assert_eq!(main.reachable_blocks().len(), 4);

// The stores from the original SSA should remain
assert_eq!(count_stores(main.entry_block(), &main.dfg), 2);
assert_eq!(count_stores(b2, &main.dfg), 1);

assert_eq!(count_loads(b2, &main.dfg), 1);
// The repeated load of v2 should remain, as v2 is passed as a reference argument to the call.
assert_eq!(count_loads(b3, &main.dfg), 3);
}
}
13 changes: 13 additions & 0 deletions compiler/noirc_evaluator/src/ssa/opt/mem2reg/block.rs
@@ -34,6 +34,9 @@ pub(super) struct Block {

/// The last instance of a `Store` instruction to each address in this block
pub(super) last_stores: im::OrdMap<ValueId, InstructionId>,

/// The last instance of a `Load` instruction to each address in this block
pub(super) last_loads: im::OrdMap<ValueId, InstructionId>,
}

/// An `Expression` here is used to represent a canonical key
@@ -237,4 +240,14 @@ impl Block {

Cow::Owned(AliasSet::unknown())
}

pub(super) fn set_last_load(&mut self, address: ValueId, instruction: InstructionId) {
self.last_loads.insert(address, instruction);
}

pub(super) fn keep_last_load_for(&mut self, address: ValueId, function: &Function) {
let address = function.dfg.resolve(address);
self.last_loads.remove(&address);
self.for_each_alias_of(address, |block, alias| block.last_loads.remove(&alias));
}
}
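The alias-aware invalidation in `keep_last_load_for` can be sketched with a standalone model. Note that `BlockModel`, its plain `BTreeMap` alias table, and the hand-rolled alias iteration below are simplified assumptions, not the real `Block`/`AliasSet`/`for_each_alias_of` API:

```rust
use std::collections::{BTreeMap, BTreeSet};

type ValueId = usize;
type InstructionId = usize;

// Simplified model: when an address is stored to (or passed to a call),
// the tracked last load for that address *and all of its aliases* is
// dropped, so none of those loads are removed later in the pass.
struct BlockModel {
    last_loads: BTreeMap<ValueId, InstructionId>,
    aliases: BTreeMap<ValueId, BTreeSet<ValueId>>,
}

impl BlockModel {
    fn keep_last_load_for(&mut self, address: ValueId) {
        // Stop tracking the address itself...
        self.last_loads.remove(&address);
        // ...and every known alias of it.
        if let Some(aliases) = self.aliases.get(&address) {
            for alias in aliases {
                self.last_loads.remove(alias);
            }
        }
    }
}
```

This mirrors the design choice in the PR: rather than marking a load "must keep", the pass simply forgets it, so no later load can be folded into it.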