Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: do not eliminate possible out of bounds errors #5610

Closed
wants to merge 7 commits into from

Conversation

asterite
Copy link
Collaborator

Description

Problem

Resolves #5296

Summary

Is there another way to do this? That is, can we not remove these instructions when we don't know if they are out of bounds?

Additional Context

None.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@asterite asterite marked this pull request as ready for review July 26, 2024 13:51
Copy link
Contributor

@jfecher jfecher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another way to do this? That is, can we not remove these instructions when we don't know if they are out of bounds?

None that I know of.

I am a bit wary of this PR though. Since it seems like a potential performance degradation in larger programs and on slices. E.g. programs which make use of more dynamic indexing or more slices will be hit harder by this.

compiler/noirc_evaluator/src/ssa/ir/instruction.rs Outdated Show resolved Hide resolved
Copy link
Contributor

github-actions bot commented Jul 26, 2024

Changes to circuit sizes

Generated at commit: 28985b01a6fd78038600dcd4454fd871ad2c829d, compared to commit: ade69a9e5f1546249e9b43b40e9ff0da87c4632e

🧾 Summary (10% most significant diffs)

Program ACIR opcodes (+/-) % Circuit size (+/-) %
regression_4449 +213 ❌ +3.49% +15,258 ❌ +5.45%
slices +31 ❌ +9.97% +125 ❌ +3.70%
bench_poseidon2_hash +3 ❌ +75.00% +23 ❌ +3.18%

Full diff report 👇
Program ACIR opcodes (+/-) % Circuit size (+/-) %
regression_4449 6,311 (+213) +3.49% 295,117 (+15,258) +5.45%
slices 342 (+31) +9.97% 3,505 (+125) +3.70%
bench_poseidon2_hash 7 (+3) +75.00% 746 (+23) +3.18%
poseidon2 12 (+7) +140.00% 1,479 (+43) +2.99%
no_predicates_numeric_generic_poseidon 99 (+46) +86.79% 11,046 (+290) +2.70%
bench_poseidon2_hash_30 152 (+90) +145.16% 22,061 (+574) +2.67%
bench_poseidon2_hash_100 502 (+300) +148.51% 73,511 (+1,904) +2.66%
nested_array_dynamic 4,305 (+146) +3.51% 14,043 (+287) +2.09%
slice_dynamic_index 1,216 (+16) +1.33% 6,531 (+111) +1.73%
regression_struct_array_conditional 102 (+30) +41.67% 3,227 (+46) +1.45%
hashmap 208,841 (+2,907) +1.41% 372,033 (+5,002) +1.36%
sha2_byte 15,913 (+7) +0.04% 83,932 (+772) +0.93%
regression_5252 81,849 (+63) +0.08% 107,190 (+399) +0.37%
eddsa 70,298 (+4) +0.01% 71,704 (+24) +0.03%
array_dynamic_nested_blackbox_input 183 (+14) +8.28% 38,799 (0) 0.00%
sha256 155 (+4) +2.65% 38,904 (0) 0.00%

@asterite
Copy link
Collaborator Author

I am a bit wary of this PR though. Since it seems like a potential performance degradation in larger programs and on slices. E.g. programs which make use of more dynamic indexing or more slices will be hit harder by this.

Agreed. I'll take a look at the regressions to see if there are cases where they can still be removed in case the index is not known.

@jfecher
Copy link
Contributor

jfecher commented Jul 26, 2024

I wonder if it'd help constraint counts at all if we treated all array_get and array_set instructions as unchecked and during ssa-gen lower each get & set with an assert(i < array.len()) instruction before it. That way the array get & set can be treated as pure and only the assert before it would have a side effect.

@asterite
Copy link
Collaborator Author

asterite commented Jul 26, 2024

Good idea! I was looking at the slice_dynamic_index program and kept only a portion of it:

fn main(x: Field) {
    let mut slice = &[0, 1, 2, 3, 4];
    let _ = slice[x - 1];
}

The SSA before this PR is this:

acir(inline) fn main f0 {
  b0(v0: Field):
    v24 = sub v0, Field 1
    v25 = cast v24 as u32
    v26 = lt v25, u32 5
    constrain v26 == u1 1 '"Index out of bounds"'
    return
}

The new SSA is this:

acir(inline) fn main f0 {
  b0(v0: Field):
    v24 = sub v0, Field 1
    v25 = cast v24 as u32
    v26 = lt v25, u32 5
    constrain v26 == u1 1 '"Index out of bounds"'
    v27 = array_get [Field 0, Field 1, Field 2, Field 3, Field 4], index v25
    return
}

So two things:

  1. It seems we already insert a constrain to see if we are in-bounds
  2. here v25 will always be in-bounds because of the previous constrain, so the next array_get could be removed

What's interesting is that this program has a different SSA:

fn main(x: Field) {
    let mut slice = &[0, 1, 2, 3, 4];
    slice[x - 1] = 1;
}

The SSA before this PR is this one (empty!):

acir(inline) fn main f0 {
  b0(v0: Field):
    return
}

In this PR it's this:

acir(inline) fn main f0 {
  b0(v0: Field):
    v23 = sub v0, Field 1
    v24 = cast v23 as u32
    v25 = array_set mut [Field 0, Field 1, Field 2, Field 3, Field 4], index v24, value Field 1
    return
}

I was looking at our code and there's codegen_slice_access_check but (according to the above SSA's) it seems we don't do that check for set, only for get. Maybe that's the main issue? (I'll check)

@asterite
Copy link
Collaborator Author

However... that constrain only seems to be inserted for slices, not for arrays... not sure why.

@vezenovm
Copy link
Contributor

However... that constrain only seems to be inserted for slices, not for arrays... not sure why.

Discussed on a call, but posting here visibility as well.

We currently only code gen a length constrain on slices due to their lengths being dynamic.

I see a couple options:

  1. We insert these assertions for array accesses, not just slices.
  2. We do not eliminate possible OOB errors as this PR currently does.

I lean towards option 2 as although it may cause some larger programs, it should only cause larger programs where we have unused array gets/sets. I think it is ok to say that a developer should remove these themselves. We can add an unused variable warning array gets/sets that are later unused, and then in the long-term perhaps we can have a "release" mode that removes these automatically.

@jfecher
Copy link
Contributor

jfecher commented Jul 29, 2024

I lean towards option 2 as although it may cause some larger programs, it should only cause larger programs where we have unused array gets/sets.

I think unused gets & sets are more widespread than you think since in the 3 slices tests that increased in size in the CI this leads to a 3-5% increase in constraint counts which seems significant.

I'm leaning towards 1 currently so that we treat arrays & slices the same, then continue to treat get/set as pure. I'd be interested in the constraint count difference but hopefully it shouldn't be much since we already perform this check just within the get/set.

@asterite
Copy link
Collaborator Author

Just a note: originally the PR kept array_get on slices if the index might be out of bounds. Then I noticed those already have a bounds check, so now those are removed (like before). However, I didn't see another "Changes to circuit sizes" comment. I think the diff now should be smaller (though still not the same as before). Is there a way to force that comment to be produced?

@jfecher
Copy link
Contributor

jfecher commented Jul 29, 2024

@asterite it is automatically updated with each commit you make as part of the "Report gates diff" CI action

@asterite
Copy link
Collaborator Author

To summarize all the scenarios... here are four scenarios and whether they correctly give you "index out of bounds".

Array get

Known index out of bounds

fn main() {
    let array = [1, 2, 3];
    let _ = array[10];
}

Expected: index out of bounds
Actual: no error ❌

Unknown index

fn main(x: Field) {
    let array = [1, 2, 3];
    let _ = array[x]; // x is 10 in Prover.toml
}

Expected: index out of bounds
Actual: no error ❌

Array set

Known index out of bounds

fn main() {
    let mut array = [1, 2, 3];
    array[10] = 42;
}

Expected: index out of bounds
Actual: no error ❌

Unknown index

fn main(x: Field) {
    let mut array = [1, 2, 3];
    array[x] = 42; // x is 10 in Prover.toml
}

Expected: index out of bounds
Actual: no error ❌

Slice get

Known index out of bounds

fn main() {
    let slice = &[1, 2, 3];
    let _ = slice[10];
}

Expected: index out of bounds
Actual: index out of bounds ✅

Unknown index

fn main(x: Field) {
    let slice = &[1, 2, 3];
    let _ = slice[x]; // x is 10 in Prover.toml
}

Expected: index out of bounds
Actual: index out of bounds ✅

Slice set

Known index out of bounds

fn main() {
    let mut slice = &[1, 2, 3];
    slice[10] = 42;
}

Expected: index out of bounds
Actual: no error ❌

Unknown index

fn main(x: Field) {
    let mut slice = &[1, 2, 3];
    slice[x] = 42; // x is 10 in Prover.toml
}

Expected: index out of bounds
Actual: no error ❌

Conclusion

I'm not sure how to proceed here. It seems a lot of checks are missing but I don't know if they should go into SSA or acir-gen.

@jfecher
Copy link
Contributor

jfecher commented Jul 31, 2024

I'm not sure how to proceed here. It seems a lot of checks are missing but I don't know if they should go into SSA or acir-gen.

I think the way to proceed is to always insert asserts on the array length like we do with slices, and keep treating the actual get & set as having no side effects. That way we can still optimize them out while leaving in the assert. Asserts with known-true conditions are also already optimized out.

@asterite
Copy link
Collaborator Author

asterite commented Aug 2, 2024

Closed in favor of #5676

@asterite asterite closed this Aug 2, 2024
@asterite asterite deleted the ab/do-not-remove-array-get-out-of-bounds branch August 2, 2024 17:10
@asterite asterite restored the ab/do-not-remove-array-get-out-of-bounds branch August 5, 2024 19:37
@asterite asterite reopened this Aug 5, 2024
@asterite
Copy link
Collaborator Author

Closing because we chose #5691 instead.

@asterite asterite closed this Aug 26, 2024
@asterite asterite deleted the ab/do-not-remove-array-get-out-of-bounds branch August 26, 2024 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dead Instruction Elimination pass removes out of bounds array sets
3 participants