Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: improvements to terms and built-in functions #7284

Merged

Conversation

anderseknert
Copy link
Member

@anderseknert anderseknert commented Jan 20, 2025

Having worked on performance improvements in OPA on the side for almost a month now, there's a lot of code piling up 😅 So much that a single PR would be way too much to review. Instead, I'm splitting the work into chunks, and will submit the next PR as soon as this one is merged. Using the same benchmark as before — Regal linting itself, these new changes in total reduce the number of allocations by ~13 million, and quite a substantial amount of evaluation time saved as well.

This first PR is isolated to improvements to terms, values and built-ins, and saves ~3M allocations. The details can be found below for each change, and of course in the code :)

BenchmarkRegalLintingItself-10 Before

1885978209 ns/op    3497157312 B/op    69064779 allocs/op

BenchmarkRegalLintingItself-10 After

1796255084 ns/op    3452379408 B/op    66126623 allocs/op

Terms

  • Use pointer receivers consistently for object and set types. This allows changing the sortGuard once lock from a pointer to a non-pointer type, which is really the biggest win performance-wise in this PR.
  • Comparisons happen all the time, so make sure these take the shortest path possible whenever, possible, such as when one type is compared to another value of the same type.

Built-in functions:

Arrays

  • Both array.concat and array.slice will now return the operand on operations where the result isn't different from the input operand (like when concatenating an empty array) instead of allocating a new term/value.

Strings

  • Return operand on unchanged result rather than allocating new term/value.
  • Where applicable, have functions take a cheaper path when string is ASCII and we can avoid the cost of rune conversion.

Crypto

  • Hashing functions now optimized, spending less than half the time compared to previously.

Objects

  • Avoid heap allocating result boolean escaping its scope, and instead use the return value of the Until function.

HTTP

  • Use interned terms for keys in configuration object, avoding allocating these each time http.send is invoked.

Globs

  • Use read/write lock to avoid contention. Use package level vars for "constant" values, avoiding them to escape to the heap each invocation.

Not directly/only related to built-in functions

  • Add ValueName function replacing the previous TypeName functions for getting the name of Value's without paying for any interface allocations.
  • Add a few more interned terms.

@anderseknert anderseknert force-pushed the builtin-perf-improvements branch from c0d1349 to cd1816f Compare January 20, 2025 14:55
Having worked on performance improvements in OPA on the side for almost
a month now, there's a lot of code piling up 😅 So much that a single PR
would be way too much to review. Instead, I'm splitting the work into
chunks, and will submit the next PR as soon as this one is merged. Using
the same benchmark as before — Regal linting itself, these new changes
in total reduce the number of allocations by ~13 million, and quite a
substantial amount of evaluation time saved as well.

This first PR is isolated to improvements to terms, values and
built-ins, and saves ~3M allocations. The details can be found below for
each change, and of course in the code :)

**BenchmarkRegalLintingItself-10 Before**
```
1885978209 ns/op    3497157312 B/op    69064779 allocs/op
```
**BenchmarkRegalLintingItself-10 After**
```
1796255084 ns/op    3452379408 B/op    66126623 allocs/op
```

**Terms**
- Use pointer receivers consistently for object and set types. This allows
  changing the sortGuard once lock from a pointer to a non-pointer type, which
  is really the biggest win performance-wise in this PR.
- Comparisons happen all the time, so make sure these take the shortest path
  possible whenever, possible, such as when one type is compared to another
  value of the same type.

Built-in functions:

**Arrays**
- Both `array.concat` and `array.slice` will now return the operand on operations
  where the result isn't different from the input operand (like when concatenating
  an empty array) instead of allocating a new term/value.

**Strings**
- Return operand on unchanged result rather than allocating new term/value.
- Where applicable, have functions take a cheaper path when string is ASCII
  and we can avoid the cost of rune conversion.

**Crypto**
- Hashing functions now optimized, spending less than half the time compared to
  previously.

**Objects**
- Avoid heap allocating result boolean escaping its scope, and instead use the
  return value of the `Until` function.

**HTTP**
- Use interned terms for keys in configuration object, avoding allocating these
  each time `http.send` is invoked.

**Globs**
- Use read/write lock to avoid contention. Use package level vars for "constant"
  values, avoiding them to escape to the heap each invocation.

**Not directly/only related to built-in functions**
- Add `ValueName` function replacing the previous `TypeName` functions for
  getting the name of Value's without paying for `any` interface allocations.
- Add a few more interned terms.

Signed-off-by: Anders Eknert <anders@styra.com>
@anderseknert anderseknert force-pushed the builtin-perf-improvements branch from cd1816f to 32cf0a3 Compare January 20, 2025 15:38
}

default:
return builtins.NewOperandTypeErr(2, operands[1].Value, "set", "array")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. Since we call strings.Join below anyways, which uses strings.Builder{}, I suppose we could go with strings.Builder right away. But it probably doesn't make much of a difference, and this is cleaner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bad idea. I might try that later, but will leave as is for now.

@@ -438,36 +439,43 @@ func builtinCryptoParsePrivateKeys(_ BuiltinContext, operands []*ast.Term, iter
return iter(ast.NewTerm(value))
}

func hashHelper(a ast.Value, h func(ast.String) string) (ast.Value, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good riddance

@anderseknert anderseknert merged commit 75962f5 into open-policy-agent:main Jan 20, 2025
28 checks passed
@anderseknert anderseknert deleted the builtin-perf-improvements branch January 20, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants