Perf: improvements to terms and built-in functions #7284

anderseknert · 2025-01-20T14:46:13Z

Having worked on performance improvements in OPA on the side for almost a month now, there's a lot of code piling up 😅 So much that a single PR would be way too much to review. Instead, I'm splitting the work into chunks, and will submit the next PR as soon as this one is merged. Using the same benchmark as before — Regal linting itself, these new changes in total reduce the number of allocations by ~13 million, and quite a substantial amount of evaluation time saved as well.

This first PR is isolated to improvements to terms, values and built-ins, and saves ~3M allocations. The details can be found below for each change, and of course in the code :)

BenchmarkRegalLintingItself-10 Before

1885978209 ns/op    3497157312 B/op    69064779 allocs/op

BenchmarkRegalLintingItself-10 After

1796255084 ns/op    3452379408 B/op    66126623 allocs/op

Terms

Use pointer receivers consistently for object and set types. This allows changing the sortGuard once lock from a pointer to a non-pointer type, which is really the biggest win performance-wise in this PR.
Comparisons happen all the time, so make sure these take the shortest path possible whenever, possible, such as when one type is compared to another value of the same type.

Built-in functions:

Arrays

Both array.concat and array.slice will now return the operand on operations where the result isn't different from the input operand (like when concatenating an empty array) instead of allocating a new term/value.

Strings

Return operand on unchanged result rather than allocating new term/value.
Where applicable, have functions take a cheaper path when string is ASCII and we can avoid the cost of rune conversion.

Crypto

Hashing functions now optimized, spending less than half the time compared to previously.

Objects

Avoid heap allocating result boolean escaping its scope, and instead use the return value of the Until function.

HTTP

Use interned terms for keys in configuration object, avoding allocating these each time http.send is invoked.

Globs

Use read/write lock to avoid contention. Use package level vars for "constant" values, avoiding them to escape to the heap each invocation.

Not directly/only related to built-in functions

Add ValueName function replacing the previous TypeName functions for getting the name of Value's without paying for any interface allocations.
Add a few more interned terms.

v1/ast/term.go

Having worked on performance improvements in OPA on the side for almost a month now, there's a lot of code piling up 😅 So much that a single PR would be way too much to review. Instead, I'm splitting the work into chunks, and will submit the next PR as soon as this one is merged. Using the same benchmark as before — Regal linting itself, these new changes in total reduce the number of allocations by ~13 million, and quite a substantial amount of evaluation time saved as well. This first PR is isolated to improvements to terms, values and built-ins, and saves ~3M allocations. The details can be found below for each change, and of course in the code :) **BenchmarkRegalLintingItself-10 Before** ``` 1885978209 ns/op 3497157312 B/op 69064779 allocs/op ``` **BenchmarkRegalLintingItself-10 After** ``` 1796255084 ns/op 3452379408 B/op 66126623 allocs/op ``` **Terms** - Use pointer receivers consistently for object and set types. This allows changing the sortGuard once lock from a pointer to a non-pointer type, which is really the biggest win performance-wise in this PR. - Comparisons happen all the time, so make sure these take the shortest path possible whenever, possible, such as when one type is compared to another value of the same type. Built-in functions: **Arrays** - Both `array.concat` and `array.slice` will now return the operand on operations where the result isn't different from the input operand (like when concatenating an empty array) instead of allocating a new term/value. **Strings** - Return operand on unchanged result rather than allocating new term/value. - Where applicable, have functions take a cheaper path when string is ASCII and we can avoid the cost of rune conversion. **Crypto** - Hashing functions now optimized, spending less than half the time compared to previously. **Objects** - Avoid heap allocating result boolean escaping its scope, and instead use the return value of the `Until` function. **HTTP** - Use interned terms for keys in configuration object, avoding allocating these each time `http.send` is invoked. **Globs** - Use read/write lock to avoid contention. Use package level vars for "constant" values, avoiding them to escape to the heap each invocation. **Not directly/only related to built-in functions** - Add `ValueName` function replacing the previous `TypeName` functions for getting the name of Value's without paying for `any` interface allocations. - Add a few more interned terms. Signed-off-by: Anders Eknert <anders@styra.com>

srenatus · 2025-01-20T15:50:59Z

v1/topdown/strings.go

 		}
+
 	default:
 		return builtins.NewOperandTypeErr(2, operands[1].Value, "set", "array")
 	}


Hm.. Since we call strings.Join below anyways, which uses strings.Builder{}, I suppose we could go with strings.Builder right away. But it probably doesn't make much of a difference, and this is cleaner.

Not a bad idea. I might try that later, but will leave as is for now.

srenatus · 2025-01-20T15:54:01Z

v1/topdown/crypto.go

@@ -438,36 +439,43 @@ func builtinCryptoParsePrivateKeys(_ BuiltinContext, operands []*ast.Term, iter
 	return iter(ast.NewTerm(value))
 }

-func hashHelper(a ast.Value, h func(ast.String) string) (ast.Value, error) {


good riddance

anderseknert force-pushed the builtin-perf-improvements branch from c0d1349 to cd1816f Compare January 20, 2025 14:55

srenatus reviewed Jan 20, 2025

View reviewed changes

v1/ast/term.go Outdated Show resolved Hide resolved

anderseknert force-pushed the builtin-perf-improvements branch from cd1816f to 32cf0a3 Compare January 20, 2025 15:38

srenatus reviewed Jan 20, 2025

View reviewed changes

srenatus approved these changes Jan 20, 2025

View reviewed changes

anderseknert merged commit 75962f5 into open-policy-agent:main Jan 20, 2025
28 checks passed

anderseknert deleted the builtin-perf-improvements branch January 20, 2025 16:41

BrewTestBot mentioned this pull request Jan 27, 2025

opa 1.1.0 Homebrew/homebrew-core#205655

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf: improvements to terms and built-in functions #7284

Perf: improvements to terms and built-in functions #7284

anderseknert commented Jan 20, 2025 •

edited

Loading

srenatus Jan 20, 2025

anderseknert Jan 20, 2025

srenatus Jan 20, 2025

Perf: improvements to terms and built-in functions #7284

Perf: improvements to terms and built-in functions #7284

Conversation

anderseknert commented Jan 20, 2025 • edited Loading

srenatus Jan 20, 2025

Choose a reason for hiding this comment

anderseknert Jan 20, 2025

Choose a reason for hiding this comment

srenatus Jan 20, 2025

Choose a reason for hiding this comment

anderseknert commented Jan 20, 2025 •

edited

Loading