-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance Degrade since v0.67.0 on glob.match with lots of patterns #6908
Comments
Thanks for that detailed report! A contribution would be most welcome 😃 |
Awesome! Making the cache size configurable may not be ideal. You could do it via the context but does not seem correct. @johanfylling may know the reason for setting the default to Currently we remove only one value from the cache, may be that can be improved. |
@ashutosh-narkar setting the default to any number may not be ideal, because the default rule or bundle static data maybe beyond that, and that can cause the same issue |
Sure. The thing we probably need to understand is what is the norm here. Whatever value is selected there are always going to be some cases where we see performance issues. The question is then if that's an edge case, which is not easy to figure out. Since we cannot make the value configurable, looking into improving caching behavior would be a possible solution. WDYT @johanfylling? |
Setting the cache size to In the original OOM fix, we opted to not use a more sophisticated eviction strategy than to simply remove a random entry from the cache. An option could be to revisit this, and instead evict maybe the least used entry in the cache (just as an example). Anything complicated here that require us to scan or reorder the cache would of course come with it's own performance penalty. But from your issue description, it sounds like the number of patterns might still be too large for this to be a viable solution. An argument for making the cache size configurable through config-file/env-var would be that whomever is launching OPA is likely also the person/team to know the memory limitations of the host machine. |
What about re-using the caching package? It has configurable cache size limit in bytes which is better than a counter. |
Hi again, I read a little about glob library, maybe we should change our view of the matter at hand. The other solution that I was thinking about was taking another argument from the user in the function glob.match called persist, it’s disabled by default but if the user enables it the compiled glob will always get persisted in another map no matter what, without a cap! This brings back the OOM problem |
Raising the glob cache size to somewhere in the thousands should be benign. What would be a suitable limit for your use case @amirsalarsafaei? We still might want this to be configurable, though 🤔. |
@johanfylling I actually took a shortcut by creating a template file and replacing glob match with some thing like this input.parsed_path == ["v1", "users", input.parsed_path[2]] input.parsed_path is path split by |
@amirsalarsafaei, for paths that don't require mid-point |
Raising the cache limit too far would bring us back to the memory issue previously solved. I think we should aim for a solution that accommodates most scenarios without being destructive. OOM being a halting issue, and cache overflow "just" being a performance issue, I'm inclined to favor the OOM issue. I suggest we move the glob cache to the As extra credits, we could even add something like a |
This commit adds a new inter-query value cache that built-in functions can use to cache information across queries. For example, the `regex` and `glob` builtins can use this to cache compiled regex and glob match patterns respectively. The number of entries in the cache can be configured via the OPA config. By default there is no limit. Fixes: open-policy-agent#6908 Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
This commit adds a new inter-query value cache that built-in functions can use to cache information across queries. For example, the `regex` and `glob` builtins can use this to cache compiled regex and glob match patterns respectively. The number of entries in the cache can be configured via the OPA config. By default there is no limit. Fixes: open-policy-agent#6908 Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
This commit adds a new inter-query value cache that built-in functions can use to cache information across queries. For example, the `regex` and `glob` builtins can use this to cache compiled regex and glob match patterns respectively. The number of entries in the cache can be configured via the OPA config. By default there is no limit. Fixes: open-policy-agent#6908 Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
This commit adds a new inter-query value cache that built-in functions can use to cache information across queries. For example, the `regex` and `glob` builtins can use this to cache compiled regex and glob match patterns respectively. The number of entries in the cache can be configured via the OPA config. By default there is no limit. Fixes: open-policy-agent#6908 Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
Short description
upgrading opa from version v0.66.0 to v0.67.0 can cause performance hits due to the solution in #6846.
imagine this policy:
I have a fixed large number of path so memory leak isn't a concern here but the cap causes re calculation of the regex pattern and in result a significant performance loss. I was able to achieve 25-50% better latency in 99.99% percentile increasing the cap size to my data size (benchmarks are attached at the end).
I can think of 2 possible solutions:
I'd be open to work on a PR and a solution I actually have done one on opa-envoy-plugin and would like to contribute to the community:)
Benchmark Results:
command:
bench --bundle bundle-test.tar.gz --input input.json 'data.smth.authz.allow' --e2e --count 10
*the bundle is created with optimization level of 0 because using optimization caused more than 100x latency but that's another issue that I'm investigating:D
Opa v0.67.0:
Modified Version:
The text was updated successfully, but these errors were encountered: