StackOverflow crash - large regex produced by Discover filter not limited by index.max_regex_length #82923
Comments
Pinging @elastic/es-search (Team:Search)
Pinging @elastic/es-analytics-geo (Team:Analytics)
I'm not a fan of the regex ctor being recursive. But it is..... Our standard defense against stuff like this is
I had a look at this. Right now we build the
In order to fix an error where large regexes in `include` or `exclude` fields of the `terms` agg crash the node (#82923) I'd like to centralize construction of the `RegExp` so we can test it for large-ness in one spot. The trouble is, there are half a dozen ctors for `IncludeExclude`: some take `String`, some take `RegExp`, some take sets of `String`, and some take sets of `BytesRef`. It's all very convenient for client code, but confusing to deal with. This removes all but two of the ctors for `IncludeExclude` and mostly standardizes on one that has:
```
String includeRe, String excludeRe, Set<BytesRef> includePrecise, Set<BytesRef> excludePecise
```
Now I can fix #82923 in a fairly simple follow-up.
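For readers following along, here is a minimal sketch of the kind of length check that centralizing construction makes possible; the helper name `compileRegex`, its signature, and the exact error message are assumptions for illustration, not the actual Elasticsearch code:
```java
import org.apache.lucene.util.automaton.RegExp;

// Hypothetical helper: reject over-long include/exclude patterns before they
// reach Lucene's recursive RegExp parser, mirroring the check that regexp
// queries already apply via the index.max_regex_length setting.
static RegExp compileRegex(String pattern, int maxRegexLength) {
    if (pattern.length() > maxRegexLength) {
        throw new IllegalArgumentException(
            "The length of regex [" + pattern.length() + "] used in the [include]/[exclude] of a terms "
                + "aggregation exceeds the allowed maximum of [" + maxRegexLength + "]; "
                + "this maximum can be changed via the [index.max_regex_length] index setting");
    }
    return new RegExp(pattern);
}
```
With all `IncludeExclude` construction funneled through a single path like this, the limit only has to be enforced in one spot.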
#84592 does not close this issue.
We tracked down what looks like this exact problem in OpenSearch; here's a fix that enforces the limit: opensearch-project/OpenSearch#2810.
Thanks for the ping! @ywelsh, I think you have a fix for this open.
I found #84624 just now, but that doesn't fix the problem; it wraps the code to avoid the node crash.
Elasticsearch version (`bin/elasticsearch --version`): 7.10.2
Plugins installed: [repository-s3]
JVM version (`java -version`):
openjdk version "15.0.1" 2020-10-20
OpenJDK Runtime Environment AdoptOpenJDK (build 15.0.1+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 15.0.1+9, mixed mode, sharing)
OS version (`uname -a` if on a Unix-like system): Debian Stretch
Description of the problem including expected versus actual behavior:
It seems to be possible to crash several Elasticsearch nodes by providing a very large string when attempting to filter on a field value (a StackOverflowError in regexp processing). When filtering on a field value, a query containing a 'suggestions' aggregation is sent to the cluster in the background before the filter is saved, in order to populate an autocomplete drop-down. This aggregation includes a regex constructed by taking the large string and appending ".*". The resulting regexp does not seem to respect the default index.max_regex_length limit of 1000; the query is submitted and instantly crashes nodes.
Caused by: java.lang.StackOverflowError at org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1209) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
This might seem like an unlikely scenario, but we have seen end users of Kibana trigger this bug several times (for example, by accidentally pasting copied output from a data table visualization into the 'value' field when filtering further).
Steps to reproduce:
For a test I pasted 50k characters; I haven't done extensive testing to find the exact size at which the crash occurs.
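For reference, the underlying parser recursion can be reproduced outside a cluster with nothing but Lucene on the classpath. A rough sketch (assuming lucene-core 8.7.0, Java 11+, and a default-sized thread stack; the 50,000-character input mirrors the pasted value described above):
```java
import org.apache.lucene.util.automaton.RegExp;

public class RegexStackOverflowRepro {
    public static void main(String[] args) {
        // Stand-in for a very large value pasted into the Discover filter field.
        String pasted = "x".repeat(50_000);
        // The suggestions aggregation effectively appends ".*" to the pasted value.
        // RegExp parsing recurses once per concatenated character, so a long enough
        // input overflows the stack before any length limit is consulted.
        new RegExp(pasted + ".*");
    }
}
```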
Provide logs (if relevant):
Full crash logs (fatal-error.log) attached.
fatal-error.log
Example suggestions aggregation containing offending regex attached.
suggestions-query.txt
Unfortunately, disabling expensive queries is not an option for us, as we have users who depend on regex and script query functionality. If you have any suggestions on how best to prevent this issue from recurring, that would be great.
Any questions, shout.
Cheers