EQL: Revisit case insensitivity #61883
Comments
Pinging @elastic/es-ql (:Query Languages/EQL) |
I'm not a big fan of |
That to me is confusing. == is exact/case sensitive and ~= inexact/fuzzy/insensitive; not the other way around. And if that's the case, there's no need to have a per-string function sensitivity switch. |
And then we have the other issue of having a wide parameter for case sensitivity that will not apply for comparisons and sorting... |
We can rename the parameter to better indicate its scope. I don't like the idea of a query-wide parameter; however, it is some kind of option for changing behavior. |
There is one other significant advantage to not having wide (per-request) values, imo: flexibility. With a request-wide value, a case-sensitive equals and a case-insensitive equals could not be used in the same query. One could argue that EQL is case insensitive by default and that the number of scenarios where both are used is small. But if there is a scenario where a user wants a case-sensitive equals, imo that scenario is likely to involve at least one other operation that's case insensitive. |
If we aim to use EQL (or sub parts of it) in other context, such as Elastic Agent (elastic/beats#20994) or Kibana, I think there is no other way than to build it into the language while keeping the existing behavior around through a flag (like we do today) that we should consider deprecated. As mentioned before, operators do not allow switches and since
Introduce case-sensitivity param to string comparison functions where this makes sense. Currently this affects
This means queries such as P.S. It's also worth considering dropping |
I think custom overloaded operators is a strange leak into the syntax, and I don't understand the appeal. The global toggle should be sufficient. I'm strongly against this I don't see any problem continuing to support I think we need to continue keeping the end user in mind, and I'm not so sure that |
After discussing this topic again in our meeting, it was decided that EQL in Elasticsearch will be case insensitive where needed, without offering an option to turn this off. If/when the topic of case sensitivity arises, we'll deal with it then. Raised #62255 to track down code changes. Thanks for the discussions folks! |
Opening this ticket again after the comments on #62255 |
Given that we'll be supporting runtime fields in ES soon, should we reserve |
Closing per #62255 (comment) |
@sethpayne Let's move the conversation over to #62650 |
Elasticsearch is case sensitive by default. EQL, on the other hand, strives to be case insensitive, since matching strings across different OSs is not straightforward (some are insensitive, some aren't).
Hence string equality and non-equality are case insensitive by default.
The current approach requires usage of scripting for functions that are case sensitive and is not fully supported for equality/non-equality. We could expand this to the rest of the operators (like >, >=, etc.) but considering this is a rare occurrence for strings, for the time being the scope is on == and !=.
Using operators
Extending the == operator to be case aware is convenient but also quite impactful. That's because in all languages == is an exact equality: John == john is false. That is, everything is case sensitive and insensitivity needs to be added on top.
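To make the point concrete outside of EQL, here is a minimal sketch in Python (used purely as an illustration, not as part of the proposal). Equality is exact, and insensitivity is layered on top by normalizing both sides. The helper name `equals_insensitive` is made up for this example.

```python
def equals_insensitive(left: str, right: str) -> bool:
    """Hypothetical helper: compare two strings while ignoring case."""
    # casefold() is Python's lowercasing variant intended for caseless matching.
    return left.casefold() == right.casefold()


exact = "John" == "john"                       # plain == is an exact comparison
relaxed = equals_insensitive("John", "john")   # insensitivity added on top
print(exact, relaxed)
```

The plain comparison yields False and the helper yields True, which is the "added on top" behavior described above.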
Either default (sensitive or insensitive) has pros and cons, and having a flag that can change the behavior is the ideal way. Currently there is a default through the case_sensitive parameter, which can be kept, though it would have to be renamed since it's only the equality that we're after, so case_sensitive --> case_sensitive_equality.
The issue with this type of parameter is that all string comparisons have the same sensitivity. Potentially we can introduce a dedicated operator such as ~= or ~== to indicate a case insensitive comparison.
The pro of this approach is that there are defined scopes; the downside is that it might be too subtle for folks to pick up.
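A rough sketch of how the two ideas could coexist, assuming == follows the request-wide flag while a dedicated ~= operator is always case insensitive. This is written in Python only to show the scoping; the function `compare` and its signature are hypothetical and do not reflect how EQL is implemented.

```python
def _exact(a: str, b: str) -> bool:
    return a == b


def _caseless(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()


def compare(op: str, left: str, right: str,
            case_sensitive_equality: bool = False) -> bool:
    """Evaluate a single string comparison (illustrative only).

    '==' follows the request-wide flag, '~=' is always case insensitive.
    """
    if op == "~=":
        return _caseless(left, right)
    if op == "==":
        if case_sensitive_equality:
            return _exact(left, right)
        return _caseless(left, right)
    raise ValueError(f"unsupported operator: {op}")
```

With the flag switched on, `compare("==", "cmd.exe", "CMD.EXE", True)` is False while `compare("~=", "cmd.exe", "CMD.EXE", True)` is True, so both sensitivities can appear in the same request. A flag alone cannot express that.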
Wrapping functions
Another option would be to use some kind of function, say insensitive(foo == bar) or sensitive(foo == bar), which is a more verbose way of supporting == and ~= and offering both sensitivities regardless of the global setting.
Impact on functions
As described in #61162, case insensitivity will be an option on a limited number of queries. Currently this translates to:
- term and terms
- startsWith (prefix query)
- match, wildcard (wildcard query)
It's worth revising the semantics of insensitivity over all the functions, in particular:
- between, indexOf
- endsWith - might be rewritten to a wildcard query
- stringContains - wildcard again
As a last note, a global setting will affect both the operators and the functions. This means that if we need scoping (functions as well as operators with both types of sensitivity), we need to introduce either dedicated switches or wrapping functions.
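One way to picture that scoping is sketched below in Python, under the assumption that a wrapper pins the sensitivity of whatever it wraps and that unwrapped expressions fall back to the global setting. The `Comparison` type and the `sensitive`/`insensitive`/`evaluate` helpers are hypothetical names for this illustration, not existing EQL or Elasticsearch APIs.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Comparison:
    kind: str                              # "equals" or "starts_with"
    left: str
    right: str
    case_sensitive: Optional[bool] = None  # None means: defer to the global setting


def sensitive(expr: Comparison) -> Comparison:
    """Pin the wrapped expression to case-sensitive evaluation."""
    return replace(expr, case_sensitive=True)


def insensitive(expr: Comparison) -> Comparison:
    """Pin the wrapped expression to case-insensitive evaluation."""
    return replace(expr, case_sensitive=False)


def evaluate(expr: Comparison, global_case_sensitive: bool = False) -> bool:
    # An explicit wrapper wins; otherwise the global setting applies.
    if expr.case_sensitive is not None:
        effective = expr.case_sensitive
    else:
        effective = global_case_sensitive
    left, right = expr.left, expr.right
    if not effective:
        left, right = left.casefold(), right.casefold()
    if expr.kind == "equals":
        return left == right
    if expr.kind == "starts_with":
        return left.startswith(right)
    raise ValueError(f"unsupported comparison: {expr.kind}")
```

Here the same wrapper applies to an operator (`equals`) and to a function (`starts_with`), so one mechanism covers both, at the cost of more verbose queries.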
My proposal is to look at the case insensitive usage of functions in existing queries and where needed, try to retrofit them onto the existing queries. While we might not cover all possible options, the vast majority of cases / rules might be covered.
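As an example of such a retrofit, the endsWith and stringContains rewrites mentioned above might look roughly like the following Python sketch, which only builds the query body as a dictionary. It assumes the wildcard query accepts a case_insensitive option as described in #61162, and that * and ? are the wildcard characters with backslash as the escape. The function names and the field name are made up for illustration.

```python
def escape_wildcard(literal: str) -> str:
    """Backslash-escape characters assumed to be special in a wildcard pattern."""
    return "".join("\\" + ch if ch in "*?\\" else ch for ch in literal)


def ends_with_query(field: str, suffix: str, case_insensitive: bool = True) -> dict:
    """Hypothetical rewrite of endsWith(field, suffix) to a wildcard query body."""
    return {
        "wildcard": {
            field: {
                "value": "*" + escape_wildcard(suffix),
                # Assumed option, per the discussion in #61162.
                "case_insensitive": case_insensitive,
            }
        }
    }


def string_contains_query(field: str, substring: str, case_insensitive: bool = True) -> dict:
    """Hypothetical rewrite of stringContains(field, substring) to a wildcard query body."""
    return {
        "wildcard": {
            field: {
                "value": "*" + escape_wildcard(substring) + "*",
                "case_insensitive": case_insensitive,
            }
        }
    }


query = ends_with_query("process.name", ".exe")
```

The escaping step matters because a literal * or ? in the user's string would otherwise change the meaning of the pattern.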