Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EQL: Fix cidrMatch function fails to match when used in scripts #56246

Merged
merged 10 commits into from
May 14, 2020

Conversation

aleksmaus
Copy link
Contributor

Replaced the previous implementation with script function.
Addresses #55709

@astefan I think this will conflict with your changes on query folding with equal operator optimization, it generates a different script with function now.

@aleksmaus aleksmaus added >bug :Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team labels May 5, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (:Query Languages/EQL)

Comment on lines 108 to 113
Object o = field.fold();
ArrayList<Object> arr = new ArrayList<>(addresses.size());
for (Expression address : addresses) {
final Equals eq = new Equals(source(), field, address);
func = (func == null) ? eq : new Or(source(), func, eq);
arr.add(address.fold());
}
return doProcess(o, arr);
Copy link
Contributor

@rw-access rw-access May 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Object o = field.fold();
ArrayList<Object> arr = new ArrayList<>(addresses.size());
for (Expression address : addresses) {
final Equals eq = new Equals(source(), field, address);
func = (func == null) ? eq : new Or(source(), func, eq);
arr.add(address.fold());
}
return doProcess(o, arr);
return doProcess(field.fold(), Expressions.fold(addresses));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

public ScriptTemplate asScript() {
ScriptTemplate leftScript = asScript(field);

List<Object> values = new ArrayList<>(new LinkedHashSet<>(foldListOfValues(addresses, field.dataType())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
List<Object> values = new ArrayList<>(new LinkedHashSet<>(foldListOfValues(addresses, field.dataType())));
List<Object> values = Expressions.fold(addresses)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not eliminate duplicated values as far as I see

@@ -156,22 +156,29 @@ InternalEqlScriptUtils.concat([InternalQlScriptUtils.docValue(doc,params.v0),par
cidrMatchFunctionOne
process where cidrMatch(source_address, "10.0.0.0/8")
;
"term":{"source_address":{"value":"10.0.0.0/8"
"script":{"source":"InternalQlScriptUtils.nullSafeFilter(InternalEqlScriptUtils.cidrMatch(
Copy link
Contributor

@rw-access rw-access May 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add an optimizer rule to handle this so that we don't lose the more performant term query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will probably have to do another pass for that after @astefan merges his PR that also will affect the query planning optimization for simple equals. Can adjust this PR if that change merges sooner, sometimes this week.

@aleksmaus
Copy link
Contributor Author

Updated PR to use term query for cidrMatch when possible otherwise fallback to script.
Thanks @astefan for pointers!

if (cm.field() instanceof FieldAttribute && Expressions.foldable(cm.addresses())) {
String targetFieldName = handler.nameOf(((FieldAttribute) cm.field()).exactAttribute());

Set<Object> set = new LinkedHashSet<>(CollectionUtils.mapSize(cm.addresses().size()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this worthwhile to add as a method? It's also called within CidrMatch.asScript

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly, needed Set here for TermsQuery constructor and List for the scripting func

new ExpressionTranslators.StringQueries(),
new ExpressionTranslators.Matches(),
new ExpressionTranslators.MultiMatches(),
new Scalars()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preferred without static importing everything here, more visible QL translators vs EQL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A relevant aspect here: you might want to update your branch from master, to incorporate the StartsWith changes from #56404. In that PR, the Scalars from QL was updated to have startsWith use a PrefixQuery whenever possible. And for EQL and its QueryTranslator that you added, it will probably need the Scalars from QL, as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, and adjusted Scalar translation in EQ to be able to reuse it for startsWith function

rw-access
rw-access approved these changes May 12, 2020
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general, but I have some comments as well.


public static boolean isInRange(String address, String... cidrAddresses) {
// Check if address is parsable first
byte[] addr = InetAddresses.forString(address).getAddress();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this code is in EQL, maybe we should handle the IllegalArgumentException here and wrap it in EqlIllegalArgumentException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

new ExpressionTranslators.StringQueries(),
new ExpressionTranslators.Matches(),
new ExpressionTranslators.MultiMatches(),
new Scalars()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A relevant aspect here: you might want to update your branch from master, to incorporate the StartsWith changes from #56404. In that PR, the Scalars from QL was updated to have startsWith use a PrefixQuery whenever possible. And for EQL and its QueryTranslator that you added, it will probably need the Scalars from QL, as well.

import org.elasticsearch.xpack.ql.type.DataType;
import org.elasticsearch.xpack.ql.type.DataTypeConverter;

public class EqlTranslatorHandler implements TranslatorHandler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is the same as QlTranslatorHandler except asQuery method. Maybe you could reuse most of the code from QlTranslatorHandler and only change the relevant bit in EqlTranslatorHandler. For example, why not extending QlTranslatorHandler and override asQuery only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

process where cidrMatch(source_address, "10.0.0.0/8") == true
;
{"bool":{"must":[{"term":{"event.category":{"value":"process"
{"terms":{"source_address":["10.0.0.0/8"]
;

cidrMatchFunctionTwo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, add one more test where there are two cidrMatch function calls in the same query: where cidrMatch(whatever) or cidrMatch(another_whatever).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

process where cidrMatch(source_address, "10.0.0.0/8") or cidrMatch(source_address, "192.168.0.0/16")
;
{"bool":{"must":[{"term":{"event.category":{"value":"process"
{"terms":{"source_address":["10.0.0.0/8"],"boost":1.0}},{"terms":{"source_address":["192.168.0.0/16"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it is relevant to have both terms queries in there and the must for event.category, but there has to be a bool should as well.
The query should translate as a bool with two must statements:

  • one is term on event.category
  • the other is a bool with two should statements where terms on the source_address is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aleksmaus aleksmaus merged commit 4b47e84 into elastic:master May 14, 2020
aleksmaus added a commit to aleksmaus/elasticsearch that referenced this pull request May 14, 2020
aleksmaus added a commit that referenced this pull request May 14, 2020
…) (#56735)

EQL: Fix cidrMatch function fails to match when used in scripts (#56246)

Addresses #55709
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/EQL EQL querying >bug Team:QL (Deprecated) Meta label for query languages team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants