Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: use field_caps native nested fields filtering #117201

Merged
merged 20 commits into from
Nov 22, 2024

Conversation

astefan
Copy link
Contributor

@astefan astefan commented Nov 20, 2024

Fixes #117054 by partially reverting #111495 and using -nested as a filter for the _field_caps call.

@astefan astefan added >bug auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v9.0.0 v8.17.0 labels Nov 20, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 20, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @astefan, I've created a changelog YAML for you.

@astefan
Copy link
Contributor Author

astefan commented Nov 21, 2024

@elasticmachine update branch

@elasticmachine
Copy link
Collaborator

There are no new commits on the base branch.

@astefan
Copy link
Contributor Author

astefan commented Nov 21, 2024

@elasticmachine run elasticsearch-ci/bwc-snapshots

@astefan
Copy link
Contributor Author

astefan commented Nov 21, 2024

@elasticmachine update branch

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplifying.

/**
* https://github.com/elastic/elasticsearch/issues/117054
* One index with (Responses.process is nested; process.parent.command_line is supported):
* <pre>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd leave the mapping out of the javadoc, since its already legible in the code.

Comment on lines 1178 to 1181
index("test", """
{"Responses.process.pid": 123,"process.parent.command_line":"run.bat"}""");

Map<String, Object> result = runEsql("FROM test");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Versions w/o this fix fail already at querying an empty index[*]. So a (totally optional) suggestion would be run a FROM test before indexing.

[*] On an 8.16:

java.lang.IllegalStateException: can't find input for [<no-fields>{r$}#26]
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planProject(LocalExecutionPlanner.java:602)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:203)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planExchangeSink(LocalExecutionPlanner.java:321)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:235)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:174)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:446)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the additional test. For the no-fields issue, likely it has something to do with an empty mapping. See #111545 where "empty mapping" has slightly different meanings depending on how the index is created and how field_caps behaves.

Copy link
Contributor

@craigtaverner craigtaverner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - although I'd prefer the test comments to not have duplicates of the test json.

/**
* https://github.com/elastic/elasticsearch/issues/117054
* One index with (Responses.process is nested; process.parent.command_line is supported):
* <pre>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This big comment seems unnecessary because it duplicates the equally readable JSON in the test itself.


ResponseException e = expectThrows(ResponseException.class, () -> runEsql("FROM test | SORT Responses.process.pid"));
String err = EntityUtils.toString(e.getResponse().getEntity());
assertThat(err, containsString("line 1:18: Unknown column [Responses.process.pid]"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an error because ES|QL does not yet support nested queries, and if we did we'd need special syntax for it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect we will indeed need a special syntax for it.

For now I think we should do our best to ignore nested fields. One day we will have to do something with them, but it'll be some kind of nested join syntax or something.

* Here is about the name "process" which is common to the two fields' hierarchies.
*
* test1
* <pre>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these two examples are reproduces in as readable a form in the test below, perhaps they should be removed from the comment, or replaced with a single-line text description instead.

* same.
*
* test1
* <pre>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, nicer to remove these.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though wait for the others to approve too I think. Sorry for making a mess here....

@@ -278,6 +259,8 @@ private static FieldCapabilitiesRequest createFieldCapsRequest(String index, Set
// lenient because we throw our own errors looking at the response e.g. if something was not resolved
// also because this way security doesn't throw authorization exceptions but rather honors ignore_unavailable
req.indicesOptions(FIELD_CAPS_INDICES_OPTIONS);
// we ignore the nested data type fields starting with https://github.com/elastic/elasticsearch/pull/111495
req.filters("-nested");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to add a comment in the PR at this line; this is, indeed, the meat of the fix. I wasn't aware field_caps has -nested, learned yesterday about it. Very useful :-D.

@@ -0,0 +1,6 @@
pr: 117201
summary: "ESQL: use `field_caps` native nested fields filtering"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove prefix.

@astefan
Copy link
Contributor Author

astefan commented Nov 22, 2024

@elasticmachine run elasticsearch-ci/bwc-snapshots

@astefan astefan merged commit f3eb27e into elastic:main Nov 22, 2024
16 checks passed
@astefan astefan deleted the 117054__fix branch November 22, 2024 18:48
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

astefan added a commit to astefan/elasticsearch that referenced this pull request Nov 22, 2024
* Just filter the nested fields natively with field_caps support

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Craig Taverner <craig@amanzi.com>
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Nov 22, 2024
* Just filter the nested fields natively with field_caps support

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Craig Taverner <craig@amanzi.com>
elasticsearchmachine pushed a commit that referenced this pull request Nov 22, 2024
…117375)

* ESQL: use field_caps native nested fields filtering (#117201)

* Just filter the nested fields natively with field_caps support

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Craig Taverner <craig@amanzi.com>

* Add import

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Craig Taverner <craig@amanzi.com>
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
* Just filter the nested fields natively with field_caps support

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Craig Taverner <craig@amanzi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ESQL: a nested field having the same name as a supported one discards the supported field name
6 participants