Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For empty mappings use a LocalRelation #105081

Merged

Conversation

astefan
Copy link
Contributor

@astefan astefan commented Feb 2, 2024

Fixes #104809 by converting a plan to a local relation when there is no mapping for the index pattern.

@elasticsearchmachine
Copy link
Collaborator

Hi @astefan, I've created a changelog YAML for you.

@astefan astefan requested a review from bpintea February 2, 2024 23:38
@astefan astefan marked this pull request as ready for review February 2, 2024 23:38
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 2, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@astefan
Copy link
Contributor Author

astefan commented Feb 2, 2024

@elasticsearchmachine run elasticsearch-ci/part-1

@astefan
Copy link
Contributor Author

astefan commented Feb 3, 2024

@elasticmachine run elasticsearch-ci/part-1

@astefan
Copy link
Contributor Author

astefan commented Feb 5, 2024

@elasticmachine update branch

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +86 to +87
// marker list of attributes for plans that do not have any concrete fields to return, but have other computed columns to return
// ie from test | stats c = count(*)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -154,7 +154,8 @@ protected static List<Batch<LogicalPlan>> rules() {
new SubstituteSurrogates(),
new ReplaceRegexMatch(),
new ReplaceNestedExpressionWithEval(),
new ReplaceAliasingEvalWithProject()
new ReplaceAliasingEvalWithProject(),
new SkipQueryOnEmptyMappings()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not part of the skip batch? It'd be worth a comment if there's a reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this can be determine upfront, which facilitates the rest of the rules and folding of LocalRelation through-out the plan.
Skip is done at the end for cases where a normal relationship is changed due to filtering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does make the other rules do less work, yes, but since there's a ONCE limiter, it might then have made more sense as a first rule. Anyways, minor.

@@ -139,7 +140,7 @@ public void init() {

private Analyzer makeAnalyzer(String mappingFileName, EnrichResolution enrichResolution) {
var mapping = loadMapping(mappingFileName);
EsIndex test = new EsIndex("test", mapping);
EsIndex test = new EsIndex("test", mapping, Set.of("test"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all the similar changes to add a concrete index, wondering if EsIndex(String name, Map<String, EsField> mapping) c'tor, which is only used in tests, shouldn't be updated instead to add a set with the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about this. I know it's not a big deal, but if something is used only in tests, the method/constructor access modifier should reflect this. But this class is used in many different packages and the only possible access modifier is public. I would rather have the usage be explicit and the user of the constructor be aware of the implications of providing an empty Set or specific values in that Set, rather than use the constructor without being aware of its content.

@dnhatn
Copy link
Member

dnhatn commented Feb 5, 2024

Hello Andrei, I think this PR handles ESQL with an empty mapping rather than fixing #104809, where field-caps were reading from a not up-to-date shard. I have addressed that field-caps issue in #105153.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@astefan
Copy link
Contributor Author

astefan commented Feb 6, 2024

@dnhatn

Hello Andrei, I think this PR handles ESQL with an empty mapping rather than fixing #104809, where field-caps were reading from a not up-to-date shard. I have addressed that field-caps issue in #105153.

That's interesting, I wasn't aware of this scenario. Thank you for the awareness ping.
Nevertheless, the CI test caught an interesting edge case with how we deal with mappings coming from the IndexResolver; note that it is ok for the concreteIndices to be empty. This happens in two situations:

  • a query has concrete fields in it (from test | where age > 18) but those fields cannot be found in the mappings (a typo maybe). In this case the Analyzer handles this scenario and returns an error message about a missing field
  • there are no concrete fields in the query and still the concreteIndices is empty (because of an empty mapping). The Analyzer allows the query to pass , since there are no fields to check for existence. The correct approach is for the planner to skip creating an unnecessary query to Elasticsearch by marking the entire plan as a local plan. Which is correct: no mappings, no data, no need to go down to Lucene.

@astefan astefan added the auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Feb 6, 2024
@elasticsearchmachine elasticsearchmachine merged commit bfa21b5 into elastic:main Feb 6, 2024
15 checks passed
@astefan astefan deleted the local_execution_on_empty_mapping branch February 6, 2024 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] EsqlClientYamlIT test {p0=esql/30_types/_source} failing
6 participants