jordan-powers (Contributor) commented Jul 15, 2025

In #129126, we stopped double-storing match_only_text fields when they are part of a multi-field, instead extracting the value when needed from the appropriate multi-field mapper.

This introduced an edge case related to ignore_above on keyword fields. If the associated multi-field mapper is a keyword mapper, and if that keyword mapper has ignore_above specified, and a document triggers the ignore_above case, then the original value will be stored in <foo>._original instead of <foo>. In this case, the match_only_text mapper needs to look at the <foo>._original stored field.
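To make the edge case concrete, here is a minimal Java sketch of the routing described above. It is not Elasticsearch code: stored fields are modeled as a plain map, `IgnoreAboveRouting` and `index` are hypothetical names, and measuring length with `String.length()` is a simplifying assumption. Only the `._original` suffix comes from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a keyword multi-field with ignore_above; not the real mapper. */
final class IgnoreAboveRouting {

    // Suffix named in the PR description for values that trigger ignore_above.
    static final String ORIGINAL_SUFFIX = "._original";

    /**
     * Simulates storing the values of one document.
     * Values longer than ignoreAbove go to "<field>._original", the rest to "<field>".
     */
    static Map<String, List<Object>> index(String field, int ignoreAbove, List<String> values) {
        Map<String, List<Object>> storedFields = new HashMap<>();
        for (String value : values) {
            // Assumption: length is measured in chars for simplicity.
            String target = value.length() > ignoreAbove ? field + ORIGINAL_SUFFIX : field;
            storedFields.computeIfAbsent(target, k -> new ArrayList<>()).add(value);
        }
        return storedFields;
    }
}
```

In this model, with `ignore_above` set to 3, indexing the single value `"toolong"` leaves no entry under `foo`, so a reader that only looks up `foo` gets `null`. That is consistent with the `"values" is null` NPE in the linked issue.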

Resolves #131298

jordan-powers self-assigned this on Jul 15, 2025
jordan-powers added the >non-issue, auto-backport, :StorageEngine/Mapping, v8.19.0, v9.1.0, and v9.2.0 labels on Jul 15, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

    }
    return storedFieldFetcher(parentField);
} else if (parent.hasDocValues()) {
    var ifd = searchExecutionContext.getForField(parent, MappedFieldType.FielddataOperation.SEARCH);
Contributor

It seems like there can be a similar problem here because doc values won't include ignored values? Not sure if that matters though.

Contributor Author

Shoot, you're right. If the keyword field has store: false and doc_values: true, we don't error, but we completely omit the value from the match_only_text field results.

I'll work on a follow-up PR to address this.
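A rough sketch of why this case fails silently rather than with an error. It is a toy model only: `DocValuesOmission` and `docValues` are hypothetical names, not the Elasticsearch API, and real keyword doc values are also sorted and de-duplicated, which this ignores. The point is that values over the limit never reach doc values, so a reader backed only by doc values simply sees fewer values.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of which values a keyword field keeps in doc values under ignore_above. */
final class DocValuesOmission {

    /** Returns only the values within the limit; ignored values are dropped without any error. */
    static List<String> docValues(List<String> values, int ignoreAbove) {
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            // Assumption: length is measured in chars for simplicity.
            if (value.length() <= ignoreAbove) {
                kept.add(value);
            }
        }
        return kept;
    }
}
```

For a document whose only value exceeds the limit, this returns an empty list, which mirrors the "no error, value omitted" behavior described in the comment.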

jordan-powers merged commit 6f8be9c into elastic:main on Jul 15, 2025
33 checks passed
jordan-powers added a commit to jordan-powers/elasticsearch that referenced this pull request Jul 15, 2025
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 131314

@jordan-powers
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

jordan-powers added a commit that referenced this pull request Jul 15, 2025
…) (#131338)

(cherry picked from commit 6f8be9c)

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/mapper/KeywordFieldMapper.java
    if (names.length == 1) {
        return storedFields.get(names[0]);
    }
    return Arrays.stream(names).map(storedFields::get).filter(Objects::nonNull).flatMap(List::stream).toList();
Contributor

Nit: streams are not very efficient, it's better to avoid them for code that runs per document.
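For illustration, a loop-based equivalent of the stream expression quoted above might look like the following. The class name, method name, and the `Map<String, List<Object>>` shape of `storedFields` are assumptions made to keep the sketch self-contained; the actual code in the PR may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: gathers stored values for several field names without streams. */
final class StoredValuesLoop {

    static List<Object> collect(Map<String, List<Object>> storedFields, String[] names) {
        if (names.length == 1) {
            // Same as the quoted snippet: may return null if the field is absent.
            return storedFields.get(names[0]);
        }
        List<Object> result = new ArrayList<>();
        for (String name : names) {
            List<Object> values = storedFields.get(name);
            if (values != null) { // replaces filter(Objects::nonNull)
                result.addAll(values); // replaces flatMap(List::stream)
            }
        }
        return result;
    }
}
```

One behavioral difference to be aware of: `Stream.toList()` returns an unmodifiable list, while this sketch returns a mutable `ArrayList`.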

jordan-powers added a commit that referenced this pull request Jul 17, 2025
In #131314 we fixed match_only_text fields with ignore_above keyword
multi-fields in the case that the keyword multi-field is stored. However,
the issue is still present if the keyword field is not stored, but instead
has doc values.

This patch fixes that case.
Labels

auto-backport, >non-issue, :StorageEngine/Mapping, Team:StorageEngine, v8.19.0, v9.1.0, v9.2.0

Development

Successfully merging this pull request may close these issues.

NPE on search query phase Cannot invoke "java.util.List.iterator()" because "values" is null

5 participants