Expose timestamp field type on coordinator node #65873
Today a coordinating node does not have (easy) access to the mappings for the indices for the searches it wishes to coordinate. This means it can't properly interpret a timestamp range filter in a query and must involve a copy of every shard in at least the `can_match` phase. It therefore cannot cope with cases when shards are temporarily not started even if those shards are irrelevant to the search. This commit captures the mapping of the `@timestamp` field for indices which expose a timestamp range in their index metadata.
Pinging @elastic/es-distributed (Team:Distributed)
@Nullable
public DateFieldMapper.DateFieldType getTimestampFieldType(Index index) {
    final PlainActionFuture<DateFieldMapper.DateFieldType> future = fieldTypesByIndex.get(index);
    if (future == null || future.isDone() == false) {
There remains a question of whether we should block here or not (and if so, for how long).
On reflection I think we shouldn't block. Returning `null` sooner will allow the search coordination to proceed normally, ignoring any timestamp filter and deferring any skipping to the individual shards. This means we'll see shard failures if the coordinating node falls behind on extracting these mappings AND some of the shards are unassigned, which is hopefully rare.
As a follow-up we could in theory add another, more patient getter to support a workflow that goes:
- we call `getTimestampFieldType`, which returns `null`
- some shards are unavailable for the `can_match` phase
- we call `getTimestampFieldTypePatiently` to see whether those shard failures can be ignored or not
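The non-blocking behaviour discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `TimestampFieldTypeCache` and `register` are hypothetical names, the index is keyed by a plain `String`, and `CompletableFuture` from the JDK stands in for `PlainActionFuture`.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the service under review; not Elasticsearch code.
final class TimestampFieldTypeCache<T> {
    // One future per index; it completes once the mapping has been parsed.
    private final Map<String, CompletableFuture<T>> fieldTypesByIndex = new ConcurrentHashMap<>();

    // Registers an index whose mapping is still being parsed (assumed helper).
    CompletableFuture<T> register(String index) {
        return fieldTypesByIndex.computeIfAbsent(index, k -> new CompletableFuture<>());
    }

    // Non-blocking getter: returns null when the index is unknown, the
    // mapping is not ready yet, or parsing failed. It never waits.
    T getTimestampFieldType(String index) {
        final CompletableFuture<T> future = fieldTypesByIndex.get(index);
        if (future == null || future.isDone() == false || future.isCompletedExceptionally()) {
            return null;
        }
        return future.getNow(null);
    }
}
```

A caller that receives `null` would simply coordinate the search as before, leaving any skipping to the individual shards.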
I think it makes sense to return `null` and proceed regularly if the mapping isn't available yet. This should be rare enough not to cause too much trouble.
I guess the most problematic scenario is when a node joins and has to parse a lot of mappings, right?
Right, although in that case there's no particular reason to expect shards to be unavailable.
LGTM, just some minor comments in the test 👍
...rozen-indices/src/internalClusterTest/java/org/elasticsearch/index/engine/FrozenIndexIT.java
    timestampFieldTypeFuture.onResponse(timestampFieldType);
});
assertTrue(timestampFieldTypeFuture.isDone());
assertThat(timestampFieldTypeFuture.get().dateTimeFormatter().locale().toString(), equalTo(locale));
Maybe we can add an assertion that checks that `DateFieldMapper.DateFieldType#parse` works with the original timestamp string?
Ok let me try and remember the month names in French to give this assertion some teeth 🇫🇷
Done in 79ef7b0. Remembering the names wasn't the hard bit, it was working out that in French we write month names lower-case, with a trailing `.`, and sometimes use more than 3 letters.
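To illustrate why the locale matters here, the following is a small sketch using only `java.time`, not the `DateFieldMapper` code or the test from this PR. The class name `FrenchDateRoundTrip` and the pattern `d MMM uuuu` are assumptions chosen for the example. The exact abbreviations produced can vary with the JDK version and its locale data, so the sketch checks a round trip instead of hard-coding month strings.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class FrenchDateRoundTrip {
    // Abbreviated month names are locale-specific, so the same formatter
    // (and therefore the same locale) must be used to format and to parse.
    static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("d MMM uuuu", Locale.FRENCH);

    static String format(LocalDate date) {
        return FORMATTER.format(date);
    }

    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMATTER);
    }
}
```

Parsing a string produced by `format` with a formatter built for a different locale would generally fail, which is why the coordinating node needs the field's own formatter and not a default one.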
@elasticmachine please run elasticsearch-ci/bwc
@elasticmachine please run elasticsearch-ci/2
The backport failed in CI: https://gradle-enterprise.elastic.co/s/s3kleye5lwxds/console-log?task=:x-pack:plugin:frozen-indices:internalClusterTest No idea why yet but I've reverted it from |
This reverts commit a0e5f9b.
Backport of #65873 to 7.x
I reinstated the backport with a few JDK8-specific tweaks in #65925.