Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSDB: Initial reindex fix #86647

Merged
merged 5 commits into from
May 11, 2022
Merged

TSDB: Initial reindex fix #86647

merged 5 commits into from
May 11, 2022

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented May 10, 2022

This teaches reindex the smallest thing that it needs to know about
tsdb: _id is automatically generated. Armed with that knowledge
reindex now doesn't attempt to copy the _id when writing to a tsdb
index.

Important: If the index doesn't yet exist it will assume that the
index will be created in standard mode. We can detect what mode it
should be created with in a follow up change.

It turns out that there is a fairly simple recipe for reindexing to a
`time_series` index:
1. If you are reindexing from a time series index to a time series index
   and *not* changing the `@timestamp` or dimensions it "just
   works"(TM).
2. If you are reindexing from a standard index with a standard random
   `_id` you should clear it on reindex.
3. If you are reindexing from tsdb index to a tsdb index and modifying a
   dimension or `@timestamp` then you should clear the `_id`.

This is not pleasant to have to remember. But it doesn't crash!
@nik9000 nik9000 added >feature :Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down :StorageEngine/TSDB You know, for Metrics v8.3.0 labels May 10, 2022
@elasticmachine elasticmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. labels May 10, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label May 10, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/clients-team (Team:Clients)

nik9000 added 2 commits May 10, 2022 17:14
This teaches reindex the smallest thing that it needs to know about
tsdb: `_id` is automatically generated. Armed with that knowledge
reindex now doesn't attempt to copy the `_id` when writing to a tsdb
index.

Important: If the index doesn't yet exist it will *assume* that the
index will be created in `standard` mode. We can detect what mode it
*should* be created with in a follow up change.
@nik9000 nik9000 removed the :Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down label May 10, 2022
@elasticmachine elasticmachine removed the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 10, 2022
@nik9000 nik9000 removed the Team:Clients Meta label for clients team label May 10, 2022
@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label May 10, 2022
Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -285,7 +300,7 @@ protected RequestWrapper<IndexRequest> buildRequest(ScrollableHitSource.Hit doc)
}

// id and source always come from the found doc. Scripts can change them but they operate on the index request.
index.id(doc.getId());
index.id(destinationIndexIdMapper.reindexId(doc.getId()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will probably also speed reindex up

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. It's a noop for standard indices. For tsdb indices it saves us some bits on the wire and a comparison. We still have to regenerate the id.

@@ -0,0 +1,5 @@
pr: 86647
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be mentioned here? I don't think this changes anything with the tsdb feature flag in place? If so then we should label this PR also as non-issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly generated thing....

@nik9000
Copy link
Member Author

nik9000 commented May 11, 2022

run elasticsearch-ci/packaging-tests-windows-sample

@nik9000 nik9000 merged commit 48a601b into elastic:master May 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Clients Meta label for clients team v8.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants