Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSDB: Downsampling copies all settings from source to rollup index #88565

Merged
merged 5 commits into from
Jul 25, 2022

Conversation

csoulios
Copy link
Contributor

This PR modified the rollup action so that all index settings are copied from the source index to the rollup index.

Relates to #85708

@csoulios csoulios added >non-issue :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data :StorageEngine/TSDB You know, for Metrics v8.4.0 labels Jul 15, 2022
@csoulios csoulios requested a review from dakrone July 15, 2022 11:41
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 15, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I left a couple of comments

Comment on lines +466 to +467
targetSettings.put(IndexMetadata.INDEX_ROLLUP_SOURCE_NAME.getKey(), sourceIndex.getName())
.put(IndexMetadata.INDEX_ROLLUP_SOURCE_UUID.getKey(), sourceIndex.getUUID());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already have a test for doing a rollup-of-a-rollup where the original index is maintained in the rollup source/uuid? If not, could you add one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now rollup-of-rollups does not fully work. I plan to add support for rollup-of-rollup in a following PR.
I will definitely add tests in that PR

This part is only plumbing that is going to be needed for rollups of rollups

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now rollup-of-rollups does not fully work.

As some subsequent work (not in this PR), does it make sense to reject the request if someone tries to rollup an index that's already been rolled up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In data streams this cannot happen because no two indices for the same period can co-exist.

When it comes to concrete indices, there are very valid use cases for allowing multiple rollup indices of the same source index. For example having multiple rollup intervals (hourlies, dailies etc) or having multiple timezones (right now we support only UTC, but tin the future we should support more timezones).

Comment on lines +250 to +253
// Number of replicas had been previously set to 0 to speed up index population
if (sourceIndexMetadata.getNumberOfReplicas() > 0) {
settings.put(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, sourceIndexMetadata.getNumberOfReplicas());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this piece, why do we copy replicas only if they are greater than 0? How would this behave if someone had something like index.auto_expand_replicas: 0-2 where the number of replicas is variable? Shouldn't we be consistent and maintain the number of replicas as defined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind this is that when we create the rollup index we explicitly set that num_of_replicas=0. We do so, because we don't want the index to replicate while we populate the index.

When all documents have been written in the index, we set the num_of_replicas equal to the num_of_replicas of the source index. Of course, if num_of_replicas of the source index is 0, we don't have to do anything, hence the conditional.

@csoulios csoulios requested a review from dakrone July 21, 2022 18:53
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Comment on lines +466 to +467
targetSettings.put(IndexMetadata.INDEX_ROLLUP_SOURCE_NAME.getKey(), sourceIndex.getName())
.put(IndexMetadata.INDEX_ROLLUP_SOURCE_UUID.getKey(), sourceIndex.getUUID());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now rollup-of-rollups does not fully work.

As some subsequent work (not in this PR), does it make sense to reject the request if someone tries to rollup an index that's already been rolled up?

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:04
@csoulios
Copy link
Contributor Author

Thanks a lot for the review!

@csoulios csoulios merged commit b3cc4ef into elastic:main Jul 25, 2022
@csoulios csoulios deleted the rollup-copy-settings branch July 25, 2022 09:28
csoulios added a commit to csoulios/elasticsearch that referenced this pull request Jul 25, 2022
After merging PR elastic#88565, that delegates this task to the actual
rollup step
csoulios added a commit that referenced this pull request Jul 26, 2022
…#88812)

PR #88565 modified the rollup action so that it copies all index settings from the source 
to the rollup index. However, the rollup action expilitly requires that the source index has
 the index.block.write: true.

Since the rollup index must be writeable when it is created, this PR excludes the index.block.write i
ndex setting when creating the rollup index.

The index.block.write index setting is explictly set to true at the end of the rollup action (step 4).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants