Enable TSDB downsampling ILM configuration #130437
Comments
Pinging @elastic/platform-deployment-management (Team:Deployment Management) |
@csoulios Do we have an existing link for implementing the rollup action on the API side for policy creation? Also, we are assuming that the validation on |
@yuliacech brought up some great points based on her work on the discontinued rollup v2 project. I will add some of her questions here, which should be addressed:
Dependent or related actions
@csoulios Do you know whether these same concerns about removing or limiting existing actions in other phases would need to be addressed in the policy configuration UI when adding a rollup action? On a related note, I know that we require read-only to be in place. Does a read-only action need to be part of the policy phase when adding a rollup action as well?
Index Management updates
@yuliacech also pointed out, as highlighted in this issue from rollups v2, that there may be a benefit to highlighting in Index Management those indices that are being downsampled with a rollup action in ILM. @csoulios or @wchaparro, is the Index Management UI in scope for this work? |
Right now we are implementing the The Rollup ILM action is coming next. I will update this issue with the PR when I submit it.
That's right. There will be validation on the Elasticsearch side that throws an error for invalid values. However, it would be more user-friendly if the UI validated the interval before sending it to ES. |
I don't think we should remove any of the ILM actions from the following phases. It doesn't make much sense to have the ReadOnly action, since the rollup index is already read-only, so the ReadOnly action would effectively do nothing.
No, it doesn't have to be part of the policy. It will be an implicit step in the DownsampleAction |
I pushed the PR that implements the Downsampling ILM Action. Adding it here for reference: elastic/elasticsearch#87269 cc @ghudgins |
Pinging @elastic/kibana-app-services (Team:AppServicesUx) |
Fyi, PR elastic/elasticsearch#87269 that implements the Rollup ILM action has been merged in both |
I started working on this, made some progress, and explored the ILM code. I can answer some open questions and also have some further related questions:
Open questions from this issue:
Q: Is there an existing UX for defining appropriate fixed intervals?
Looks like ILM currently has two different UIs for this. Seems like this is a tech debt to unify those:
Another is for shard size in Shrink action:
I reused the second one because the usage is very similar.
Q: Needs UX design/mockup
Don't think we need those because the UI pattern is established, and we already have the interval component. But need help with text:
(cc @alexfrancoeur: @vadimkibana mentioned you were going to look into the design)
Q: When the policy is configured is it required for the policy to include a read-only action for the hot phase at configuration time or is the read-only requirement validated only at policy application time?
As I understood from the discussion, we shouldn't add any conditions in the UI for read-only actions based on rollup actions: #130437 (comment). Did I get it right? Please correct me if not.
Q: How challenging will it be to validate intervals in the UI? Need to assess options for ensuring intervals are valid based on actions in previous phases.
We will add UI validation using the existing pattern. There is an example with min age, where min age is validated between phases.
NOTE: Currently, Elasticsearch doesn't validate the interval when saving a policy. It doesn't validate interval constraints between phases, and it doesn't validate whether the value is a valid fixed_interval expression. @csoulios, do you know if this will be improved? Is there an issue?
Q: Is Index Management screen updates part of this issue?
I understand this should be separate and not covered by this work. Mainly because this needs more clarifications:
If this is not acceptable, please call this out, and we should clarify what is missing |
no one asked me but here are a few comments if it helps!
Only other question from these screenshots @Dosant - is there a specific phase we need to show for the "buffer period" where we allow late arriving docs? Is that in this design yet? |
@debadair Can you please review the text that Graham suggested? |
@ghudgins I agree with your assessment, except I have questions about using the term "Rollup". We already support Rollup Jobs that create rollup indices. Does the ILM downsampling action produce the exact same type of rollup indices that are produced by Rollup Jobs? |
@ghudgins, thanks!
I am adding UI validation as described in the issue. It is just that currently there is no validation on the ES side when creating a policy.
Not sure what you mean by "buffer period", this wasn't mentioned in the description and I didn't see this as part of the API in the es docs that I've reviewed. This is what I saw on the subject here: https://github.com/elastic/elasticsearch-adrs/blob/master/analytics/tsdb/tsdb-rollups-design.md
As I understand they are quite different |
@cjcenizal - yes, they are different. However, we will eventually deprecate the entire rollup system in favor of ILM-supported rollups. Happy to keep the terminology separate if that makes it easier, but they are logically the same thing, and we intentionally did this version of rollups instead of the job-based one after the findings of doing v1. |
Thanks Anton and Graham. I suggest we create terms that enable users to easily differentiate the indices created by the downsample action from the indices created by rollup jobs. This is similar to the decision by the ES Data Management team to differentiate legacy index templates from composable index templates when they introduced the latter in 7.8. Here are a couple examples of what I have in mind:
Once we've landed on the right terms we can teach users about the differences between the two by using these terms consistently in our UI and docs. |
At the time the decision was to use different terminology because functionally neither one is a superset of the other. I'd rather not change that decision at this stage unless someone feels strongly about it. |
I retract my suggestion 😄 and will edit the above comment! |
In general, we try to mirror the terminology used in the API, but the potential confusion with previous rollups is definitely a concern. With some wordsmithing, I think we can bridge the terminology and dodge the potential confusion. This is more words than the other action descriptions, but it might be worth it:
That gives the Rollup interval label for the setting some context, and connects the dots between downsampling and rollups. One thing to note: We don't (yet) have a good destination for a Learn more link. I'll work with the folks on the ES side to fix that. |
I will try to address as many open questions as possible.
No, it is not required for the policy to include a read-only action. The index is implicitly set in read-only mode by the rollup action.
Buffer period is the period after the index has been rolled over and before it is downsampled. For example, if we roll over an index after 1 day and downsample it after 3 days, the buffer period (when it can accept late arrivals) is 2 days. I don't think we should enforce any validation on this, as it is entirely up to the user to time the transitions of the index.
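The arithmetic in that example can be sketched as follows. This is only an illustration: `bufferPeriodDays` is a hypothetical helper, not part of ILM or Kibana, and it assumes both ages are expressed in days and measured from the same starting point.

```typescript
// Hypothetical helper for illustration only (not an ILM or Kibana API).
// Assumes both ages are in days and counted from the same starting point.
function bufferPeriodDays(rolloverAfterDays: number, downsampleAfterDays: number): number {
  if (downsampleAfterDays < rolloverAfterDays) {
    // Downsampling before rollover would leave no window for late arrivals.
    throw new RangeError("downsampling is scheduled before rollover");
  }
  return downsampleAfterDays - rolloverAfterDays;
}

// Rollover after 1 day, downsample after 3 days: late arrivals are accepted for 2 days.
const bufferDays = bufferPeriodDays(1, 3);
```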
The format of indices produced by ILM downsampling is totally different from the indices produced from rollup jobs.
Generally speaking, in the time-series data world, downsampling is a subset of the rollup functionality (summarizing data only by changing the time interval). The current release of rollups will support only the downsampling functionality. However, later we may choose to add support for more rollup features (such as dimension reduction). If we name this feature "downsampling" now, I am afraid that later we will have to rename it so that it correctly describes the supported functionality (or keep the name "downsampling" while adding more capabilities to it). The experimental rollup functionality (Rollup Jobs) is soon going to be replaced by the new downsampling/rollup feature. I would say that we have to be explicit that this is the "new rollups". So, if I had to pick a name for it, it would be "Rollups for time series data/indices" |
Continue working on this in: #138748
This is how the labels look now, looking for feedback: Does the interval label make sense (Downsampling interval)? Or should it be named somehow differently?
|
TSDB Downsampling ILM Configuration
Stakeholders
Purpose of project and known requirements
As part of the TSDB project, we want to enable automatic downsampling of time series data via ILM. Downsampling will be provided as an ILM action. Configuration will be kept simple: dimensions and metrics are extracted from the index mapping, so the only information required from the user is the time interval. This project seeks to modify the existing ILM UX to enable configuring and modifying the downsampling rollup action for read-only time series indices in the hot, warm, and cold phases.
Resources
Tasks
Technical analysis
Data flow
As described in the ILM configuration section of the downsampling design, we will add the ability to configure an ILM rollup action in the hot, warm, and cold phases.
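A rough sketch of what such a policy could look like from the UI's point of view. The shapes below are assumptions for illustration: the action is assumed to be exposed under the key `downsample` with a `fixed_interval` field (this issue mostly calls it the rollup action), and the `PhaseSketch` type, the `min_age` values, and the rollover setting are hypothetical.

```typescript
// Illustrative types only; not the Kibana or Elasticsearch type definitions.
interface DownsampleAction {
  fixed_interval: string; // e.g. "1h"
}

interface PhaseSketch {
  min_age?: string;
  actions: {
    rollover?: { max_age: string };
    downsample?: DownsampleAction; // assumed action key
  };
}

// One downsampling action per phase, each with its own fixed interval.
const policySketch: { phases: Record<"hot" | "warm" | "cold", PhaseSketch> } = {
  phases: {
    hot: { actions: { rollover: { max_age: "1d" }, downsample: { fixed_interval: "1h" } } },
    warm: { min_age: "7d", actions: { downsample: { fixed_interval: "3h" } } },
    cold: { min_age: "30d", actions: { downsample: { fixed_interval: "6h" } } },
  },
};

// Collect the configured intervals in phase order, e.g. for display or validation.
const configuredIntervals = (["hot", "warm", "cold"] as const).map(
  (phase) => policySketch.phases[phase].actions.downsample?.fixed_interval
);
```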
As part of the rollup action, a fixed interval must be configured, which is the interval to which the data will be rolled up. The interval must use the same notation as the date_histogram aggregation.
For example:
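A minimal sketch of that notation, assuming the fixed-interval units documented for date_histogram (ms, s, m, h, d): values such as `30m`, `1h`, or `1d` are valid, while calendar units such as `1w` are not. The helper name `parseFixedInterval` is hypothetical and only shows how a UI might check an expression before sending it to ES.

```typescript
// Fixed-interval units and their length in milliseconds.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Hypothetical helper: returns the interval in milliseconds, or null when the
// expression is not of the form <positive integer><unit>.
function parseFixedInterval(expr: string): number | null {
  const match = /^([1-9]\d*)(ms|s|m|h|d)$/.exec(expr.trim());
  if (match === null) {
    return null;
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

const oneHour = parseFixedInterval("1h"); // 3600000
const oneWeek = parseFixedInterval("1w"); // null: "w" is a calendar unit, not a fixed one
```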
Rollup of rollups
As described in the downsampling design doc section for rollup of rollups, the rollup action intervals must adhere to some limitations: each interval must be greater than the interval of any previous rollup action and must be a multiple of it.
For example, if the user sets a 3 hour interval on the warm phase and then adds a rollup action on the cold phase, the cold phase interval must be a multiple of 3h. It cannot be 5h, for example.
Another example: a user can have a rollup action with a 2d interval on the hot phase, no rollup action on the warm phase, and a rollup action on the cold phase. The cold phase interval must then be a multiple of 2d; it cannot be 1d or 3d.
Note that if this is too expensive in the UI, it can also be validated at policy creation time, but ideally it would be supported in the ILM configuration UI.
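The constraint above can be sketched as a small check over the phases in order. This is an illustration under assumptions, not the Kibana implementation: `validateRollupIntervals` and `toMillis` are hypothetical names, and only the fixed-interval units ms, s, m, h, and d are handled.

```typescript
type Phase = "hot" | "warm" | "cold";
const PHASE_ORDER: Phase[] = ["hot", "warm", "cold"];
const MS_PER_UNIT: Record<string, number> = { ms: 1, s: 1000, m: 60000, h: 3600000, d: 86400000 };

// Hypothetical parser: "<positive integer><unit>" to milliseconds, or null if invalid.
function toMillis(expr: string): number | null {
  const match = /^([1-9]\d*)(ms|s|m|h|d)$/.exec(expr.trim());
  return match === null ? null : Number(match[1]) * MS_PER_UNIT[match[2]];
}

// Returns one message per violated constraint; an empty array means the intervals are valid.
function validateRollupIntervals(intervals: Partial<Record<Phase, string>>): string[] {
  const errors: string[] = [];
  let previous: { phase: Phase; expr: string; ms: number } | undefined;
  for (const phase of PHASE_ORDER) {
    const expr = intervals[phase];
    if (expr === undefined) continue; // no rollup action in this phase
    const ms = toMillis(expr);
    if (ms === null) {
      errors.push(`${phase}: "${expr}" is not a valid fixed interval`);
      continue;
    }
    // Compare against the closest earlier phase that has a rollup action.
    if (previous !== undefined && (ms <= previous.ms || ms % previous.ms !== 0)) {
      errors.push(
        `${phase}: "${expr}" must be greater than and a multiple of the ${previous.phase} interval "${previous.expr}"`
      );
    }
    previous = { phase, expr, ms };
  }
  return errors;
}

const warmThenCold = validateRollupIntervals({ warm: "3h", cold: "5h" }); // one error: 5h is not a multiple of 3h
const skipWarm = validateRollupIntervals({ hot: "2d", cold: "4d" }); // no errors
```

Skipping phases without a rollup action means the cold phase is compared against the hot phase when the warm phase has none, which matches the second example above.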
Overview of changes
Note that during modification the fixed interval can be changed to any value and does not need to be validated against or limited by the existing fixed interval on the policy.
Open questions
Is there an existing UX for defining appropriate fixed intervals? See #130437 (comment).
Needs UX design/mockup. See #130437 (comment).
How challenging will it be to validate intervals in the UI? Need to assess options for ensuring intervals are valid based on actions in previous phases. See #130437 (comment).