Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing users to specify a delete policy based on amount of storage used within the ES cluster #564

Conversation

ewolinetz
Copy link
Contributor

Currently there is a gap in that the only way to specify a delete policy is based on the age of the logs which can make it difficult for capacity planning and cost for customers when they need to use more storage than they anticipated. This enhancement seeks to outline a mechanism to let them specify their retention based on a disk usage within the cluster instead.

@ewolinetz
Copy link
Contributor Author

/cc @alanconway @jcantrill

//
MinAge TimeUnit `json:"minAge, optional"`

// The percent disk usage to wait for before deleting indices for this alias (e.g. 75)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is more convenient here - percentage or absolute size? I'm not sure of the answer.
Could support both by treating "75" or "75%" as a percent, and "75G" as a size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that the request was for a percentage and I think that would actually make it more simple for users. If we wanted to support a static size I would suggest we have a field that is specific for that -- like a k8s resource request

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this more explicit and telling what it represents MaxUsagePercent or so that it is in agreement with age should it be MinUsagePercent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is "PercentUsage" be explicit and clear enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel theres a need to say "Min" or "Max"... we have a way for the user to tell us this is how much percent usage they want to allow. So more than that and we curate, less than that and we don't trigger.

//
MinAge TimeUnit `json:"minAge, optional"`

// The percent disk usage to wait for before deleting indices for this alias (e.g. 75)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this more explicit and telling what it represents MaxUsagePercent or so that it is in agreement with age should it be MinUsagePercent?


Currently this is not possible in the OpenDistro Index Management plugin and causes us to deviate from what they offer. We would need to ensure that it is added by us prior to us moving to ES7 (and adopting their plugins).

The naming convention "MinAge" when paired with "Usage" could give a false impression -- we may end up deleting indices before they reach the "MinAge" because they are the oldest indices and with "Usage" being reached. This may just need to be framed differently within the docs and field descriptions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can frame this in the same context as the ES docs for the criteria by which they rollover an index: its when any of the 3 possible conditions is met

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's what I convey in ##Design Details and more calling out here that we should make sure it is clear in the docs.

@jcantrill
Copy link
Contributor

/approve
/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2021
@ewolinetz ewolinetz force-pushed the elasticsearch-percentage-based-index-management branch from 7e27b7d to 7011faa Compare January 18, 2021 23:57
@jcantrill
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 16, 2021
@ewolinetz
Copy link
Contributor Author

/refresh

@openshift-bot
Copy link

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 15, 2021
@ewolinetz
Copy link
Contributor Author

/lifecycle frozen

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 15, 2021

@ewolinetz: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

/lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ewolinetz
Copy link
Contributor Author

/refresh

@ewolinetz
Copy link
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2021
@ewolinetz
Copy link
Contributor Author

/test markdownlint

@ewolinetz ewolinetz force-pushed the elasticsearch-percentage-based-index-management branch from 7011faa to 23e0186 Compare June 17, 2021 18:50
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 17, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

… delete policy based on amount of storage used within the cluster
@ewolinetz ewolinetz force-pushed the elasticsearch-percentage-based-index-management branch from 23e0186 to 9ada3a5 Compare June 17, 2021 19:01
@ewolinetz
Copy link
Contributor Author

/refresh

Copy link
Contributor

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 18, 2021
@openshift-merge-robot openshift-merge-robot merged commit f2f71a2 into openshift:master Jun 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants