
STOR-2040: CLI command to display bound pvc filesystem usage percentage #1854

Open · wants to merge 1 commit into base: master

Conversation

gmeghnag

The code uses a Prometheus query to implement oc adm top persistentvolumeclaims and show usage statistics for bound PersistentVolumeClaims, as follows:

oc adm top pvc --insecure-skip-tls-verify=true -n reproducer-pvc
NAMESPACE      NAME               USAGE(%) 
reproducer-pvc pvc-reproducer-pvc 98.28    
reproducer-pvc pvc-test-pvc       14.56    

It supports the following flags:

    -A, --all-namespaces=false:
	If present, list the pvc usage across all namespaces. Namespace in current context is ignored even if
	specified with --namespace

    --insecure-skip-tls-verify=false:
	If true, the server's certificate will not be checked for validity. This will make your HTTPS connections
	insecure
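The PromQL query itself is not shown in this excerpt. As a sketch, the usage percentage can be derived from the standard kubelet volume metrics kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes; buildUsageQuery below is a hypothetical helper, and the query the PR actually sends may differ:

```go
package main

import (
	"fmt"
	"net/url"
)

// buildUsageQuery sketches a PromQL expression for PVC usage percentage,
// optionally restricted to one namespace. Hypothetical helper, not PR code.
func buildUsageQuery(namespace string) string {
	if namespace == "" {
		return "100 * kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes"
	}
	return fmt.Sprintf("100 * kubelet_volume_stats_used_bytes{namespace=%q} / kubelet_volume_stats_capacity_bytes{namespace=%q}", namespace, namespace)
}

func main() {
	// Encode the query the way Prometheus' HTTP API expects (?query=...),
	// matching the url.Values usage quoted later in this review.
	params := url.Values{}
	params.Set("query", buildUsageQuery("reproducer-pvc"))
	fmt.Println(params.Encode())
}
```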

@openshift-ci-robot

openshift-ci-robot commented Aug 22, 2024

@gmeghnag: This pull request references STOR-2040, which is a valid Jira issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 22, 2024
Contributor

openshift-ci bot commented Aug 22, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gmeghnag
Once this PR has been reviewed and has the lgtm label, please assign ardaguclu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

openshift-ci bot commented Aug 23, 2024

@gmeghnag: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ardaguclu
Member

Thank you for spending time on this nice and useful feature. But it requires a KEP first, so we can sync on the feature and its implementation.

@gmeghnag
Author

Hey @ardaguclu, what is a KEP? And how to create one? Thanks :)

@ardaguclu
Member

Hey @ardaguclu, what is a KEP? And how to create one? Thanks :)

I think you'd write an enhancement proposal under https://github.com/openshift/enhancements/tree/master/enhancements/oc and we'll discuss the design and align on it. Once it merges, we can return to this PR.

Member

@dobsonj left a comment


Thanks a lot @gmeghnag :) This is nice work, I just have a few suggestions.

Comment on lines +203 to +207
urlParams := url.Values{}
urlParams.Set("query", query)
uri.RawQuery = urlParams.Encode()

persistentVolumeClaimsBytes, err := getWithBearer(ctx, getRoute, "openshift-monitoring", "prometheus-k8s", uri, bearerToken, insecureTLS)
Member


I get this error when using system:admin KUBECONFIG, but not for the other oc adm top subcommands:

$ ./oc adm top pvc
error: failed to get persistentvolumeclaims from Prometheus: no token is currently in use for this session

Is it possible to avoid this error by using the metrics API interface instead of constructing a raw request w/ bearer token?

func getMetricsFromMetricsAPI(metricsClient metricsclientset.Interface, namespace, resourceName string, allNamespaces bool, labelSelector labels.Selector, fieldSelector fields.Selector) (*metricsapi.PodMetricsList, error) {
	var err error
	ns := metav1.NamespaceAll
	if !allNamespaces {
		ns = namespace
	}
	versionedMetrics := &metricsv1beta1api.PodMetricsList{}
	if resourceName != "" {
		m, err := metricsClient.MetricsV1beta1().PodMetricses(ns).Get(context.TODO(), resourceName, metav1.GetOptions{})
		if err != nil {
			return nil, err
		}
		versionedMetrics.Items = []metricsv1beta1api.PodMetrics{*m}
	} else {
		versionedMetrics, err = metricsClient.MetricsV1beta1().PodMetricses(ns).List(context.TODO(), metav1.ListOptions{LabelSelector: labelSelector.String(), FieldSelector: fieldSelector.String()})
		if err != nil {
			return nil, err
		}
	}
	metrics := &metricsapi.PodMetricsList{}
	err = metricsv1beta1api.Convert_v1beta1_PodMetricsList_To_metrics_PodMetricsList(versionedMetrics, metrics, nil)
	if err != nil {
		return nil, err
	}
	return metrics, nil
}

Author


AFAIU the metricsclientset API allows consumers to access only resource metrics (CPU and memory) for pods and nodes.

So if we strictly want system:admin users authenticated via the KUBECONFIG file to be able to use this command, we should evaluate:

  • getting a token from a different place, maybe a secret containing one (I will check whether any such secret exists)
  • using a different workflow to obtain this information, rather than the Prometheus API.

Author


I found out that the token from the secret localhost-recovery-client-token in the openshift-kube-apiserver namespace is a valid one to use:

oc get secret -n openshift-kube-apiserver localhost-recovery-client-token -o json | jq '.data.token' -r | base64 -d
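For illustration, the same extraction can be done in Go with only the standard library; tokenFromSecretJSON is a hypothetical helper mirroring the jq/base64 pipeline above (note that client-go would normally hand back Secret.Data already base64-decoded):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// tokenFromSecretJSON extracts and base64-decodes .data.token from the JSON
// of a Secret as printed by `oc get secret ... -o json`. Hypothetical sketch.
func tokenFromSecretJSON(raw []byte) (string, error) {
	var secret struct {
		Data map[string]string `json:"data"`
	}
	if err := json.Unmarshal(raw, &secret); err != nil {
		return "", err
	}
	encoded, ok := secret.Data["token"]
	if !ok {
		return "", fmt.Errorf("secret has no token key")
	}
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	return string(decoded), nil
}

func main() {
	// Dummy payload standing in for the real secret JSON.
	raw := []byte(`{"data":{"token":"c2VjcmV0LXRva2Vu"}}`)
	tok, err := tokenFromSecretJSON(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(tok)
}
```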

headers := []string{"NAMESPACE", "NAME", "USAGE(%)"}
Member


Can we have some additional fields? My suggestion would be NAMESPACE, NAME, VOLUME, CAPACITY, USED, USAGE(%).

VOLUME: name of the PV bound to the PVC, so the user can easily see which volume it corresponds to.
CAPACITY: total capacity of the volume in binary byte units (same as oc get pv, for example 10Gi or 2Ti).
USED: total used space of the volume in binary byte units (same format as CAPACITY).

This would be useful to mention in the enhancement that @ardaguclu requested, in case anyone has other suggestions (or objections).
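As a sketch of how the suggested CAPACITY and USED columns could be rendered, here is a minimal binary-unit formatter in the style of oc get pv output; formatBytes is a hypothetical name, not code from the PR:

```go
package main

import "fmt"

// formatBytes renders a byte count in binary (IEC) units, e.g. 10Gi or 2Ti.
// Values below 1Ki are printed without a suffix. Illustrative sketch only.
func formatBytes(n float64) string {
	units := []string{"", "Ki", "Mi", "Gi", "Ti", "Pi"}
	i := 0
	for n >= 1024 && i < len(units)-1 {
		n /= 1024
		i++
	}
	return fmt.Sprintf("%.1f%s", n, units[i])
}

func main() {
	fmt.Println(formatBytes(10 * 1 << 30)) // a 10Gi volume
	fmt.Println(formatBytes(2 * 1 << 40))  // a 2Ti volume
}
```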

Comment on lines +158 to +172
// if more PVCs are requested as args but one of them does not exist
if len(args) != 0 && len(promOutput.Data.Result) != len(args) {
	resultingPvc := make(map[string]bool)
	for _, _promOutputDataResult := range promOutput.Data.Result {
		pvcName := _promOutputDataResult.Metric["persistentvolumeclaim"]
		resultingPvc[pvcName] = true
	}
	for _, arg := range args {
		_, pvcPresentInResult := resultingPvc[arg]
		if !pvcPresentInResult {
			return fmt.Errorf("persistentvolumeclaim %q not found in %s namespace.", arg, o.Namespace)
		}
	}
}
Member


This will immediately return an error if one PVC is not found. Could it instead return the valid ones + errors for the ones that do not exist?

For example:

$ oc get pvc -n testns claim1 foo bar
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
claim1   Bound    pvc-ecf81046-51c4-4fa2-ada3-7797faa670b4   1Gi        RWO            gp3-csi        <unset>                 5h14m
Error from server (NotFound): persistentvolumeclaims "foo" not found
Error from server (NotFound): persistentvolumeclaims "bar" not found

You could add resultingPvc[pvcName] = true to the promOutput.Data.Result loop below so you only have to loop through it once, then check for missing PVCs and print error messages at the end.
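A minimal sketch of that suggestion, with hypothetical names (reportMissing, and a found set populated while looping over the query results once):

```go
package main

import (
	"fmt"
	"sort"
)

// reportMissing returns one error line per requested PVC that was absent
// from the query results, instead of failing on the first missing name.
// Illustrative sketch; the real code iterates promOutput.Data.Result.
func reportMissing(requested []string, found map[string]bool) []string {
	var errs []string
	for _, name := range requested {
		if !found[name] {
			errs = append(errs, fmt.Sprintf("persistentvolumeclaim %q not found", name))
		}
	}
	sort.Strings(errs)
	return errs
}

func main() {
	// found would be filled in while looping over the Prometheus results.
	found := map[string]bool{"claim1": true}
	for _, e := range reportMissing([]string{"claim1", "foo", "bar"}, found) {
		fmt.Println(e)
	}
}
```

This matches the oc get pvc behavior quoted above: valid results are still printed, with a NotFound-style error per missing claim afterwards.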
