Add Migrated metrics to the connection record metrics client side #77
connection/connection.go
Outdated
	migrated = additionalInfo.(AdditionalInfo).Migrated
}
cmmv, metricsErr := cmm.WithLabelValues(map[string]string{metrics.LabelMigrated: migrated})
if metricsErr != nil {
What happens if the sidecar didn't define the "migrated" label? This will cause WithLabelValues to fail, right?
I think it would be better to let sidecars opt into this feature. This means only calling WithLabelValues if there is some non-empty value to be recorded.
If the sidecar did not define the migrated label, metricsErr will not be nil and we fall back to the old cmm to record metrics, so the metrics of existing sidecars stay the same. See the comments on line 222.
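The fallback described in this reply can be sketched as follows. This is a minimal illustration with hypothetical stand-in types (`recorder`, `sample`, `recordWithFallback`); it is not the csi-lib-utils implementation, and it only assumes that binding an undeclared label returns an error.

```go
package main

import (
	"fmt"
)

// sample is one recorded observation: the operation name plus its label values.
type sample struct {
	Op     string
	Labels map[string]string
}

// recorder is a hypothetical stand-in for the metrics manager in this thread.
type recorder struct {
	declared map[string]bool   // label names declared at construction time
	labels   map[string]string // label values bound to this recorder
	sink     *[]sample         // shared by a recorder and everything derived from it
}

func newRecorder(labelNames ...string) *recorder {
	declared := make(map[string]bool, len(labelNames))
	for _, name := range labelNames {
		declared[name] = true
	}
	return &recorder{declared: declared, sink: &[]sample{}}
}

// withLabelValues returns a derived recorder, or an error if a label was never declared.
func (r *recorder) withLabelValues(labels map[string]string) (*recorder, error) {
	for name := range labels {
		if !r.declared[name] {
			return nil, fmt.Errorf("label %q is not declared", name)
		}
	}
	return &recorder{declared: r.declared, labels: labels, sink: r.sink}, nil
}

func (r *recorder) record(op string) {
	*r.sink = append(*r.sink, sample{Op: op, Labels: r.labels})
}

// recordWithFallback tries the labeled recorder first and falls back to the
// unlabeled one when the "migrated" label was not declared.
func recordWithFallback(r *recorder, op, migrated string) {
	labeled, err := r.withLabelValues(map[string]string{"migrated": migrated})
	if err != nil {
		r.record(op) // legacy schema: record without the "migrated" label
		return
	}
	labeled.record(op)
}

func main() {
	optedIn := newRecorder("migrated")
	recordWithFallback(optedIn, "CreateVolume", "true")

	legacy := newRecorder()
	recordWithFallback(legacy, "CreateVolume", "true")

	fmt.Println((*optedIn.sink)[0].Labels, (*legacy.sink)[0].Labels)
}
```

In this sketch a sidecar that never declared the label still records exactly one sample per call, just without the extra label.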
connection/connection.go
Outdated
}
cmmv, metricsErr := cmm.WithLabelValues(map[string]string{metrics.LabelMigrated: migrated})
if metricsErr != nil {
	klog.V(5).Infof("Failed to record migrated status, error: %v", metricsErr)
IMHO this should cause the wrapper to fail. A log message with V(5) is likely to go unnoticed and this shouldn't happen in practice.
Sorry, I was adding comments to the wrong section. Now this should make sense. We have two cases here:
- The metricsManager does not define the migrated label (LabelMigrated): WithLabelValues fails with metricsErr, so we fall back to the original metric. I don't know whether we should log something in this case... This can happen when a sidecar upgrades to the newest csi-lib-utils version without changing its MetricsManager initialization. What are your thoughts?
- The metricsManager defines the migrated label: metricsErr will be nil and we use the wrapper to record the metric.
I'm a little confused by this fallback behavior. Shouldn't we define "migrated" in the metric schema?
I'm a little confused by this fallback behavior. Shouldn't we define "migrated" in the metric schema?
That would also work... But it means the MetricsManager will have the migrated field by default, so everything that uses it gets the new field, and I do not know if that is something we want... The current approach of using NewCSIMetricsManagerWithOptions(driverName, WithLabelNames(LabelMigrated)) is more of an opt-in, so a sidecar/plugin can decide whether it wants this field or not...
I changed HaveAdditionalLabel to public, so hopefully the logic makes more sense here.
Basically, we check whether the metricsManager has the migrated label. If it does, we record the migrated status; otherwise we just use the old metrics.
metrics/metrics.go
Outdated
//
// driverName - Name of the CSI driver against which this operation was executed.
// If unknown, leave empty, and use SetDriverName method to update later.
func NewCSIMetricsManagerForSidecarWithMigrationStatus(driverName string) CSIMetricsManager {
I don't think we need this helper function. Just document how to enable this feature and then let sidecars call NewCSIMetricsManagerWithOptions(driverName, WithLabelNames(LabelMigrated)).
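To illustrate why an options-based constructor makes the label opt-in, here is a small sketch of the functional-options pattern. The names `manager`, `option`, `newManagerWithOptions`, `withLabelNames`, and `haveAdditionalLabel` are simplified stand-ins modeled on the identifiers mentioned in this thread; the real types and signatures in the metrics package may differ.

```go
package main

import "fmt"

// labelMigrated mirrors the LabelMigrated name used in the discussion.
const labelMigrated = "migrated"

// manager is a hypothetical, heavily simplified metrics manager.
type manager struct {
	driverName string
	labelNames []string // additional label names the caller opted into
}

// option mutates a manager during construction.
type option func(*manager)

// withLabelNames opts the manager into additional label names.
func withLabelNames(names ...string) option {
	return func(m *manager) {
		m.labelNames = append(m.labelNames, names...)
	}
}

// newManagerWithOptions applies the options in order; with no options the
// manager keeps the legacy schema.
func newManagerWithOptions(driverName string, opts ...option) *manager {
	m := &manager{driverName: driverName}
	for _, apply := range opts {
		apply(m)
	}
	return m
}

// haveAdditionalLabel reports whether the caller opted into the given label.
func (m *manager) haveAdditionalLabel(name string) bool {
	for _, n := range m.labelNames {
		if n == name {
			return true
		}
	}
	return false
}

func main() {
	legacy := newManagerWithOptions("example.csi.driver")
	optedIn := newManagerWithOptions("example.csi.driver", withLabelNames(labelMigrated))
	fmt.Println(legacy.haveAdditionalLabel(labelMigrated), optedIn.haveAdditionalLabel(labelMigrated))
}
```

A sidecar that upgrades the library without touching its constructor call keeps its existing schema, which is the property the reviewers are after.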
Sure thing. I will remove it.
fd0db71 to a5bd26e
/assign @mattcary
0b128cc to 5083a1e
Is NewCSIMetricsManagerForSidecarWithMigrationStatus defined?
I originally had it, and it seems @pohly thinks it is unnecessary. I think we can just add it here; it does not hurt if people are not using it.
5083a1e to 25e34d0
Done
My concern is that this will get out of hand. What's next,
How about we define what those default options are, and then a sidecar that wants the default options plus something else can use that slice plus the additional ones when calling
Could a builder work here? Something like 'SidecarMetrics().WithMigration()'
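A builder along those lines might look like the sketch below. Only `SidecarMetrics()` and `WithMigration()` come from the comment; `Build()`, `metricsConfig`, and the internal fields are hypothetical names added to make the example complete.

```go
package main

import "fmt"

// metricsConfig is a hypothetical result type: the label names a metrics
// manager would be created with.
type metricsConfig struct {
	LabelNames []string
}

// sidecarMetricsBuilder accumulates optional features before construction.
type sidecarMetricsBuilder struct {
	labelNames []string
}

// SidecarMetrics starts a builder with the default (legacy) sidecar schema.
func SidecarMetrics() *sidecarMetricsBuilder {
	return &sidecarMetricsBuilder{}
}

// WithMigration opts into the "migrated" label and returns the builder so
// that calls can be chained.
func (b *sidecarMetricsBuilder) WithMigration() *sidecarMetricsBuilder {
	b.labelNames = append(b.labelNames, "migrated")
	return b
}

// Build returns a copy of the accumulated configuration (hypothetical method).
func (b *sidecarMetricsBuilder) Build() metricsConfig {
	return metricsConfig{LabelNames: append([]string(nil), b.labelNames...)}
}

func main() {
	fmt.Println(SidecarMetrics().Build().LabelNames)
	fmt.Println(SidecarMetrics().WithMigration().Build().LabelNames)
}
```

Compared with one helper constructor per feature combination, each new optional label adds a single chained method, which speaks to the concern about helpers multiplying.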
25e34d0 to ce80bea
I create a
Here's a quick draft of what that would look like:
And then in a sidecar which would have used
I am wondering if we should put the migration status into the default options for sidecars.
I'm not sure. If a sidecar never modifies the context, then it also won't need the metric pre-defined.
If it does not modify the context, it will get "false" as the value. The reason I think this can be the default is that we might not want different schemas for different sidecars; that would be hard for SREs to consume. So for sidecars that do not have migration, like the snapshotter, we just default the value to false. What do you think?
Or I think we can leave the
In terms of backwards compatibility, it is fine for a metric to have fewer fields than the metric schema defined in the collector. So if some sidecars don't have all the fields, or if some plugins don't support migration, it should be OK. What will fail is if the actual metric has more fields than what is defined in the collector.
In that case, I think the current PR should be good. Unless there are other default fields we need to add to the sidecar, there is no need to add DefaultSidecarOptions() for now.
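The compatibility rule stated in this thread (fewer fields than the collector schema is fine, extra fields fail) can be expressed as a simple subset check. The sketch below is only an illustration of that rule; `extraFields` and the field names are hypothetical, and it does not describe how any particular collector validates metrics.

```go
package main

import (
	"fmt"
	"sort"
)

// extraFields returns, in sorted order, the metric fields that the collector
// schema does not define. An empty result means the metric is compatible.
func extraFields(collectorSchema, metricFields []string) []string {
	known := make(map[string]bool, len(collectorSchema))
	for _, field := range collectorSchema {
		known[field] = true
	}
	var extra []string
	for _, field := range metricFields {
		if !known[field] {
			extra = append(extra, field)
		}
	}
	sort.Strings(extra)
	return extra
}

func main() {
	// Illustrative field names only.
	withMigrated := []string{"driver_name", "method_name", "grpc_status_code", "migrated"}
	withoutMigrated := []string{"driver_name", "method_name", "grpc_status_code"}

	// A sidecar that emits fewer fields than the schema defines: compatible.
	fmt.Println(extraFields(withMigrated, withoutMigrated))
	// A sidecar that emits a field the schema lacks: incompatible.
	fmt.Println(extraFields(withoutMigrated, withMigrated))
}
```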
/approve
connection/connection.go
Outdated
if metricsErr != nil {
	klog.Errorf("Failed to record migrated status, error: %v", metricsErr)
} else {
	cmmv.RecordMetrics(
Can we save a few lines and have both paths share this call?
I would love to, but the receiver is different... cmmv is a different object from cmm. Do you have any suggestions on how I can combine these two?
Instead of creating a new cmmv, can you set the return value to cmm?
Done
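The suggestion in this thread, assigning the labeled manager back to cmm so that both paths share one record call, could look roughly like this. The `manager` type and its lowercase methods are hypothetical stand-ins for the real interface, and `logCall` is an invented name for the surrounding function.

```go
package main

import "fmt"

// manager is a hypothetical stand-in for the metrics manager.
type manager struct {
	declared map[string]bool
	labels   map[string]string
	recorded *[]map[string]string // label sets of every recorded sample
}

func newManager(labelNames ...string) *manager {
	declared := make(map[string]bool, len(labelNames))
	for _, name := range labelNames {
		declared[name] = true
	}
	return &manager{declared: declared, recorded: &[]map[string]string{}}
}

func (m *manager) haveAdditionalLabel(name string) bool { return m.declared[name] }

func (m *manager) withLabelValues(labels map[string]string) (*manager, error) {
	for name := range labels {
		if !m.declared[name] {
			return nil, fmt.Errorf("label %q is not declared", name)
		}
	}
	return &manager{declared: m.declared, labels: labels, recorded: m.recorded}, nil
}

func (m *manager) recordMetrics() { *m.recorded = append(*m.recorded, m.labels) }

// logCall records one sample. When the manager opted into "migrated", the
// local variable cmm is replaced by the labeled manager, so a single
// recordMetrics call serves both paths.
func logCall(cmm *manager, migrated string) {
	if cmm.haveAdditionalLabel("migrated") {
		labeled, err := cmm.withLabelValues(map[string]string{"migrated": migrated})
		if err != nil {
			fmt.Println("failed to record migrated status:", err)
		} else {
			cmm = labeled // only the local variable changes; the caller's manager is untouched
		}
	}
	cmm.recordMetrics()
}

func main() {
	optedIn := newManager("migrated")
	logCall(optedIn, "false")
	legacy := newManager()
	logCall(legacy, "true")
	fmt.Println(*optedIn.recorded, *legacy.recorded)
}
```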
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Jiawei0227, msau42
ce80bea to 996f59d
/retest
/retest
/retest
/lgtm
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds a "migrated" field to the current sidecar operation metrics to indicate whether an operation comes from a CSI migration PV. For sidecars that already use the metrics manager, it ensures that the new field is not added unless they opt in.
To enable this, the sidecar has to create the CSIMetricsManager with the correct label, or use NewCSIMetricsManagerForSidecarWithMigrationStatus() to create the MetricsManager instead.
Which issue(s) this PR fixes:
Partially addresses kubernetes/kubernetes#98279
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
/cc @pohly @msau42 @saad-ali