Add instrumentation of driver_requests_* metrics #153

Merged
merged 6 commits into from
Mar 4, 2024

Conversation

saley89
Contributor

@saley89 saley89 commented Jan 9, 2024

What this PR does / why we need it:
In v0.50.0 of MCM, new metrics were added to record driver request durations and failures. It is up to each provider to update these metrics at the appropriate points in its code base, based on the driver's activity with the provider.

This PR is based on the machine-controller-manager-provider-azure implementation of these metrics. It records the duration of requests, and any failures, for the following operations:

  • create machine
  • delete machine
  • list machines
  • get machine status
  • get volume ids

Which issue(s) this PR fixes:
Fixes #152

Special notes for your reviewer:

See the linked machine-controller-manager-provider-azure project, and consider whether we can take its instrument.go function as is.

Release note:

Implements the driver metrics added to MCM in version `0.50.0` so that the duration of calls to AWS and any failed requests are recorded:
* driver_request_duration_seconds
* driver_requests_failed_total

@saley89 saley89 requested review from a team as code owners January 9, 2024 09:51
@gardener-robot gardener-robot added the needs/review Needs review label Jan 9, 2024
@gardener-robot

@saley89 Thank you for your contribution.

@gardener-robot gardener-robot added the size/s Size of pull request is small (see gardener-robot robot/bots/size.py) label Jan 9, 2024
@gardener-robot-ci-2
Contributor

Thank you @saley89 for your contribution. Before I can start building your PR, a member of the organization must set the required label(s) {'reviewed/ok-to-test'}. Once started, you can check the build status in the PR checks section below.

@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 9, 2024
@rishabh-11 rishabh-11 self-assigned this Jan 25, 2024
@rishabh-11
Contributor

rishabh-11 commented Jan 26, 2024

Hey @saley89, with MCM v0.50 we have also introduced metrics for cloud provider API calls. I have raised an issue (#156) for recording these in provider-aws. Would you like to record these metrics as part of this PR, or wait for the MCM team to implement them when we pick it up?

@saley89
Contributor Author

saley89 commented Jan 26, 2024

Hi @rishabh-11, my team has a need for the metrics that are implemented by this change, and I noticed it was relatively straightforward to add these based on the implementation in the other provider.

If you are comfortable with this PR going in for these metrics without the others, that would be beneficial for us. The other metrics could go in as and when you have time or are able to do so.

Contributor

@rishabh-11 rishabh-11 left a comment

Can you move the instrument.go file into a separate directory called instrument inside pkg?

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Jan 27, 2024
@saley89
Contributor Author

saley89 commented Feb 1, 2024

@rishabh-11 sure, that is done now.

@saley89 saley89 requested a review from rishabh-11 February 1, 2024 08:59
Contributor

@rishabh-11 rishabh-11 left a comment

Just a minor change; the rest is good. Also, did you test this? Are the metrics working as expected?

@saley89
Contributor Author

saley89 commented Feb 8, 2024

Regarding tests, I think it can be done. It looks like the way to go about it would be to leverage the testutil package of the Prometheus client_golang library that we already have in this repo. Some examples of using it are in the prometheus repo.

I had a brief look this afternoon to see if we can introduce it to the existing core_test.go tests, but I would need more time to get something ready for the PR. In the meantime, anyone can contribute to it if they have a solution.

My thinking would be, ideally, to get the metric, call methods like CreateMachine and DeleteMachine, and check that the metric with the relevant labels has indeed been incremented. Unfortunately, the Azure implementation does not test this either, so there is nothing to use as a reference.
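
The test idea above (read the metric, call the driver method, check the increment) could be sketched roughly as follows. This is an illustration under stated assumptions, not a test from this repo: `failureCounter`, `fakeDriver`, `failureDelta` and the `delete_machine` label value are hypothetical, and the map-based counter stands in for a labelled Prometheus counter so the example needs only the standard library. With a real Prometheus counter, the two reads would instead go through the client library's test helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// failureCounter is a hypothetical stand-in for a failure counter
// labelled by operation name.
type failureCounter map[string]float64

func (c failureCounter) inc(operation string)           { c[operation]++ }
func (c failureCounter) value(operation string) float64 { return c[operation] }

// fakeDriver is a hypothetical driver whose DeleteMachine call can be
// forced to fail, so both outcomes can be exercised.
type fakeDriver struct {
	failures failureCounter
	failNext bool
}

// DeleteMachine counts a failure whenever it returns a non-nil error.
func (d *fakeDriver) DeleteMachine() (err error) {
	defer func() {
		if err != nil {
			d.failures.inc("delete_machine")
		}
	}()
	if d.failNext {
		return errors.New("simulated provider error")
	}
	return nil
}

// failureDelta runs call and reports how much the counter for the given
// operation grew, which keeps the check independent of earlier calls.
func failureDelta(c failureCounter, operation string, call func() error) float64 {
	before := c.value(operation)
	_ = call()
	return c.value(operation) - before
}

func main() {
	d := &fakeDriver{failures: failureCounter{}}

	// A successful call must not touch the failure counter.
	fmt.Println(failureDelta(d.failures, "delete_machine", d.DeleteMachine))

	// A failing call must increment it by exactly one.
	d.failNext = true
	fmt.Println(failureDelta(d.failures, "delete_machine", d.DeleteMachine))
}
```

Comparing the counter before and after each call, rather than asserting an absolute value, avoids coupling the test to whatever other tests have already recorded in a shared registry.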

@rishabh-11
Contributor

Hey @saley89, while testing your PR I found a bug in the code related to metrics. Can you refer to gardener/machine-controller-manager-provider-azure#130 and fix this PR?

@rishabh-11
Contributor

/ping @saley89

@gardener-robot

@saley89 ℹ️ please take some time to help rishabh-11 or redirect to someone else if you can't.

@gardener-robot gardener-robot added the needs/rebase Needs git rebase label Feb 22, 2024
@gardener-robot

@saley89 You need to rebase this pull request onto the latest master branch. Please check.

@gardener-robot gardener-robot added size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py) needs/second-opinion Needs second review by someone else and removed size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Feb 22, 2024
@saley89
Contributor Author

saley89 commented Feb 22, 2024

@rishabh-11 I was able to pull in the changes needed for the updated driver API function. I needed to re-vendor because of go.mod changes that bring in the latest Prometheus Go client.

I will rebase with master to fix these conflicts.

saley89 and others added 2 commits February 22, 2024 11:00
Co-authored-by: Rishabh Patel <66425093+rishabh-11@users.noreply.github.com>
@saley89 saley89 force-pushed the instrument-driver-metrics branch from e33d389 to 335170a Compare February 22, 2024 11:01
@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py) labels Feb 22, 2024
@saley89
Contributor Author

saley89 commented Feb 22, 2024

The rebase has got things back in order, and this is now ready for your review again. Thanks.

@saley89 saley89 requested a review from rishabh-11 February 22, 2024 11:05
@saley89 saley89 requested a review from unmarshall February 27, 2024 08:54
@rishabh-11 rishabh-11 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 28, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 28, 2024
@rishabh-11 rishabh-11 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 4, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 4, 2024
Contributor

@rishabh-11 rishabh-11 left a comment

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/changes Needs (more) changes needs/rebase Needs git rebase needs/review Needs review needs/second-opinion Needs second review by someone else labels Mar 4, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 4, 2024
Contributor

@unmarshall unmarshall left a comment

/lgtm

@rishabh-11 rishabh-11 merged commit 92a9cad into gardener:master Mar 4, 2024
7 checks passed
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Mar 4, 2024
@saley89 saley89 deleted the instrument-driver-metrics branch March 4, 2024 08:34
Development

Successfully merging this pull request may close these issues.

Implement newly added "driver_requests" metrics
6 participants