pkg/monitoring/metrics: add new alert for vms using outdated machine type #3106

Open
wants to merge 1 commit into
base: main
from

Conversation

dasionov
Contributor

@dasionov dasionov commented Sep 25, 2024

What this PR does / why we need it:

This PR introduces a new Prometheus alert for virtual machines (VMs) in the cluster that require a machine type update. For instance, VMs configured with a RHEL 8 machine type would be flagged as outdated. The reason is that the virt-launcher base image will transition to RHEL 10, which introduces breaking changes that are incompatible with older machine types.
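
As a rough illustration of the check described above (not the code in this PR), the sketch below flags a machine type as outdated when the RHEL major version embedded in its name is below a minimum. The helper name `isOutdatedMachineType`, the `pc-q35-rhel8.6.0`-style naming pattern, and the threshold of 9 are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// rhelMachineTypeRe captures the RHEL major version from names such as
// "pc-q35-rhel8.6.0". The naming pattern is an assumption for this sketch.
var rhelMachineTypeRe = regexp.MustCompile(`rhel(\d+)\.\d+\.\d+$`)

// isOutdatedMachineType is a hypothetical helper: it reports whether
// machineType embeds a RHEL major version lower than minMajor. Names that
// do not match the pattern are treated as not outdated.
func isOutdatedMachineType(machineType string, minMajor int) bool {
	m := rhelMachineTypeRe.FindStringSubmatch(machineType)
	if m == nil {
		return false
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return false
	}
	return major < minMajor
}

func main() {
	// The threshold 9 is illustrative only; the PR does not state it.
	for _, mt := range []string{"pc-q35-rhel8.6.0", "pc-q35-rhel9.4.0", "q35"} {
		fmt.Printf("%s outdated=%t\n", mt, isOutdatedMachineType(mt, 9))
	}
}
```

In an alerting setup, a result like this would typically feed a per-VM gauge that the alert expression then evaluates; how the PR wires that up is not shown in this conversation.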

Depends-On #3132, kubevirt/kubevirt#13010

Reviewer Checklist

  • PR Message
  • Commit Messages
  • How to test
  • Unit Tests
  • Functional Tests
  • User Documentation
  • Developer Documentation
  • Upgrade Scenario
  • Uninstallation Scenario
  • Backward Compatibility
  • Troubleshooting Friendly

Release note:

None

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Sep 25, 2024
@dasionov dasionov marked this pull request as draft September 25, 2024 20:18
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 25, 2024
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sradco for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from 6479be4 to d011735 Compare September 25, 2024 20:21
@nunnatsa
Collaborator

Is HCO the right place for this metric? HCO does not know about VMs at all and should not monitor them.

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 25, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from d011735 to f60673b Compare September 26, 2024 12:49
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from f60673b to 6f66639 Compare September 27, 2024 01:46
@kubevirt-bot kubevirt-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S and removed size/L labels Sep 27, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from 6f66639 to b4385e1 Compare September 27, 2024 01:53
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 27, 2024
@coveralls
Collaborator

coveralls commented Sep 27, 2024

Pull Request Test Coverage Report for Build 11310572661

Details

  • 10 of 40 (25.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.2%) to 72.006%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/monitoring/rules/alerts/operator_alerts.go | 10 | 40 | 25.0% |
Totals Coverage Status
Change from base Build 11252860062: -0.2%
Covered Lines: 5970
Relevant Lines: 8291

💛 - Coveralls

@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from b4385e1 to e2300e3 Compare September 29, 2024 14:39
@dasionov dasionov marked this pull request as ready for review September 29, 2024 14:44
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 29, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch 2 times, most recently from 27e58ff to 06e9d6c Compare September 29, 2024 18:06
@dasionov dasionov marked this pull request as draft September 30, 2024 10:27
@kubevirt-bot
Contributor

@dasionov: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test build-hco-test-utils-image
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.31

Use /test all to run the following jobs that were automatically triggered:

  • pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • pull-hyperconverged-cluster-operator-e2e-k8s-1.31

In response to this:

/test hco-e2e-operator-sdk-aws

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dasionov
Contributor Author

dasionov commented Oct 1, 2024

/cc @machadovilaca

@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from db4b317 to b786e1f Compare October 1, 2024 12:28
@dasionov dasionov changed the title from "pkg/monitoring/metrics: add new metric for outdated_machine_type" to "pkg/monitoring/metrics: add new alert for outdated_machine_type vms" Oct 6, 2024
@dasionov dasionov changed the title from "pkg/monitoring/metrics: add new alert for outdated_machine_type vms" to "pkg/monitoring/metrics: add new alert for vms using outdated machine type" Oct 6, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch 3 times, most recently from c37373a to cc93761 Compare October 7, 2024 23:07
@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from cc93761 to c502d6a Compare October 9, 2024 09:44
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from c502d6a to e771199 Compare October 9, 2024 13:17
@dasionov
Contributor Author

dasionov commented Oct 9, 2024

/cc @enp0s3

@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch 2 times, most recently from 6e44f0a to 96d49d8 Compare October 9, 2024 15:49
@dasionov dasionov marked this pull request as ready for review October 10, 2024 11:36
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 10, 2024
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from 96d49d8 to 9dc6ff6 Compare October 10, 2024 13:29
@dasionov
Contributor Author

/retest-required

@dasionov
Contributor Author

/retest

@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from 9dc6ff6 to 49034b3 Compare October 13, 2024 00:54
- Introduce new alert for VMs using an outdated machine type.

- Machine types are considered outdated if they are no longer compatible
  due to changes in the virt-launcher OS version. These VMs must be
  updated with supported machine types to ensure compatibility and avoid
  potential issues.

- Add a functional test to verify the alert is triggered when VMs with
  outdated machine types are detected.

Signed-off-by: Daniel Sionov <dsionov@redhat.com>
@dasionov dasionov force-pushed the add_metric_for_vms_with_deprecated_machine_type branch from 49034b3 to 037c422 Compare October 13, 2024 02:09

sonarcloud bot commented Oct 13, 2024


openshift-ci bot commented Oct 13, 2024

@dasionov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-aws | 037c422 | link | true | /test hco-e2e-consecutive-operator-sdk-upgrades-aws |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@machadovilaca
Member

Is HCO the right place for this metric? HCO does not know about VMs at all and should not monitor them.

/hold

I also have some concerns about whether it makes sense to check for RHEL versions here.

Labels
  • dco-signoff: yes (Indicates the PR's author has DCO signed all their commits.)
  • do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.)
  • release-note-none (Denotes a PR that doesn't merit a release note.)
  • size/L
Projects
None yet

5 participants