Adding a condition in ncn-k8s-combined-healthcheck script from goss-test to know its called through iuf and return exit code 1 if there is a test case failure #603

Merged
merged 1 commit into from
Oct 10, 2024


Srinivas-Anand-HPE
Contributor

@Srinivas-Anand-HPE Srinivas-Anand-HPE commented Oct 4, 2024


Summary and Scope

The condition in the CSM hook script fails to catch test failures because goss-testing returns exit code 0 even when a test case fails:

GRAND TOTAL: 715 passed, 11 failed
ERROR: There was at least one test failure
Sample 1
1

FAILED
ncn-m001:~ # echo $?
0

This PR adds a condition to the ncn-k8s-combined-healthcheck script in goss-test so that the script knows when it is called through IUF and returns exit code 1 if any test case fails. It also removes the condition in the ncn-k8s-combined-healthcheck-post-service-upgrade script, because the Goss test used in the IUF hook was changed in CASMINST-6906.
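
The intended behavior can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `failed_count` and `healthcheck_exit_code` are hypothetical, and the "called through IUF" flag is passed as a plain argument because the PR description does not show how the real script detects IUF. Returning 0 outside IUF is also an assumption, based on the existing behavior shown above.

```shell
# Hypothetical sketch; not the actual ncn-k8s-combined-healthcheck script.

# Pull the failed-test count out of a summary line such as
# "GRAND TOTAL: 715 passed, 11 failed".
failed_count() {
    printf '%s\n' "$1" |
        sed -n 's/.*GRAND TOTAL: [0-9][0-9]* passed, \([0-9][0-9]*\) failed.*/\1/p'
}

# Print the exit code the script should use.
#   $1 = number of failed tests
#   $2 = "true" when invoked through IUF (assumed flag; the real
#        detection mechanism is not shown in this PR)
healthcheck_exit_code() {
    if [ "$1" -gt 0 ] && [ "$2" = "true" ]; then
        echo 1
    else
        echo 0
    fi
}

summary="GRAND TOTAL: 715 passed, 11 failed"
code=$(healthcheck_exit_code "$(failed_count "$summary")" "true")
echo "exit code under IUF: $code"
```

With the summary line from the sample output, the sketch reports exit code 1 when the IUF flag is set, which is what lets a calling hook treat the run as failed.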

Is this change backwards incompatible, backwards compatible, or a backwards compatible bugfix?
yes

Issues and Related PRs

List and characterize relationship to Jira/GitHub issues and other pull requests. Be sure to list dependencies.

Testing

List the environments in which these changes were tested.

Tested on:

  • surtur

Test description:

How were the changes tested and success verified? If schema changes were part of this change, how were those handled in your upgrade/downgrade testing?

  • Were the install/upgrade-based validation checks/tests run (goss tests/install-validation doc)?
    yes
  • Were continuous integration tests run?
    yes
  • Was upgrade tested?
    yes
  • Was downgrade tested?
    yes

Risks and Mitigations

No

Pull Request Checklist

  • Version number(s) incremented, if applicable
  • Copyrights updated
  • License file intact
  • Target branch correct
  • CHANGELOG.md updated
  • Testing is appropriate and complete, if applicable
  • HPC Product Announcement prepared, if applicable

…est to know its called through iuf and return exit code 1 if there is a test case failure
@leliasen-hpe
Contributor

leliasen-hpe commented Oct 7, 2024

I think documentation should be added to the step where the health check is run. If the health check fails on the first run, the user should be able to decide whether to proceed. It would be frustrating for a user who is aware of a failing health check, cannot fix it at that moment, and is blocked from continuing the upgrade because of a small issue.

An instruction should be added that says something like: "If there is a failure in the goss test, look at the output and evaluate the failure. The failure should be fixed so that the goss checks pass. If you would like to proceed with the upgrade for some reason without fixing the failure, use the command below. Note that this is not recommended and must be done only if the system is healthy and the existing health check failure will not cause problems."

touch /etc/cray/upgrade/csm/${CSM_REL_NAME}/health-checks.done

This documentation should be added to docs-csm so I am approving this PR.
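
For readers following the suggestion above, here is a rough sketch of how a marker file like `health-checks.done` could gate the upgrade. It assumes the hook skips or overrides a failed health check when the marker exists, which the comment implies but this PR does not show. The function names `marker_path` and `gate_decision`, and the release name `csm-1.6.0`, are hypothetical.

```shell
# Hypothetical sketch of a hook-side gate; the real IUF hook may differ.

# Build the marker file path quoted in the review comment.
#   $1 = CSM release name (value of CSM_REL_NAME)
marker_path() {
    echo "/etc/cray/upgrade/csm/$1/health-checks.done"
}

# Decide whether the upgrade may continue.
#   $1 = path to the marker file
#   $2 = exit code returned by the health check
# Prints "proceed" if the marker exists or the check passed, else "stop".
gate_decision() {
    if [ -f "$1" ] || [ "$2" -eq 0 ]; then
        echo proceed
    else
        echo stop
    fi
}

# Example: a failed health check (exit code 1) for an assumed release name.
gate_decision "$(marker_path "csm-1.6.0")" 1
```

Under these assumptions, a failed check stops the upgrade until the operator either fixes the failure or deliberately creates the marker file with the `touch` command above.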

@Srinivas-Anand-HPE Srinivas-Anand-HPE merged commit 004076b into release/1.6 Oct 10, 2024
3 checks passed
@Srinivas-Anand-HPE Srinivas-Anand-HPE deleted the CASMTRIAGE-7320 branch October 10, 2024 01:22