Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

controller: Stop panicking when unable to create pod. #615

Merged

Conversation

diconico07
Copy link
Contributor

What this PR does / why we need it:
Currently the controller panic if it encounters an error while trying to create a pod/job for an instance, thus causing an impact far wider than just the failing instance.

This should only have impact on the said instance, so as a first step, this commit stop the controller from panicking in this case and just output a log entry.

This fixes #525

Special notes for your reviewer:
This is a first step toward better user error handling the next steps could be:

  • Identify if there are errors that should still make the controller panic
  • Generate a kubernetes event indicating failure to create broker
  • Add a status field to Instance (and Configuration ?), so Akri can send feedback to the user this way. This item would need a proposal and further discussion.

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

@diconico07 diconico07 marked this pull request as ready for review July 18, 2023 07:40
@diconico07 diconico07 force-pushed the stop-crashing-controller-wrong-config branch from 313e757 to 041cffb Compare July 20, 2023 09:38
@bfjelds
Copy link
Collaborator

bfjelds commented Jul 20, 2023

Add a status field to Instance (and Configuration ?), so Akri can send feedback to the user this way. This item would need a proposal and further discussion.

This is a great idea.

@diconico07 diconico07 force-pushed the stop-crashing-controller-wrong-config branch from 041cffb to 3236715 Compare July 21, 2023 08:55
Currently the controller panic if it encounters an error while trying to
create a pod/job for an instance, thus causing an impact far wider than
just the failing instance.

This should only have impact on the said instance, so as a first step,
this commit stop the controller from panicking in this case and just
output a log entry.

Signed-off-by: Nicolas Belouin <nicolas.belouin@suse.com>
Signed-off-by: Nicolas Belouin <nicolas.belouin@suse.com>
@diconico07 diconico07 force-pushed the stop-crashing-controller-wrong-config branch from 3236715 to 70ce091 Compare July 24, 2023 07:12
@diconico07 diconico07 merged commit 51b3a05 into project-akri:main Jul 24, 2023
50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Controller crashing when akri config is bad
2 participants