Improve CatalogSource controller error handling during reconciliation #6

Closed
everettraven opened this issue Mar 27, 2023 · 2 comments

Comments

@everettraven
Collaborator

Upon reconciliation of a CatalogSource resource, the CatalogSource controller does a few things:

  • Creates a Job to unpack catalog contents
  • Once the unpack Job finishes, reads the logs from the Job's Pod to get the catalog contents
  • Creates Package CRs for each package in the catalog
  • Creates BundleMetadata CRs for each bundle in the catalog
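
The four steps above can be sketched roughly as follows. This is a minimal illustration only, not the controller's actual code: the `steps` interface, its method names, the `catalog` type, and `fakeSteps` are all hypothetical stand-ins so the flow can run without a Kubernetes client.

```go
package main

import "fmt"

// catalog is a hypothetical, simplified view of unpacked catalog contents.
type catalog struct {
	Packages []string
	Bundles  []string
}

// steps abstracts the cluster operations. All method names are illustrative
// assumptions and do not come from the catalogd code base.
type steps interface {
	EnsureUnpackJob(catalogSource string) (done bool, err error)
	ReadUnpackLogs(catalogSource string) (catalog, error)
	CreatePackage(name string) error
	CreateBundleMetadata(name string) error
}

// reconcile walks the four steps in order. It reports requeue=true while the
// unpack Job has not finished, and stops at the first error.
func reconcile(s steps, name string) (requeue bool, err error) {
	done, err := s.EnsureUnpackJob(name)
	if err != nil {
		return false, fmt.Errorf("creating unpack job: %w", err)
	}
	if !done {
		return true, nil // check again once the Job has had time to finish
	}
	cat, err := s.ReadUnpackLogs(name)
	if err != nil {
		return false, fmt.Errorf("reading unpack logs: %w", err)
	}
	for _, p := range cat.Packages {
		if err := s.CreatePackage(p); err != nil {
			return false, fmt.Errorf("creating Package %q: %w", p, err)
		}
	}
	for _, b := range cat.Bundles {
		if err := s.CreateBundleMetadata(b); err != nil {
			return false, fmt.Errorf("creating BundleMetadata %q: %w", b, err)
		}
	}
	return false, nil
}

// fakeSteps is an in-memory stand-in used only to exercise the flow.
type fakeSteps struct {
	jobDone bool
	created []string
}

func (f *fakeSteps) EnsureUnpackJob(string) (bool, error) {
	return f.jobDone, nil
}

func (f *fakeSteps) ReadUnpackLogs(string) (catalog, error) {
	return catalog{Packages: []string{"pkg-a"}, Bundles: []string{"pkg-a.v1.0.0"}}, nil
}

func (f *fakeSteps) CreatePackage(name string) error {
	f.created = append(f.created, "Package/"+name)
	return nil
}

func (f *fakeSteps) CreateBundleMetadata(name string) error {
	f.created = append(f.created, "BundleMetadata/"+name)
	return nil
}

func main() {
	f := &fakeSteps{jobDone: true}
	requeue, err := reconcile(f, "example-catalog")
	fmt.Println(requeue, err, f.created)
}
```

Note that in this sketch every failure simply returns an error, which is roughly the "very basic" handling this issue is about.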

Currently, the CatalogSource controller's error handling is very basic and should be updated to handle the following scenarios:

  • IF the unpack Job cannot be created
    • Update the CatalogSource resource status to indicate the unpack failure
  • IF the logs of the unpack Job's Pod cannot be read
    • Update the CatalogSource resource status to indicate the unpack failure
    • IF the error is because the Pod no longer exists, requeue to attempt the unpacking process again, as the Job could have been cleaned up by another process
  • IF a Package CR cannot be created
    • IF caused by the Package already existing - continue
    • ELSE (?)
      • Update the CatalogSource resource status to indicate the failure to create child resources
      • Clean up the child Package CRs that were already created
  • IF a BundleMetadata CR cannot be created
    • IF caused by the BundleMetadata already existing - continue
    • ELSE (?)
      • Update the CatalogSource resource status to indicate the failure to create child resources
      • Clean up the child BundleMetadata & Package CRs that were already created

Note: All of the scenarios here are only proposed solutions. The ones marked with (?) are ones I feel could have better solutions.
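
To make the proposed scenarios concrete, here is a minimal sketch of the two decisions involved: requeueing when the Pod is gone, and "continue on already exists, otherwise roll back" for child resources. Everything in it is an assumption for illustration: `outcome`, `handleLogReadError`, `createChildren`, and the sentinel errors are hypothetical names. A real controller would more likely check `apierrors.IsNotFound` and `apierrors.IsAlreadyExists` from `k8s.io/apimachinery/pkg/api/errors` rather than sentinel errors.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors standing in for the API server's NotFound and
// AlreadyExists responses (an assumption made to keep the sketch stdlib-only).
var (
	errNotFound      = errors.New("not found")
	errAlreadyExists = errors.New("already exists")
)

// outcome is a hypothetical summary of what the reconciler should do next.
type outcome struct {
	Status  string // message to record on the CatalogSource status
	Requeue bool   // whether to retry the unpacking process
}

// handleLogReadError always reports the unpack failure, and requeues only
// when the Pod no longer exists (for example, the Job was cleaned up).
func handleLogReadError(err error) outcome {
	return outcome{
		Status:  "unpack failed: " + err.Error(),
		Requeue: errors.Is(err, errNotFound),
	}
}

// createChildren creates each child in order. An AlreadyExists error is
// skipped; any other error rolls back only the children created by this call.
func createChildren(names []string, create, remove func(string) error) (outcome, error) {
	var created []string
	for _, name := range names {
		err := create(name)
		switch {
		case err == nil:
			created = append(created, name)
		case errors.Is(err, errAlreadyExists):
			// Proposed behaviour: an existing child is not a failure.
		default:
			for _, c := range created {
				if rmErr := remove(c); rmErr != nil {
					err = fmt.Errorf("%w; cleanup of %q also failed: %v", err, c, rmErr)
				}
			}
			return outcome{Status: "failed to create child resources"}, err
		}
	}
	return outcome{Status: "child resources created"}, nil
}

func main() {
	fmt.Printf("%+v\n", handleLogReadError(errNotFound))

	// In-memory stand-in for the cluster; "pkg-b" already exists.
	existing := map[string]bool{"pkg-b": true}
	create := func(name string) error {
		if existing[name] {
			return errAlreadyExists
		}
		if name == "pkg-bad" {
			return errors.New("admission rejected")
		}
		existing[name] = true
		return nil
	}
	remove := func(name string) error {
		delete(existing, name)
		return nil
	}
	o, err := createChildren([]string{"pkg-a", "pkg-b", "pkg-bad"}, create, remove)
	fmt.Println(o.Status, err, existing)
}
```

One design choice in this sketch is that the rollback touches only the children created during the current pass, so a child that already existed is left alone. Whether that is the right behaviour is part of what the (?) items leave open.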

@joelanford
Member

catalogd v0.1.0 and OLMv1 M3 (operator-controller v0.1.0) were already released, so removing the olm-v1/m3 label and moving to the v0.2.0 catalogd milestone.

@anik120
Collaborator

anik120 commented Jun 20, 2023

Looks like a lot of the to-do items mentioned above have been addressed in #65 and #83.

The only remaining item I can think of that hasn't been addressed yet is the atomicity of syncing Package/BundleMetadata objects for a catalog, which I've captured in #100.
