Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] (review-only) WIP: OLM v1 API review #2067

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
219 changes: 219 additions & 0 deletions olm/catalogd/v1alpha1/clustercatalog_types.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,219 @@
package v1alpha1

import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// SourceType defines the type of source used for catalogs.
// +enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this tag actually does anything, it may help the openapi generator perhaps

If you are intending to have this be an enum in a CRD, you need to have

// +kubebuilder:validation:Enum:=Image;Foo;Bar;...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll plan to remove the tag. Where this is used in the types we do have the appropriate enum tag.

type SourceType string

const (
SourceTypeImage SourceType = "Image"

TypeProgressing = "Progressing"
TypeServing = "Serving"

// Serving reasons
ReasonAvailable = "Available"
ReasonUnavailable = "Unavailable"
ReasonDisabled = "Disabled"

// Progressing reasons
ReasonSucceeded = "Succeeded"
ReasonRetrying = "Retrying"
ReasonBlocked = "Blocked"

MetadataNameLabel = "olm.operatorframework.io/metadata.name"

AvailabilityEnabled = "Enabled"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled/Disabled needs more dimension, and communicate better to those with less op-con context.
Proposed were something like "AvailableToBeUsed" and its reverse.

AvailabilityDisabled = "Disabled"
Comment on lines +29 to +30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are for enums, use a string type alias as you have done with SourceType

)

//+kubebuilder:object:root=true
//+kubebuilder:resource:scope=Cluster
//+kubebuilder:subresource:status
//+kubebuilder:printcolumn:name=LastUnpacked,type=date,JSONPath=`.status.lastUnpacked`
//+kubebuilder:printcolumn:name="Serving",type=string,JSONPath=`.status.conditions[?(@.type=="Serving")].status`
//+kubebuilder:printcolumn:name=Age,type=date,JSONPath=`.metadata.creationTimestamp`

// ClusterCatalog enables users to make File-Based Catalog (FBC) catalog data available to the cluster.
// For more information on FBC, see https://olm.operatorframework.io/docs/reference/file-based-catalogs/#docs
type ClusterCatalog struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`

Spec ClusterCatalogSpec `json:"spec"`
Status ClusterCatalogStatus `json:"status,omitempty"`
Comment on lines +46 to +47
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these should have godoc

Godoc should start with the json tag version of the name

Both should also have either // +kubeubilder:validation:Required or // +optional. We add an explicit optional vs required as there are ways to accidentally change this if it isn't specified explicitly (you can change the default at the package level, and there are other inferences that decide as well)

}

//+kubebuilder:object:root=true

// ClusterCatalogList contains a list of ClusterCatalog
type ClusterCatalogList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata"`

Items []ClusterCatalog `json:"items"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Godoc missing here too

}

// ClusterCatalogSpec defines the desired state of ClusterCatalog
// +kubebuilder:validation:XValidation:rule="!has(self.source.image.pollInterval) || (self.source.image.ref.find('@sha256:') == \"\")",message="cannot specify PollInterval while using digest-based image"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here rather than at the image level?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason as far as I can tell. We should move it.

type ClusterCatalogSpec struct {
// source is a required field that allows the user to define the source of a Catalog that contains catalog metadata in the File-Based Catalog (FBC) format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be tempted to break this down into short and direct sentences. It's generally easier to take in when reading.

I think you probably want the FBC link where you mention FBC, rather than disjointing it at the end of the godoc.

Suggested change
// source is a required field that allows the user to define the source of a Catalog that contains catalog metadata in the File-Based Catalog (FBC) format.
// source allows the user to define the source of a catalog.
// The catalog source must contain catalog metadata in the File-Based Catalog (FBC) format.
// For more information on FBC, see https://olm.operatorframework.io/docs/reference/file-based-catalogs/#docs.
// source is a required field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much prior knowledge about catalogs are you assuming in this godoc? I am completely new to OLM so I may ask some stupid questions (but perhaps end users are asking the same?)

What is in a catalog? Why would I want to define a catalog source?

I'm gonna guess the answer might be something like

// Operators listed in the specified source will be added to the OLM dashboard.
// Once listed, operators in the catalog will be available to be installed in the cluster.

I guess what I'm looking for is, what does the user achieve by adding a catalog source

//
// Below is a minimal example of a ClusterCatalogSpec that sources a catalog from an image:
//
// source:
// type: Image
// image:
// ref: quay.io/operatorhubio/catalog:latest
//
// For more information on FBC, see https://olm.operatorframework.io/docs/reference/file-based-catalogs/#docs
Source CatalogSource `json:"source"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there ever be a use case for having multiple sources, like mirrors of the same catalog?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't envision a use case for this right now, but if they do arise, we think that there would be a reasonable way to implement something like this on a per source level without it being a breaking change. That being said, we haven't done a deep dive into how or what that design would look like. The existing OLM v0 CatalogSource API only supports a single image source per CatalogSource and we have not had any RFEs for being able to specify multiple.

For the mirroring use case in particular, we already respect the standard mirroring configurations when using things like ITMS, IDMS, etc. in OpenShift.


// priority is an optional field that allows the user to define a priority for a ClusterCatalog.
// A ClusterCatalog's priority is used by clients as a tie-breaker between ClusterCatalogs that meet the client's requirements.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if two catalogs meet the requirement and specify the same priority?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be up to the client to handle the tie. I can make this more clear in the GoDoc.

The main client in this case, operator-controller, manages the ClusterExtension API and uses the catalog contents to find the content to install for the ClusterExtension refuses to do anything in the face of ambiguity and will put a message on the ClusterExtension status along the lines of both catalogs a,b provided bundles that meet the constraints. refusing to pick one. The ClusterExtension API has a way to filter catalogs used when trying to parse these requirements.

// For example, in the case where multiple ClusterCatalogs provide the same bundle.
// A higher number means higher priority. Negative numbers are also accepted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you accept negative numbers?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chatted more with @tmshort who originally added the field and the reasoning for this was to follow the approach of PodPriority and other priority related fields in the Kubernetes APIs.

// When omitted, the default priority is 0.
// +kubebuilder:default:=0
// +optional
Priority int32 `json:"priority"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the maximum value of this field? For that matter, what is the minimum?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimum would be -2147483648 since it is the largest negative number that a signed int32 can have. Maximum would be 214748647.


// Availability is an optional field that allows users to define whether the ClusterCatalog is utilized by the operator-controller.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain without references to a specific client and speak more to the general use-case.

//
// Allowed values are : ["Enabled", "Disabled"].
// If set to "Enabled", the catalog will be used for updates, serving contents, and package installations.
//
// If set to "Disabled", catalogd will stop serving the catalog and the cached data will be removed.
//
// If unspecified, the default value is "Enabled"
//
// +kubebuilder:validation:Enum="Disabled";"Enabled"
// +kubebuilder:default="Enabled"
// +optional
Availability string `json:"availability,omitempty"`
Comment on lines +84 to +96
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Available should probably be a value, not a field name. A possibility

Available: Operators in this catalog are useable on the cluster.
Removed: Operators in this catalog are not usable on the cluster.

}

// ClusterCatalogStatus defines the observed state of ClusterCatalog
type ClusterCatalogStatus struct {
// conditions is a representation of the current state for this ClusterCatalog.
// The status is represented by a set of "conditions".
Comment on lines +101 to +102
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two sentences are basically saying the same thing, do we need both?

//
// Each condition is generally structured in the following format:
// - Type: a string representation of the condition type. More or less the condition "name".
// - Status: a string representation of the state of the condition. Can be one of ["True", "False", "Unknown"].
// - Reason: a string representation of the reason for the current state of the condition. Typically useful for building automation around particular Type+Reason combinations.
// - Message: a human-readable message that further elaborates on the state of the condition.
Comment on lines +104 to +108
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need to explain the format of a condition, since conditions are pretty universal across Kube, I'd expect people to know the format

//
// The current set of condition types are:
// - "Serving", which represents whether or not the contents of the catalog are being served via the HTTP(S) web server.
// - "Progressing", which represents whether or not the ClusterCatalog is progressing towards a new state.
//
// The current set of reasons are:
// - "Succeeded", this reason is set on the "Progressing" condition when progressing to a new state is successful.
// - "Blocked", this reason is set on the "Progressing" condition when the ClusterCatalog controller has encountered an error that requires manual intervention for recovery.
Comment on lines +115 to +116
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are both Progressing False right? One is happy, the other is sad. I wonder if there's a way to represent the states we want where all happy states and all sad states share the opposite True/False state?

Or, is there some third condition required to identify when a progression is not required, and its safe for me to not worry about that condition?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that these both result in Progressing being set to False. If I am remembering correctly, we only have the Blocked reason to handle the case where an invalid image pull spec is provided meaning direct user intervention is required to fix the issue.

I think adding constraints to the ClusterCatalog.Spec.Source.Image.Ref field would allow us to be more confident that the pull spec is a valid format and alleviate a "terminal" error state like this.

@joelanford please correct me if my understanding of why we added this Blocked reason is wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this a bit more and looked at how the Deployment API handles this case. We were thinking we could handle it similarly where Progressing is only set to False when it encounters a degraded state, otherwise we are always Progressing == True with an appropriate reason to reflect the state of that progression (i.e we have reached the desired state or are continuing towards the desired state).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that sound reasonable or are there other examples that you think might be better to take a look at?

// - "Retrying", this reason is set on the "Progressing" condition when the ClusterCatalog controller has encountered an error that might be resolvable on subsequent reconciliation attempts.
// - "Available", this reason is set on the "Serving" condition when the contents of the ClusterCatalog are being served via an endpoint on the HTTP(S) web server.
// - "Unavailable", this reason is set on the "Serving" condition when there is not an endpoint on the HTTP(S) web server that is serving the contents of the ClusterCatalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would I know if this is an error state, or intentionally unavailable?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current understanding is that the Serving condition should be taken at face value. Any error states are communicated in the Progressing condition. If the Serving condition is set to intentionally unavailable, via "disabling" the ClusterCatalog, I would expect a human readable message in the condition to state as such.

//
// +optional
Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,1,rep,name=conditions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need listType and listMapKey tags on this slice

// resolvedSource contains information about the resolved source based on the source type.
//
// Below is an example of a resolved source for an image source:
// resolvedSource:
//
// image:
// lastSuccessfulPollAttempt: "2024-09-10T12:22:13Z"
// ref: quay.io/operatorhubio/catalog@sha256:c7392b4be033da629f9d665fec30f6901de51ce3adebeff0af579f311ee5cf1b
// type: Image
Comment on lines +126 to +131
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling this might change, I'm not sure how useful an example is for a status. May be better to explain in prose what the content might be, but I think we need to clarify the ResolvedSource structure first

// +optional
ResolvedSource *ResolvedCatalogSource `json:"resolvedSource,omitempty"`
// urls contains the URLs that can be used to access the catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When might there be more than one URL?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC from some of our conversations around this, we were imagining a use case where we add discovery endpoints (OpenAPI/Swagger) to the REST API and thought that this status field could be used to communicate to a user the endpoints being used for discovery for each version of the API being served.

@grokspawn or @joelanford may have some more input here.

// +optional
URLs *CatalogURLs `json:"urls,omitempty"`
everettraven marked this conversation as resolved.
Show resolved Hide resolved
// lastUnpacked represents the time when the
// ClusterCatalog object was last unpacked successfully.
Comment on lines +137 to +138
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is unpacking?

// +optional
LastUnpacked metav1.Time `json:"lastUnpacked,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be a pointer for the omitempty to work for metav1.Time

}

type CatalogURLs struct {
// base is a required cluster-internal URL from which on-cluster components can access the API endpoint for this catalog
Base string `json:"base"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the maximum length of this string?

You say this must be a URL, use CEL to validate that it is a valid URL

Are there any other requirements of the base URL? Must it be https for example?

}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular type is still a work in progress. For now we intend to only provide a base URL in which a user can build on top of to perform requests against the supported REST API for accessing this ClusterCatalog's content.

For more information, see https://docs.google.com/document/d/1XvQPwk-Ws1rVcBpNsAaDoYKlmlws-hiQ19Ex9D4zAok/edit?usp=sharing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relevant types changes here merged in operator-framework/catalogd#429 and released in v0.34.0 of catalogd.

Copy link
Contributor

@JoelSpeed JoelSpeed Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a shame, as you haven't added much validation to the field.

You would have benefited from adding CEL to validate that it was a valid URL, and imposing a maximum length to the field. I suppose we can add that for v1, though conversion might be tricky

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC our current thinking is that, even though it isn't following best practices for API versioning, we can do a hard cut over to the v1 versions of the API when we release v1.0.0 of the upstream projects and GA in OpenShift 4.18 (although @joelanford please correct me if this is an incorrect understanding).

With that in mind, we can still add some validations without having to worry about conversion complexity.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section has been updated to reflect the most recent changes to the types that were merged for this API upstream. Leaving this thread unresolved in case there is more discussion required re: conversion/API versioning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you expecting the transitional path to be for users of the existing v1alpha1 APIs? A hard cut does not sound like there is necessarily a migration path?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great question :) - my current understanding is that there is none. My general understanding is that doing a hard cut:

  • Has no impact to supported OpenShift customers since these APIs only exist when TechPreviewNoUpgrade is enabled meaning there is nothing to migrate when going GA in 4.18
  • May have impact to users of the upstream project, but we seem to be OK with not providing a migration path there because, following SemVer, users should expect a breaking change going from a v0.Y.Z release to a v1.Y.Z release.

I don't think this approach is set in stone, but I'll have another chat with @joelanford later this afternoon and see if there is any expectations that we provide that transitional path and that we are OK with the consequences of not providing that path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May have impact to users of the upstream project, but we seem to be OK with not providing a migration path there because, following SemVer, users should expect a breaking change going from a v0.Y.Z release to a v1.Y.Z release.

If the cluster tries to upgrade to a new OLM, that is installing the v1 CRD, without the v1alpha1 version included, this would in theory mean that the data stored in etcd already, now has no representation in the v1 API, so, I think you'd lose all data there.

Also, any controllers running during this change, would break until they are upgraded to the new version of the API.

I'm not even sure if Kube will allow you to make that change to a CRD once its installed

I think you'll need to try this out if you want to say that v0 users aren't going to be provided with a migration path, it could mean they have to completely uninstall the operator, rather than upgrade, which in turn, means perhaps uninstalling the other operators installed by OLM right? The UX of that could be pretty bad


// CatalogSource is a discriminated union of possible sources for a Catalog.
// CatalogSource contains the sourcing information for a Catalog
// +union
// +kubebuilder:validation:XValidation:rule="self.type == 'Image' && has(self.image)",message="source type 'Image' requires image field"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This forces the value of type to required, which is fine, but, would need to be changed if you ever added a new value

We typically use

Suggested change
// +kubebuilder:validation:XValidation:rule="self.type == 'Image' && has(self.image)",message="source type 'Image' requires image field"
// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'Image' ?has(self.image) : !has(self.image)",message="image is required when type is Image, and forbidden otherwise"

You can then copy/paste this for new fields in the union without changing this one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for this suggestion. That will be helpful for having a reasonable "message" when we add new types.

type CatalogSource struct {
// type is a required reference to the type of source the catalog is sourced from.
//
// Allowed values are ["Image"]
//
// When this field is set to "Image", the ClusterCatalog content will be sourced from an OCI image.
// When using an image source, the image field must be set and must be the only field defined for this type.
Comment on lines +157 to +162
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other unions we've added lately, the godoc would look something like. It's not a major change but I'd definitely suggest splitting things like the allowed values into a sentence rather than using ["blah","blah"] format. Think of users rather than developers

Suggested change
// type is a required reference to the type of source the catalog is sourced from.
//
// Allowed values are ["Image"]
//
// When this field is set to "Image", the ClusterCatalog content will be sourced from an OCI image.
// When using an image source, the image field must be set and must be the only field defined for this type.
// type identifies the source type from which to fetch the catalog.
// The only allowed value is "Image".
// When set to "Image", the ClusterCatalog content will be sourced from a specified OCI image.
// When using an image source, the image field must be set and must be the only field defined for this type.

//
// +unionDiscriminator
// +kubebuilder:validation:Enum:="Image"
// +kubebuilder:validation:Required
Type SourceType `json:"type"`
// image is used to configure how catalog contents are sourced from an OCI image. This field must be set when type is set to "Image" and must be the only field defined for this type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say it must be the only field defined for this type, that's not strictly true, because type is also required.

Perhaps using the phrasing we use for the CEL would be better?

Suggested change
// image is used to configure how catalog contents are sourced from an OCI image. This field must be set when type is set to "Image" and must be the only field defined for this type.
// image is used to configure how catalog contents are sourced from an OCI image.
// This field is required when type is Image, and forbidden otherwise.

// +optional
Image *ImageSource `json:"image,omitempty"`
}

// ResolvedCatalogSource is a discriminated union of resolution information for a Catalog.
// ResolvedCatalogSource contains the information about a sourced Catalog
// +union
// +kubebuilder:validation:XValidation:rule="self.type == 'Image' && has(self.image)",message="source type 'Image' requires image field"
type ResolvedCatalogSource struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments I've left on CatalogSource largely apply here too

// type is a reference to the type of source the catalog is sourced from.
//
// It will be set to one of the following values: ["Image"].
//
// When this field is set to "Image", information about the resolved image source will be set in the 'image' field.
//
// +unionDiscriminator
// +kubebuilder:validation:Enum:="Image"
// +kubebuilder:validation:Required
Type SourceType `json:"type"`
// image is a field containing resolution information for a catalog sourced from an image.
Image *ResolvedImageSource `json:"image"`
}

// ResolvedImageSource provides information about the resolved source of a Catalog sourced from an image.
type ResolvedImageSource struct {
// ref contains the resolved sha256 image ref containing Catalog contents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps

Suggested change
// ref contains the resolved sha256 image ref containing Catalog contents.
// ref contains the resolved image reference for the Catalog content in sha256 digest format.

Do we want to explain why it is the sha256 reference rather than a tag/alias?

Ref string `json:"ref"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a required tag, and a maximum length

What is the format of a sha256 image reference? Is this just the sha256 ref? In which case, we should be able to validate the characters and length fairly easily?

// lastSuccessfulPollAttempt is the time when the resolved source was last successfully polled for new content.
LastSuccessfulPollAttempt metav1.Time `json:"lastSuccessfulPollAttempt"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fields like this end up at as standing write load. Suggest using conditions instead with a threshold for number of failures before declaring a thing stale.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this. Later, we want possibly to add conditions to capture 'staleness' of catalog contents.

}

// ImageSource enables users to define the information required for sourcing a Catalog from an OCI image
type ImageSource struct {
// ref is a required field that allows the user to define the reference to a container image containing Catalog contents.
// Examples:
// ref: quay.io/operatorhubio/catalog:latest # image reference
// ref: quay.io/operatorhubio/catalog@sha256:c7392b4be033da629f9d665fec30f6901de51ce3adebeff0af579f311ee5cf1b # image reference with sha256 digest
Comment on lines +204 to +205
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhere in o/api there is an api that validates valid references, we should find that and copy their validation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one in o/api is part of the catalog mirroring effort, it validates something slightly different

There are some constraints we can validate on here though right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some further investigation and the containers/image library that we use for parsing these references uses this regex:

^((?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])(?:(?:\.(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]))+)?(?::[0-9]+)?/)?[a-z0-9]+(?:(?:(?:[._]|__|[-]*)[a-z0-9]+)+)?(?:(?:/[a-z0-9]+(?:(?:(?:[._]|__|[-]*)[a-z0-9]+)+)?)+)?)(?::([\w][\w.-]{0,127}))?(?:@([A-Za-z][A-Za-z0-9]*(?:[-_+.][A-Za-z][A-Za-z0-9]*)*[:][[:xdigit:]]{32,}))?$

It's quite complex, but I'll see if I can break it down into some CEL expressions similar to how you mentioned we could attempt to break down some of our other pattern validations

Ref string `json:"ref"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to set a maximum length, even if arbitrary. Can we research to see what a suitable maximum length might be?

// pollInterval is an optional field that allows the user to set the interval at which the image source should be polled for new content.
// It must be specified as a duration.
// It must not be specified for a catalog image referenced by a sha256 digest.
// Examples:
// pollInterval: 1h # poll the image source every hour
// pollInterval: 30m # poll the image source every 30 minutes
// pollInterval: 1h30m # poll the image source every 1 hour and 30 minutes
//
// When omitted, the image will not be polled for new content.
// +kubebuilder:validation:Format:=duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested that this actually works? And that non-duration like fields are rejected at admission time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to change this field to pollIntervalMinutes to avoid any pitfalls of using a duration.

Copy link

@grokspawn grokspawn Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding a lower-bound to this, or adding CEL validation, or ... better yet, change to pollIntervalMinutes which sidesteps the problem.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a force-refresh field, where if user changes the value from the spec we will take an action, which fulfills the user need to trigger an update w/o forcing a polling interval change

// +optional
PollInterval *metav1.Duration `json:"pollInterval,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower bound on the poll interval

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 minute as the lower bound sounds good as discussed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the resync period for the informers in the controller that will be processing this? Does a full resync of the informer cause this to poll?

I'm wondering if the resync of the informer (which might be 10m, 10h, 24h...) actually sets an upper bound for the poll interval

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PollIntervalMinutes

}

func init() {
SchemeBuilder.Register(&ClusterCatalog{}, &ClusterCatalogList{})
}
20 changes: 20 additions & 0 deletions olm/catalogd/v1alpha1/groupversion_info.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// Package v1alpha1 contains API Schema definitions for the olm v1alpha1 API group
// +kubebuilder:object:generate=true
// +groupName=olm.operatorframework.io
package v1alpha1

import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)

var (
// GroupVersion is group version used to register these objects
GroupVersion = schema.GroupVersion{Group: "olm.operatorframework.io", Version: "v1alpha1"}

// SchemeBuilder is used to add go types to the GroupVersionKind scheme
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)
Loading