-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-2985: Public KRM Functions Registry #2986
Conversation
mengqiy
commented
Sep 22, 2021
- One-line PR description: adding new KEP for public KRM functions registry
- Issue link: Public KRM functions registry #2985
- Other comments:
Publishers are responsible for the security of their KRM functions. Publishers | ||
are responsible for clearly communicating the expectation (e.g. maturity) to | ||
their users (e.g. through a file for the publisher. We should standardize it | ||
eventually). For example, Kustomize wants to provide a small set of carefully |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something about what you mean here, we already have a means of indicating maturity: each function config's GV. I wouldn't expect a publisher's entire catalog to have a single maturity level.
_kustomize/experimental_ for the latter. | ||
|
||
If some functions don't have any publishers, the users should use it at their | ||
own risk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the use case for anonymous publishers as opposed to the "experimental" SIG publisher above? Even if the publisher is an individual Github user, I feel they should still be identified as responsible for the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maintainers are responsible for the functions they contributed no matter even when they are not published with a publisher.
It should be sponsored by SIG-CLI. | ||
|
||
The repo name can be `KRM-functions`, `KRM-function-catalog`, | ||
`KRM-function-registry` or something more reasonable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I like the first and last options, since this will generate more than one catalog.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the website is going to be krm-fn-registry.sigs.k8s.io
, I prefer KRM-function-registry
or KRM-fn-registry
for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we should align the domain and repo name, which implies using lower kebab case. We can mark the exact name as unresolved for now.
apiVersion: kustomize.config.k8s.io/v1beta1 | ||
kind: Kustomization | ||
catalogs: | ||
- https://krm-fn-registry.sigs.k8s.io/catalog.yaml?publisher=kustomize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear to me why I need to specify the global registry. I generally would err on the side of providing reasonable defaults. Having to provide obvious stuff is tedious, discourages quick creation of kustomization.yaml files. The next thing we get is a request from someone to create a kustomization.yaml with all the boilerplate.
Since the way the Registry generates Catalogs will determine what means of consuming Catalog are available to most users, I think we should more explicitly consider what we want them to do. E.g.:
|
I agree a static (versioned) catalog can be useful for declarative purposes and having reproducible build which is especially important for production use. |
Add workflow to publish a function.
The KEP has been updated to include KRM function metadata and workflow to publish a function |
@mikebz GitHub doesn't allow me to re-request review from you. I just ping you here. |
- apiVersion: example.com/v1alpha1 | ||
kind: LegacySetNamespace | ||
deprecated: true | ||
usage: <a URL pointing to a README.md> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this work with a thing like docsify? Docsify wants your markdown files locally AFAIK? So will we then have to develop something that fetches the files when the index changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point!
So will we then have to develop something that fetches the files when the index changes?
I'd avoid it since it will introduce some maintenance burden.
One way to solve the freshness of the index is that we do both on-demand (e.g. merging PR) and cron-based (e.g. nightly) site building.
An alternative we can consider is that we require the contributors to check in these markdown files (usage and examples) in the registry repo and the URL in the metadata would be pointing to the markdown files in the same repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do both: allow a URL, in which case the website will have a hyperlink instead of inline docs, or a local markdown file, in which case the docs will be embedded into the equivalent field when Catalog is built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I said using the markdown files in the same repo, I meant we can use a URL pointing the file in same repo like https://raw.githubusercontent.com/kubernetes-sigs/krm-functions-registry/master/path/to/markdown. It will make it easier to construct a Catalog resource than using a relative path.
|
||
```shell | ||
├── publishers | ||
│ ├── communities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would we want to separate those? What advantage does that give us? Hypothetical question - why don't I just have a one long list fo functions at the root level by name and the owner metadata is already specified?
We now introduce a dependency between what's in the file and where it exists. People might forget to move if for instance the publisher changes (gets acquired for instance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different types of publishers will have slightly different requirements. When we saw a publisher Foo in a function metadata, we need to know the type of Foo (community, company or individual) to determine the verification requirement needed.
I'm open to suggestions about the layout.
We now introduce a dependency between what's in the file and where it exists. People might forget to move if for instance the publisher changes (gets acquired for instance).
Are you worrying about that the image may disappear (e.g. renamed or moved) after publishing the function?
To mitigate that, we can have some cron-based CI like checking nightly to see if the images still exist. If not, notify the maintainers by GH issue and(or) email.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about defining a kind: Publisher
and having that be a second required file in each directory (which could then be flat as Mike suggests)? That kind could have fields for the publisher type, and perhaps some of its fields could also be used to default fields in KRMFunctions during Catalog construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean each publisher need to duplicate thekind: Publisher
file in every function directories?
To avoid duplication, we can have a publisher directory that contains a list of kind: Publisher
files. And we can have a list of functions at the root level directory and function metadata can point to a publisher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the publishers each already have a directory in your proposed structure, I was picturing the publisher.yaml
going in there. But we can iterate on details like that as we figure out what makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to get this in as provisional ahead of my and Jeff's related talk next Wednesday. To that end, can we tag the ongoing discussion points as unresolved and aim for a merge early next week? My main quibbles are either about things that I think are fine to mark unresolved at this stage, or about the schema of KRMFunction, which I think should ultimately move to the Catalog KEP, because they need to match IMO (maybe I'm missing something about why they shouldn't?).
It should be sponsored by SIG-CLI. | ||
|
||
The repo name can be `KRM-functions`, `KRM-function-catalog`, | ||
`KRM-function-registry` or something more reasonable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we should align the domain and repo name, which implies using lower kebab case. We can mark the exact name as unresolved for now.
function. We will only support container-based KRM functions in the public | ||
registry. | ||
|
||
Ideally, the content under field `spec` should be able to be used directly in a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just ideal, or is it required? When should they differ, given that a catalog needs to be programatically constructed from this data? I'd expect the Registry's catalog compilation to essentially just be collecting a set of these and inserting them into the catalog's functions field. And if it HAS to match, then one KEP or the other (Catalog, IMO) should be the source of truth for it before either is moved to implementable, at the latest.
Currently Catalog has a definition:
field that presumably points to a separate artifact with most of the information you have below. I think I agree with inlining it, but I also think we should make the structure closer to a CRD. That's really what a function is after all: a client-side CR, and kind: KRMFunction
is shaping up to be the CRD analogue. I think we should embrace this analogy on principle, but there are also some issues below that the CRD structure would solve. Notably, in CRD, most fields fall under items in a versions
list. That should be the case for most of the fields you have under spec
directly right now: schema, runtime, examples, licence, idempotent, and license all could change with the version.
Please mark this entire "Function Metadata" section as unresolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the wording.
Marked this section as "unresolved"
``` | ||
|
||
The following is an example for exec-based KRM function. We will not allow | ||
contributors to publish exec-based KRM functions. But we want to standardize the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should allow them in the registry even if we don't publish any of the first-party ones this way. Otherwise they're effectively not supported. Orchestrators can make their own choices about what to run/prefer. On the other hand, we should require all verification-related fields to be populated for all exec functions in the registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should allow them in the registry even if we don't publish any of the first-party ones this way. Otherwise they're effectively not supported.
How strong do you feel about allowing publishing exec-based KRM functions from day 1?
Orchestrators can make their own choices about what to run/prefer.
Orchestrators should be secure by default.
Ideally kustomize should have a --allow-exec-plugins
flag or something similar. Imagine we have the following use case:
A kustomize user is trying to achieve a task. The user finds a kustomization.yaml from internet (e.g. stackoverflow) that claims it can solve it. kustomization.yaml
uses a catalog that contains exec-based plugins. When the user runs kustomize build
, it should ask the user to set the --allow-exec-plugins
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How strong do you feel about allowing publishing exec-based KRM functions from day 1?
As long as the schema we propose supports it in theory, I think it's reasonable to implement container-based only in the initial pass and e.g. require exec support for beta. We can outline this in the rollout plan in the follow-up PR that moves this KEP to implementable.
Ideally kustomize should have a --allow-exec-plugins flag or something similar. Imagine we have the following use case:
A kustomize user is trying to achieve a task. The user finds a kustomization.yaml from internet (e.g. stackoverflow) that claims it can solve it. kustomization.yaml uses a catalog that contains exec-based plugins. When the user runs kustomize build, it should ask the user to set the --allow-exec-plugins flag.
This is a discussion for the "plugin graduation" KEP, but as a quick tl;dr it is proposing that Kustomize require the end user to explicitly trust the catalog rather than authorize categories of runtimes. E.g. they want to trust one specific exec plugin that that kustomization from the internet requires, and they've checked it out specifically--we should not ask them to authorize all exec plugins. By requiring them to type --trusted-catalog=[the catalog they've vetted]
, they are authorizing a specific set of things that they can audit and that will not silently change. If the Kustomization is remote and gets changed to use additional plugins, Kustomize will fail the build and complain that the catalog the user is trusting does not include the new ones. Another way to think about is that a catalog reference in a Kustomization says to Kustomize "you need this" and the command-line flag says "you can use this". If the latter isn't a superset of the former, the build fails.
- apiVersion: example.com/v1alpha1 | ||
kind: LegacySetNamespace | ||
deprecated: true | ||
usage: <a URL pointing to a README.md> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do both: allow a URL, in which case the website will have a hyperlink instead of inline docs, or a local markdown file, in which case the docs will be embedded into the equivalent field when Catalog is built.
|
||
```shell | ||
├── publishers | ||
│ ├── communities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about defining a kind: Publisher
and having that be a second required file in each directory (which could then be flat as Mike suggests)? That kind could have fields for the publisher type, and perhaps some of its fields could also be used to default fields in KRMFunctions during Catalog construction.
@KnVerey I will update the KEP and mark unsolved as UNRESOLVED. We can aim to merge it as provisional before the KubeCon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: KnVerey, mengqiy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |