extend design for how the CVO knows the cluster profile #543
Conversation
> ```
> CLUSTER_PROFILE=[identifier]
> ```
> This environment variable would have to be specified in the CVO deployment. When
> no `CLUSTER_PROFILE=[identifier]` variable is specified, the `default` cluster profile
the literal "default" doesn't look right to me
I am -1 to exposing the profile - it's a grouping of deployment characteristics. Operators should be aware of the characteristic (single node), not the profile (CRC, single node production edge). I would be supportive of API changes that expose details onto other config, but not cluster version and profile, i.e. "single master" in Infrastructure status if we had a really good reason for it.
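For illustration only, a minimal sketch of how such an environment variable could appear on the CVO Deployment; the namespace, container name, and profile value are assumptions, the enhancement text above only specifies the `CLUSTER_PROFILE` variable itself:

```yaml
# Hypothetical excerpt of the CVO Deployment manifest. Namespace, container
# name, and the profile value are illustrative, not part of the proposal text.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-version-operator
  namespace: openshift-cluster-version
spec:
  template:
    spec:
      containers:
      - name: cluster-version-operator
        env:
        - name: CLUSTER_PROFILE
          value: "[identifier]"  # the "default" profile applies when this is unset
```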
> A cluster profile is specified to the CVO as an identifier in, either:
> * an environment variable,
> * the `status.ClusterProfile` properties in `ClusterVersion`.
I don’t see a reason to do this because it exposes the wrong behavior for clients to trigger on. What are the use cases that you would read this value (ie, why are you trying to do this)? What characteristics of the profile are you trying to react to?
This solution was a reaction to the pushback from the installer team. See openshift/installer#3986 (comment)
I don't want to react to this profile. This was just a means to pass the cluster profile without having to change the installer:
1. `openshift-install create manifests`
2. patch the CV
3. `openshift-install create cluster`
Of course, I would be happier if I could pass it directly in the install-config.yaml.
An argument for this new property is that it can be used in insights and telemetry.
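To make the "patch the CV" step concrete, a rough sketch of a manifest that could be dropped into the directory produced by `openshift-install create manifests`, assuming the `status.ClusterProfile` property proposed above. The field name, its placement, and the file itself are assumptions taken from the proposal under discussion, not a committed API:

```yaml
# Hypothetical ClusterVersion patch manifest. The clusterProfile property
# mirrors the status.ClusterProfile proposal above (later dropped); only
# apiVersion, kind, and metadata.name reflect the existing API.
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
status:
  clusterProfile: "[identifier]"
```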
is this focused on the CRC case at the moment? it is disabling insights and telemetry by default....
We will need to use the cluster profile for #504 in 4.8, so it's not only CRC-oriented.
There needs to be a way for the CVO to know which profile to apply in a self-hosted installation. Since the CVO is part of the payload, there isn't a non-API alternative that we see. This value will exist somewhere in the API and our only choice is about where that location is.
This value is intended for use by the ClusterVersionOperator, not by other operators: that's why it is on the ClusterVersion, not on something like Infrastructure. But in a self-hosted cluster, that intent has to be preserved somewhere in the API; there isn't an external source of trust as there is in cases like externally managed clusters. Separately, I do expect that both CRC and SingleNodeProduction are going to want divergent operator behavior, but this field isn't intended to describe that. It is narrowly focused on selecting which bits of the payload to apply.
The CVO is running code that generates its own next config - that's just as easy to do with an env var as an API from the deployment/pod that creates itself?
I'm not opposed to the CVO having data it itself manages, but I am super concerned about people abusing that. It's hard to have an API field whose godoc says "You aren't allowed to use this"; it's easy if the field doesn't exist. We've had similar problems with other "non API" constructs, so I'm just thinking about preventing the problem altogether.
An env var is not that easy! The first time we introduce it, the outgoing CVO will not know about it and won't be able to templatize the manifest. It forces us to add the cluster profile to the struct in one release, and use it in the CVO manifest in a second release. We can work around that, but it is still not easy. Tried here: openshift/cluster-version-operator@e5ccafa, Update CI job:
We do this fairly frequently for other things. You could do it and backport it to older releases. And then we just require a higher Z version to upgrade. We do it often enough that I would say it's just as easy to me as adding a new API field, without the downsides of people being able to see what the profile is.
Should I change this enhancement to remove this field from ClusterVersion but add it to InstallConfig? https://github.com/openshift/installer/blob/master/pkg/types/installconfig.go#L62
Is it not already added to install config? How were you planning on telling the installer the profile of the cluster?
Installer team pushed back on this last July. openshift/installer#3986
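For context, the kind of install-config.yaml addition being discussed (and pushed back on) might look like the following sketch; the `clusterProfile` key is hypothetical and does not exist in the installer today:

```yaml
# Hypothetical install-config.yaml excerpt. The clusterProfile key does not
# exist; it only illustrates the InstallConfig addition discussed above.
apiVersion: v1
baseDomain: example.com
metadata:
  name: my-cluster
platform:
  none: {}
clusterProfile: "[identifier]"  # purely illustrative
```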
I think this proposal should include "how does a user specify which profile they want" and the details of that implementation. Setting an env var on the initial deployment is just as valid as a field on the API or a configmap. The profile env var being supported throughout upgrade is already a requirement for hypershift, so the CVO continuing to broaden support for it is the current path. If we have reasons not to expose the profile to other operators (except perhaps as role-specific variables), then the proposal would be to ensure that the CVO properly preserves the profile value via env in all three modes (hypershift, self-hosted HA, and self-hosted singleton), and any arguments about why that won't work should be assessed (as we were doing).
(branch force-pushed from cdd1e36 to f4c4e26)
I updated the enhancement to include 2 design proposals:
I hope I covered everything.
My initial reaction is that the following:
1. `openshift-install create manifests`
2. patch the CV
3. `openshift-install create cluster`
is a reasonable step to unlock CRC scenarios without complicating the end-user facing API while we learn/iterate. the install-config is not pertinent for the ibm cloud scenario, so i feel like that path gives us the ability to innovate without being stuck to a given API.
> Cluster profile will be set like this env. variable.
> Upgrade will have to preserve the initial cluster profile.
> ###### Hypershift
i would exclude this section.
are you really more interested in capturing how CRC can signal to openshift-install? if so, let's focus on that scenario, but hypershift is not pertinent here per the ibm cloud comment above.
while we do phase 0 of the crc related enhancement, is the patch step enumerated above sufficient to make improved incremental progress without committing to an externally user-facing API? it seems like a reasonable first step to me. @csrwng can you comment as well?
@derekwaynecarr the issue with patching a manifest after the installer creates manifests is that there really is no good manifest to patch (they don't include the CVO deployment). The CVO deployment is generated by the bootkube script once the install is under way (https://github.com/openshift/installer/blob/ad31070e5a34ea92ce9792b2832fcbf313ec9832/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template#L56-L61). One possibility is to have the installer react to an env var for the profile, which it then passes to the cvo render invocation. At least that would not require a change to install-config.yaml. @guillaumerose was something like that explored with the installer team?
Sorry, I missed the part about patching the CV. If we patch the CV, it means that the CV would have to expose the profile, which I thought was something we wanted to avoid initially.
This is a great idea. It keeps the cluster profile hidden, and it is a good step forward. The installer team didn't explore that. As discussed with @csrwng, instead of spreading the change over two releases, we can have 2 sets of manifests: some with `{{ .ClusterProfile }}` and some without, just for the upgrade.
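A sketch of what the templated variant could look like, assuming the `{{ .ClusterProfile }}` variable mentioned above; everything around the env entry is illustrative:

```yaml
# Hypothetical templated excerpt of the incoming CVO Deployment manifest.
# A second variant without the templated env entry would be kept so that an
# outgoing CVO that does not know the ClusterProfile template variable yet
# can still apply the manifest during upgrade.
spec:
  template:
    spec:
      containers:
      - name: cluster-version-operator
        env:
        - name: CLUSTER_PROFILE
          value: "{{ .ClusterProfile }}"
```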
(branch force-pushed from f4c4e26 to 5219771)
I removed the proposal with the new field in ClusterVersion. Let's now focus on what to change in the installer. 2 proposals:
Thanks.
Let's explore 2 use cases that we should address here:
In both cases it comes down not to a profile, but to a deployment type (which is derived from the profile). So I assume we plan that each component will have different types of manifests, depending on the profile 'grouping'. That will address the runtime mode (for example, a different CEO manifest that has 'DEPLOYMENT=NonHA' as an environment variable for the crc/sno profiles). So we can assume that the operators should not read the profile type from the cluster.
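To illustrate the point about profile-derived deployment characteristics, a sketch of such a profile-specific manifest variant; the operator name, namespace, and the `DEPLOYMENT=NonHA` value come from the comment above, but the exact shape is an assumption:

```yaml
# Hypothetical profile-specific variant of the cluster-etcd-operator
# Deployment, shipped in the payload only for the crc/sno profile grouping.
# The operator reacts to DEPLOYMENT=NonHA; it never reads the profile itself.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: etcd-operator
  namespace: openshift-etcd-operator
spec:
  template:
    spec:
      containers:
      - name: operator
        env:
        - name: DEPLOYMENT
          value: NonHA
```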
Very good point. Should we say that all components that can render manifests can take
> When upgrading, outgoing CVO will forward the cluster profile information to the incoming CVO with the environment variable.
> `include.release.openshift.io/[identifier]=true` would make the CVO render this manifest only when `CLUSTER_PROFILE=[identifier]`
@derekwaynecarr is `include.release.openshift.io` something we claimed as API? I see this used for IBM and for high availability.
@mfojtik yes
They are not used once the bootstrap is completed,
Ideally, the render command would be able to determine the desired state from the bootstrap node. All conditional decisions about the cluster profile would be made during render. So as long as we can determine the current profile in bootkube, we shouldn't have any problems.
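For illustration, a sketch of a payload manifest gated on a profile with the annotation pattern quoted above; the object kind, names, and the identifier are assumptions:

```yaml
# Hypothetical payload manifest gated on a cluster profile. The annotation
# key follows the include.release.openshift.io/[identifier] pattern quoted
# above; the CVO would render this manifest only when CLUSTER_PROFILE
# matches [identifier]. Name and namespace are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-profile-gated-config
  namespace: openshift-config
  annotations:
    "include.release.openshift.io/[identifier]": "true"
data:
  enabled: "true"
```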
For the bootstrap process, I read that we will introduce a new CRD for that (cluster-config). So we are good with the current design.
@derekwaynecarr @smarterclayton @wking is the current design OK?
(branch force-pushed from 5219771 to 9f13f63)
Here is my third proposal. The bootkube.sh script checks whether a specific manifest file exists. If the file exists, the script adds the contents of that file as the CLUSTER_PROFILE environment variable for the bootstrap CVO pod. This makes it something a user must do deliberately, as a particular step in the installation process.
> ```
> }
> ```
> Option 2:
option 2 is preferred as we iterate/learn, especially while the bootstrap host is still required.
I prefer option 2 as well.
Using the profile during install should trigger a warning. Also, it should be prefixed the same way other install flags are, e.g. OPENSHIFT_INSTALL_EXPERIMENTAL_CLUSTER_PROFILE.
Also note we'll have to add this env to cluster-bot once we agree on the flow.
And what will the user need to do to create that manifest?
(branch force-pushed from 9f13f63 to b4a5ef9)
I removed option 1 and set the proper name for the env variable in the installer. I also changed the release phases a bit: if we only add
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: eparis, guillaumerose.
As required per openshift/api#782 (comment) and openshift/cluster-version-operator#404 (comment).
This is needed to cover all the needs of the CVO and the usage of Cluster Profile with the installer.