Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated Workflow for Agent-based Installer #1067

Conversation

dhellmann
Copy link
Contributor

"Cluster 0" deployments, for the first cluster in an environment, are
unique because they occur in situations where there may not be a lot of
other infrastructure to support long-running services normally associated
with automated provisioning for on-premises use cases. Nevertheless,
users and partners want to automate these deployments. The assisted
installer GUI provides an excellent user experience for manually deploying
clusters. This enhancement covers the automation use case.

/priority important-soon

@openshift-ci openshift-ci bot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Mar 21, 2022
@openshift-ci openshift-ci bot requested review from runcom and stbenjam March 21, 2022 21:20
@dhellmann
Copy link
Contributor Author

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 21, 2022

@dhellmann: GitHub didn't allow me to request PR reviews from the following users: julienlim.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @avishayt @celebdor @lranjbar @mhrivnak @patrickdillon @pawanpinjarkar @rwsu @julienlim

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from mhrivnak March 21, 2022 21:23
@dhellmann
Copy link
Contributor Author

I know @rwsu is out this week, so I'll place a hold to ensure he has time to double check that I've captured the details of what he's already done in the prototype correctly.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 21, 2022
@dhellmann
Copy link
Contributor Author

dhellmann commented Mar 21, 2022

/cc @staebler

let me know if I should add you to the formal reviewer list for this, I'd be happy to do that.

@staebler
Copy link
Contributor

/cc @staebler

let me know if I should add you to the formal reviewer list for this, I'd be happy to do that.

You don't need to add me to the formal review list, but I will add my review.

Copy link
Contributor

@staebler staebler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial overall reaction to this proposal is that I am surprised at the extent of the input needed for the create image command.

@openshift-ci openshift-ci bot requested a review from cybertron March 22, 2022 09:46
andfasano added a commit to andfasano/installer that referenced this pull request Apr 26, 2022
… installer

This patch adds the initial project setup for the agent installer, as described in the OpenShift enhancement openshift/enhancements#1067, to be managed by the agent team
Copy link
Member

@zaneb zaneb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks basically ready to merge to me. I'll give it another day or two before approving.

version management techniques that we do for the APIs within a
cluster. We will use the `install-config.yaml` format for
cluster-wide settings. We will use a new file, tentatively named
`agent-config.yaml`, for settings that apply to only agent-based
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the agent-config file is to provide a place to add config that otherwise might be in the install-config but is specific to the agent-based installation method, because it needs to be configured in the image before installation starts. I'd expect this to be very small, but potential candidates we've identified are:

  • node0 selection
  • root disk multipathing
  • network config (exists in install-config for baremetal platform, but not for others)
  • hostnames

Some of these have equivalents in the ZTP CRDs, some do not, but all of them are painful to construct by hand.

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
We will develop in a feature branch, but that detail isn't relevant to
the design in the way that a feature flag would be, so it's not
mentioned in the enhancement now that we've answered the question.

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
Copy link
Contributor

@patrickdillon patrickdillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of the new changes, agent-config.yaml seems to have gotten more concrete and LGTM.

Everything here /lgtm from an installer perspective.

@dhellmann
Copy link
Contributor Author

The enhancement template was restructured this week. The Drawbacks section is present in this doc, but does not use the same heading level so the linter is failing.

/override ci/prow/markdownlint

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 28, 2022

@dhellmann: Overrode contexts on behalf of dhellmann: ci/prow/markdownlint

In response to this:

The enhancement template was restructured this week. The Drawbacks section is present in this doc, but does not use the same heading level so the linter is failing.

/override ci/prow/markdownlint

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dhellmann
Copy link
Contributor Author

We agreed in the cluster lifecycle architecture call yesterday to put out a last call for comments on this enhancement, with a goal of merging it by the end of the day on 3 May.

any steps for which the installer is not responsible. Those steps may
also be performed by **cluster creator** if no software component is
available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what Those steps may also be performed by **cluster creator** if no software component is available. means. Do you mean the steps of deployment can be handled manually if no automation exists?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it means you can boot the ISO manually if your are so inclined.

@LalatenduMohanty
Copy link
Member

The PR looks good to me.

@LalatenduMohanty
Copy link
Member

@dhellmann I did not add lgtm label because I am not in the review list. But if you want to add add it then let me know.

@dhellmann
Copy link
Contributor Author

@dhellmann I did not add lgtm label because I am not in the review list. But if you want to add add it then let me know.

Thanks. A simple "lgtm" comment from anyone is fine. Zane can verify that all of the reviewers have left one and add both labels at once when this is ready to merge.


## Proposal

We propose to extend the `openshift-install` binary with a new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some discussion has been happening in a separate FAQ document. An important point came up about validations for optional operators that I think is relevant here. Paraphrasing, it was expressed that pre-installation validations that are specific to operators that handle storage, virt, etc. are more acceptable for the agent-based installation that is available on console.redhat.com, but would be less desirable for an installer that is shipped inside an openshift release due to coupling of requirements across products.

I think that's an important issue to address, especially if deciding to deliver this new install capability through the openshift-install binary is going to limit its value to users. Here was my comment:

The main reason that a small number of operators are selectable on console.redhat.com is for the pre-install validations. If the hardware specs are not sufficient to run ODF, virt, etc, we want to tell the user upfront rather than let them go through a cluster installation, try to install and use the operator (disconnected operator installation is no fun...), and only then fail after all that time and work. The value of ensuring user success upfront was determined to be worth the price of maintaining a small list of validations for a few specific operators. IOW a small bit of coupling improves the UX significantly.

Since a primary use case here is to enable a user who wants to establish a local point of management within their disconnected environment, validating their complete intent upfront seems to make sense. If putting this tool "inside OCP" precludes that sort of thinking, maybe that's not the best delivery model for the use case.

Were other delivery models considered and/or ruled out? What about shipping this as part of multi-cluster engine, where it could provide a complete story for creating a first local point of management? Or maybe we can still provide that complete story even while shipping as part of the OCP payload?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was an early decision that we captured in an internal document but didn't repeat here. It basically came down to wanting to integrate the various tools, rather than create another standalone thing. I'll share that doc with you.

considered another layer of control for the cluster creator, just as
the manifests written by the `create manifests` sub-command are. The
cluster creator will be able to generate those intermediate files with
the sub-command `agent create-manifests`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should block on this, but we will probably want to find another term since the manifests created by agent create manifests have a wholly different role to those created by the regular create manifests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's work that out with the installer team when we write that command and then come back and update this enhancement.

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
@dhellmann
Copy link
Contributor Author

/override ci/prow/markdownlint

The same linter error from the template update.

enhancements/agent-installer/automated-workflow-for-agent-based-installer.md missing "### Drawbacks"

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 3, 2022

@dhellmann: Overrode contexts on behalf of dhellmann: ci/prow/markdownlint

In response to this:

/override ci/prow/markdownlint

The same linter error from the template update.

enhancements/agent-installer/automated-workflow-for-agent-based-installer.md missing "### Drawbacks"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 3, 2022

@dhellmann: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@zaneb
Copy link
Member

zaneb commented May 4, 2022

The appointed hour passed yesterday and there have been no further comments today, so let's go ahead and merge this so the team is clear on the direction.
/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 4, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 4, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zaneb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 4, 2022
@openshift-merge-robot openshift-merge-robot merged commit a9d7aec into openshift:master May 4, 2022
thejasn pushed a commit to thejasn/enhancements that referenced this pull request May 10, 2022
* Automated Workflow for Agent-based Installer

Summary

"Cluster 0" deployments, for the first cluster in an environment, are
unique because they occur in situations where there may not be a lot
of other infrastructure to support long-running services normally
associated with automated provisioning for on-premises use
cases. Nevertheless, users and partners want to automate these
deployments. The assisted installer GUI provides an excellent user
experience for manually deploying clusters. This enhancement covers
the automation use case.

* use 3 level command structure

change `agent wait-for-install-complete` to `agent wait-for install-complete`

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>

* highlight that this is case relies on user-provided automation

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>

* remove open question about feature flag

We will develop in a feature branch, but that detail isn't relevant to
the design in the way that a feature flag would be, so it's not
mentioned in the enhancement now that we've answered the question.

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>

* remove qualifier in non-goal for collecting telemetry

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>

* don't say "minimum viable" version

* replace hyphen with space in installer sub-command names

Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
lranjbar pushed a commit to lranjbar/installer that referenced this pull request May 28, 2022
… installer

This patch adds the initial project setup for the agent installer, as described in the OpenShift enhancement openshift/enhancements#1067, to be managed by the agent team
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.