Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a machineconfig for IPI pointer ignition customizations #4413

Merged

Conversation

hardys
Copy link
Contributor

@hardys hardys commented Nov 24, 2020

In the case where an IPI user customizes the pointer config, this
config is only persisted via the user-data secret, which was an
issue when moving to an MCO templated pointer config:

#4228

It was also a problem for attempts for IPI baremetal to consume
the MCO rendered config directly, with the aim of enabling
network configuration via MachineConfig:

#3589

With this new approach, we detect the case where a change has
been made to the pointer config by the user, and in that case
it is stored via a MachineConfig manifest, such that any
customizations are reflected in the MCO rendered config.

Implements: openshift/enhancements#540
Co-Authored-By: Steven Hardy shardy@redhat.com

@hardys
Copy link
Contributor Author

hardys commented Nov 24, 2020

/cc @stbenjam @cgwalters @runcom

@hardys hardys force-pushed the ignition-update-check branch 2 times, most recently from 484c6b2 to 9ce7539 Compare November 25, 2020 18:35
@hardys
Copy link
Contributor Author

hardys commented Nov 26, 2020

/retest

Copy link
Contributor

@staebler staebler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Is the expectation at this point that the user-modified ignition config will still be the one placed on the machine? I assume that the fields in the MachineConfig duplicating the user-supplied fields of the ignition config will just merge together without issue.
  2. There is an outstanding bug regarding worker machines coming up with only some of the MachineConfigs, specifically without the MachineConfig for the image content sources. This can cause the worker machines to get stuck and never completing. Do you think that moving user-supplied configuration to a MachineConfig will exacerbate this issue?

pkg/asset/ignition/machine/masterignitioncheck.go Outdated Show resolved Hide resolved
pkg/asset/ignition/machine/masterignitioncheck.go Outdated Show resolved Hide resolved
@hardys
Copy link
Contributor Author

hardys commented Dec 1, 2020

Thanks for the review @staebler - I'll address the name/Load issues, reply to questions below:

Is the expectation at this point that the user-modified ignition config will still be the one placed on the machine? I assume that the fields in the MachineConfig duplicating the user-supplied fields of the ignition config will just merge together without issue.

Initially yes, the existing pointer ignition will be used as before, and yes my testing indicates any duplicate fields get merged together by the MCO ignition merge logic without issue.

The plan is for baremetal IPI to switch to consuming the rendered config instead and no longer use the pointer config - see #4427

Also I'm expecting the work that was reverted around the MCO managed pointer ignition to be restored (not sure if that's on the radar for 4.7 though @runcom ?) ref #4228 - then any user customizations would only be provided via the MachineConfig, since the MCO pointer config is statically templated.

There is an outstanding bug regarding worker machines coming up with only some of the MachineConfigs, specifically without the MachineConfig for the image content sources. This can cause the worker machines to get stuck and never completing. Do you think that moving user-supplied configuration to a MachineConfig will exacerbate this issue?

Yes this could potentially be an issue, since we'll read the rendered config slightly earlier (in the machine API actuator just before building the machine, vs after the first-boot of the machine) openshift/cluster-api-provider-baremetal#127

We'll have to test and confirm if that race impacts things, and potentially we could add some workaround to poll for a stable rendered config in the actuator until that bug gets fixed.

I don't think that needs to block this PR but thanks for the info - I was not aware of that bug yet.

In the case where an IPI user customizes the pointer config, this
config is only persisted via the user-data secret, which was an
issue when moving to an MCO templated pointer config:

openshift#4228

It was also a problem for attempts for IPI baremetal to consume
the MCO rendered config directly, with the aim of enabling
network configuration via MachineConfig:

openshift#3589

With this new approach, we detect the case where a change has
been made to the pointer config by the user, and in that case
it is stored via a MachineConfig manifest, such that any
customizations are reflected in the MCO rendered config.

Implements: openshift/enhancements#540
Co-Authored-By: Steven Hardy <shardy@redhat.com>
@hardys
Copy link
Contributor Author

hardys commented Dec 1, 2020

/retest

Copy link
Contributor

@staebler staebler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look fine.

My last concern is around the user experience with an installed cluster. Since we are pushing the user-supplied changes as MachineConfigs, it may give the user the impression that they can then edit or remove those MachineConfigs. However, with the changes still present in the pointer ignition configs, removing the MachineConfigs will have no effect.

@hardys
Copy link
Contributor Author

hardys commented Dec 1, 2020

My last concern is around the user experience with an installed cluster. Since we are pushing the user-supplied changes as MachineConfigs, it may give the user the impression that they can then edit or remove those MachineConfigs. However, with the changes still present in the pointer ignition configs, removing the MachineConfigs will have no effect.

Yes that's true, but my understanding is that when the MCO managed pointer ignition changes are restored (ref #4228 and openshift/machine-config-operator#2126) we won't expect these customizations to exist in pointer configs, since it's a static template managed via the MCO.

In that model IMO it's clearer to have any customizations reflected via MachineConfigs, and in fact some changes to those configs would be possible day-2 without the "hidden" config that potentially exists in the user-data secret today?

In the enhancement discussions @cgwalters mentioned that for IPI deployments we shouldn't support manually customizing the Ignition config, only injecting MachineConfig - I see this solution as a step towards that, e.g it happens internally now, then if we want to we can warn users to use MachineConfig directly in future?

It'd be good to get some feedback from @runcom and @cgwalters to confirm my understanding above.

@staebler
Copy link
Contributor

staebler commented Dec 1, 2020

/approve
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 1, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 1, 2020
@cgwalters
Copy link
Member

In the enhancement discussions @cgwalters mentioned that for IPI deployments we shouldn't support manually customizing the Ignition config, only injecting MachineConfig - I see this solution as a step towards that, e.g it happens internally now, then if we want to we can warn users to use MachineConfig directly in future?

Yeah, I think this is going to effectively only be used by current baremetal IPI. Does BMIPI currently wrap openshift-install? If so perhaps we could set an environment variable to turn this on explicitly just for that?

Since we are pushing the user-supplied changes as MachineConfigs, it may give the user the impression that they can then edit or remove those MachineConfigs. However, with the changes still present in the pointer ignition configs, removing the MachineConfigs will have no effect.

That's a good point. I think in practice though the users won't want to change or delete this stuff because it's e.g. core networking bonding configuration.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@stbenjam
Copy link
Member

stbenjam commented Dec 2, 2020

In the enhancement discussions @cgwalters mentioned that for IPI deployments we shouldn't support manually customizing the Ignition config, only injecting MachineConfig - I see this solution as a step towards that, e.g it happens internally now, then if we want to we can warn users to use MachineConfig directly in future?

Yeah, I think this is going to effectively only be used by current baremetal IPI. Does BMIPI currently wrap openshift-install? If so perhaps we could set an environment variable to turn this on explicitly just for that?

We don’t wrap it. Users can just invoke the installer. There’s a separate baremetal installer as part of the release payload, but only because we have to currently build it with libvirt and cgo.

@openshift-merge-robot
Copy link
Contributor

@hardys: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-crc c976be9 link /test e2e-crc
ci/prow/e2e-aws-fips c976be9 link /test e2e-aws-fips

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit dcceaa6 into openshift:master Dec 2, 2020
@hardys
Copy link
Contributor Author

hardys commented Dec 2, 2020

Yeah, I think this is going to effectively only be used by current baremetal IPI. Does BMIPI currently wrap openshift-install? If so perhaps we could set an environment variable to turn this on explicitly just for that?

We don't wrap the installer, and IMO this does provide value beyond BM IPI - it's a requirement for restoring the MCO managed pointer config work from @runcom so we don't break ignition customizations for IPI users on any platform, was there some other plan to deal with that as it wasn't mentioned on the enhancement?

@runcom
Copy link
Member

runcom commented Dec 2, 2020

We don't wrap the installer, and IMO this does provide value beyond BM IPI - it's a requirement for restoring the MCO managed pointer config work from @runcom so we don't break ignition customizations for IPI users on any platform, was there some other plan to deal with that as it wasn't mentioned on the enhancement?

there wasn't afaict, the reverted work that I did isn't required for this tho (I want to mention it because it seemed we're doing that in order to restore that but this can go w/o that)

@cgwalters
Copy link
Member

it's a requirement for restoring the MCO managed pointer config work from @runcom so we don't break ignition customizations for IPI users on any platform, was there some other plan to deal with that as it wasn't mentioned on the enhancement?

I don't think we need to extensively debate this in a closed PR but bigger picture, I don't agree that we need this for MCO managed pointer config.

Broadly speaking I still see just two classes of usage for the pointer config:

  • UPI: (particularly with per-machine customizations) Here MCO cannot change it, and per discussion above the user probably can't/shouldn't either
  • IPI: Completely managed by the MCO and should not be edited by the user at all

In particular the "completely managed by MCO" case is necessary for things like openshift/enhancements#443 as well as openshift/enhancements#201

It's just current BMIPI which is creating this special case of "edited by user in IPI scenario" I think.

hardys pushed a commit to hardys/installer that referenced this pull request Dec 2, 2020
I missed updating this in openshift#4413 which introduced a new
asset
hardys pushed a commit to hardys/installer that referenced this pull request Dec 2, 2020
I missed updating this in openshift#4413 which introduced a new
asset

Generated via:
bin/openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg
hardys pushed a commit to hardys/installer that referenced this pull request Dec 2, 2020
This restores the work which was previously done via openshift#3276
but then reverted via openshift#3589 due to breaking users who customized
the pointer ignition config in IPI deployments.

A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details.

Note that some additional changes beyond the initial implementation
were required due to the MCO now supporting multiple
ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46

Co-Authored-By: Steven Hardy <shardy@redhat.com>
hardys pushed a commit to hardys/installer that referenced this pull request Dec 2, 2020
This restores the work which was previously done via openshift#3276
but then reverted via openshift#3589 due to breaking users who customized
the pointer ignition config in IPI deployments.

A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details.

Note that some additional changes beyond the initial implementation
were required due to the MCO now supporting multiple
ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46

Co-Authored-By: Steven Hardy <shardy@redhat.com>
@cgwalters
Copy link
Member

cgwalters commented Dec 3, 2020

Hmm I just did a libvirt 4.7 install and I see this show up. I definitely didn't hand-edit the pointer configs, and it's basically empty:

walters@toolbox /v/s/w/s/g/o/installer (s3-explicit-region)> oc get -o yaml machineconfig/99-installer-ignition-master
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-12-03T15:12:24Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: master
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:config:
          .: {}
          f:ignition:
            .: {}
            f:security:
              .: {}
              f:tls:
                .: {}
                f:certificateAuthorities: {}
            f:version: {}
        f:extensions: {}
        f:fips: {}
        f:kernelArguments: {}
        f:kernelType: {}
        f:osImageURL: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-12-03T15:12:24Z"
  name: 99-installer-ignition-master
  resourceVersion: "1362"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-installer-ignition-master
  uid: 426c1ad2-d4e2-485e-bd39-52abe2919b76
spec:
  config:
    ignition:
      security:
        tls:
          certificateAuthorities:
          - source: data:text/plain;charset=utf-8;base64,LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFRENDQWZpZ0F3SUJBZ0lJUkM1U3JjMW1LNlV3RFFZSktvWklodmNOQVFFTEJRQXdKakVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1SQXdEZ1lEVlFRREV3ZHliMjkwTFdOaE1CNFhEVEl3TVRJd016RTFNRFUxTTFvWApEVE13TVRJd01URTFNRFUxTTFvd0pqRVNNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUkF3RGdZRFZRUURFd2R5CmIyOTBMV05oTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUF5cGRaSTdUM1NobHkKdUkwSEhhdk9rcDQzK0cxRzFTaGFxQ3ppUG02cVJPUzlrQTlQZThwL2RHcFJjdGptWWwyc2MwU2JNakN5SHBQeQpid1VkRG1Qc2V5dVJWZEdjU3BFWlc1QXBmM3ptbnZWMnpmbEs2dGR5azJ6SlRmSzMrTGZ6bE1iOWdQbjI2bllZClNYeFowQ3NwalFMSHBrZHRZYzhCb1AvcE9RQU54TjFqSTZDRUlnRXVaaGNMdW93MytsQjlJdmlpYTRqSWdWcFIKeWlZUE5lZkFhYXBTRGhKcGFwTFhtSWFTN3Z2aDUwcVdrZzF2RlkvQzNNczFjMzdkZ0FMS3I3QUJvQWtJWG13UApCUUZUdTFTS25OL1d5LzJUK1FkSWtLRU9nUmhDUFRxL01xTGVaaUZqREt2RU5qV0Nudi9qbWsrYy9pMVNVQ3lkCnllaCtZeVZPWndJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUTRFRmdRVWlIYzJwcFR2NzQzQVZ5MXRKRk92bXI4aTZ1VXdEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUtvMXdna1AwWGFGZTZRb3NIUTlWWTNQWno0bFlmYW9aL3ljVWNzaCs0TUdwZktQN2toQzNQSUJ3cGZmCm9VNmltZWZDWXNDSkszdk4xTEpNaXRnY0RrT2dlT0xIT29EQUNVVWZQWXl3YXU5aWVwMTdMVmtiMlExL2RKTDQKUHc4OVFFTmZOQVJ5VjBabzh4ODNPdVZXLzNDTFhQMk1OUDBndXNqVzd5TTdaVEkxMTJ6VXl3Q3QwVXE0ZEluZQpJbVczUHNrT2VQelphcDVKWGlrOXF3U2tCN1dSNDFnQW13TGVrSXk0Q0FWcUVDTk1FQWhaT2pBVWl0QUJYRzlvCkEwZ09OS05JenpHUmVCWWpEQWF0OWRqdVpRRW91aEZ1UnBobHF5Nk9reWZNNUNsaUdNUW9zZkRaN25uVUh5UmYKakxmcmNZVDBHNExpSzJxeWdobjJkbHdpeUVrPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
      version: 3.1.0
  extensions: null
  fips: false
  kernelArguments: null
  kernelType: ""
  osImageURL: ""
walters@toolbox /v/s/w/s/g/o/installer (s3-explicit-region)> 

@cgwalters
Copy link
Member

cgwalters commented Dec 3, 2020

Yeah this is happening across the board, digging into e.g. this 4.7 nightly in the e2e logs for a stock AWS IPI install I see:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.7/1334471421937586176/artifacts/e2e-aws/installer/.openshift_install.log

time="2020-12-03T12:25:59Z" level=info msg="Master pointer ignition was modified. Saving contents to a machineconfig"

@hardys
Copy link
Contributor Author

hardys commented Dec 3, 2020

Yeah this is happening across the board, digging into e.g. this 4.7 nightly in the e2e logs for a stock AWS IPI install I see:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.7/1334471421937586176/artifacts/e2e-aws/installer/.openshift_install.log

time="2020-12-03T12:25:59Z" level=info msg="Master pointer ignition was modified. Saving contents to a machineconfig"

Thanks @cgwalters evidently we need a better test e.g https://github.com/openshift/installer/blob/master/pkg/asset/ignition/machine/master_ignition_customizations.go#L48 /cc @kirankt

Unless it's causing issues for other platforms, should we raise a bz to track that and we'll find a fix ASAP?

@cgwalters
Copy link
Member

Yep, a BZ sounds fine, I don't think we need to revert or anything. There shouldn't be any harm from this, but let's get it fixed for 4.7 final.

hardys pushed a commit to hardys/installer that referenced this pull request Dec 4, 2020
An issue was spotted after openshift#4413 landed, the config comparison
always evaluates to true, so we're generating empty machineconfig
objects even when no customization has happened.

Update to compare a json representation of the configs, and add
a test to confirm all works as expected.
@hardys
Copy link
Contributor Author

hardys commented Dec 4, 2020

Yep, a BZ sounds fine, I don't think we need to revert or anything. There shouldn't be any harm from this, but let's get it fixed for 4.7 final.

I pushed #4455 - will raise a bz if it doesn't merge before FF so we can be sure it's resolved before final 4.7

hardys pushed a commit to hardys/installer that referenced this pull request Dec 4, 2020
An issue was spotted after openshift#4413 landed, the config comparison
always evaluates to true, so we're generating empty machineconfig
objects even when no customization has happened.

Update to compare a json representation of the configs, and add
a test to confirm all works as expected.
michaelgugino pushed a commit to mgugino-upstream-stage/installer that referenced this pull request Dec 4, 2020
I missed updating this in openshift#4413 which introduced a new
asset

Generated via:
bin/openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg
michaelgugino pushed a commit to mgugino-upstream-stage/installer that referenced this pull request Dec 4, 2020
This restores the work which was previously done via openshift#3276
but then reverted via openshift#3589 due to breaking users who customized
the pointer ignition config in IPI deployments.

A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details.

Note that some additional changes beyond the initial implementation
were required due to the MCO now supporting multiple
ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46

Co-Authored-By: Steven Hardy <shardy@redhat.com>
hardys pushed a commit to hardys/installer that referenced this pull request Dec 5, 2020
An issue was spotted after openshift#4413 landed, the config comparison
always evaluates to true, so we're generating empty machineconfig
objects even when no customization has happened.

Update to compare a json representation of the configs, and add
a test to confirm all works as expected.
hardys pushed a commit to hardys/installer that referenced this pull request Dec 7, 2020
An issue was spotted after openshift#4413 landed, the config comparison
always evaluates to true, so we're generating empty machineconfig
objects even when no customization has happened.

Update to compare a json representation of the configs, and add
a test to confirm all works as expected.
hardys pushed a commit to hardys/installer that referenced this pull request Dec 9, 2020
An issue was spotted after openshift#4413 landed, the config comparison
always evaluates to true, so we're generating empty machineconfig
objects even when no customization has happened.

Update to compare a json representation of the configs, and add
a test to confirm all works as expected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants