Skip to content
This repository has been archived by the owner on Jul 23, 2019. It is now read-only.

Review of install-config structure #57

Closed
russellb opened this issue Apr 24, 2019 · 25 comments · Fixed by #93
Closed

Review of install-config structure #57

russellb opened this issue Apr 24, 2019 · 25 comments · Fixed by #93

Comments

@russellb
Copy link
Member

I'd like to discuss any changes we might want to make to the baremetal platform config in the install-config.yaml file before we submit our changes back to openshift/installer. When looking over the current structure, I had a few thoughts:

  1. We use the name nodes, which matches Ironic terminology, but is confusing for OpenShift, where a Node is something else. I suggest we use hosts or even bareMetalHosts, aligning with the Kubernetes objects that this info relates to.

  2. We have master_nodes. Note that openshift-install dropped the use of masters and workers terminology. Instead, it now uses compute and controlPlane. We should align with that.

  3. There is information about a given host listed in different sections. For example, openshift-master-0 is listed 4 times. Could we restructure this so that all information about openshift-master-0 is together under a single heading?

  4. Finally, I think for the purposes of this config, Ironic should be considered an implementation detail. I wonder if we should do anything to hide Ironic-isms. For example, I'm looking at driver, management_interface, power_interface, and vendor_interface. Maybe that could be simplified with a single option like managementType: ipmi. We could define additional management types over time that map to a set of options on the backend.

For reference, here is a copy of an existing install-config.yaml:

apiVersion: v1beta4
baseDomain: test.metalkube.org
metadata:
  name: ostest
compute:
- name: worker
  replicas: 1
controlPlane:
  name: master
  replicas: 3
platform:
  baremetal:
    nodes:
      master_nodes:
        openshift-master-0:
          name: openshift-master-0
          port_address: "00:bd:3e:96:4f:3d"
          driver: "ipmi"
          management_interface: "ipmitool"
          power_interface: "ipmitool"
          vendor_interface: "no-vendor"
        openshift-master-1:
          name: openshift-master-1
          port_address: "00:bd:3e:96:4f:41"
          driver: "ipmi"
          management_interface: "ipmitool"
          power_interface: "ipmitool"
          vendor_interface: "no-vendor"
        openshift-master-2:
          name: openshift-master-2
          port_address: "00:bd:3e:96:4f:45"
          driver: "ipmi"
          management_interface: "ipmitool"
          power_interface: "ipmitool"
          vendor_interface: "no-vendor"
      properties:
        openshift-master-0:
          local_gb: "50"
          cpu_arch: "x86_64"
        openshift-master-1:
          local_gb: "50"
          cpu_arch: "x86_64"
        openshift-master-2:
          local_gb: "50"
          cpu_arch: "x86_64"
      root_devices:
        openshift-master-0:
          name: "/dev/sda"
        openshift-master-1:
          name: "/dev/sda"
        openshift-master-2:
          name: "/dev/sda"
      driver_infos:
        openshift-master-0:
          ipmi_port: "6230"
          ipmi_username: "admin"
          ipmi_password: "password"
          ipmi_address: "192.168.111.1"
          deploy_kernel:  "http://172.22.0.1/images/ironic-python-agent.kernel"
          deploy_ramdisk: "http://172.22.0.1/images/ironic-python-agent.initramfs"
        openshift-master-1:
          ipmi_port: "6231"
          ipmi_username: "admin"
          ipmi_password: "password"
          ipmi_address: "192.168.111.1"
          deploy_kernel:  "http://172.22.0.1/images/ironic-python-agent.kernel"
          deploy_ramdisk: "http://172.22.0.1/images/ironic-python-agent.initramfs"
        openshift-master-2:
          ipmi_port: "6232"
          ipmi_username: "admin"
          ipmi_password: "password"
          ipmi_address: "192.168.111.1"
          deploy_kernel:  "http://172.22.0.1/images/ironic-python-agent.kernel"
          deploy_ramdisk: "http://172.22.0.1/images/ironic-python-agent.initramfs"
    master_configuration:
      image_source: "http://172.22.0.1/images/rhcos-ootpa-latest.qcow2"
      image_checksum: 0e8e57da5ed9f95c9f4469f2456beec9
      root_gb: 25
pullSecret: ...
sshKey: ...
@dhellmann
Copy link
Member

  1. Finally, I think for the purposes of this config, Ironic should be considered an implementation detail. I wonder if we should do anything to hide Ironic-isms. For example, I'm looking at driver, management_interface, power_interface, and vendor_interface. Maybe that could be simplified with a single option like managementType: ipmi. We could define additional management types over time that map to a set of options on the backend.

Do we need those other interface values at all? The baremetal operator is only setting the driver.

@stbenjam
Copy link
Member

stbenjam commented May 2, 2019

  1. Finally, I think for the purposes of this config, Ironic should be considered an implementation detail. I wonder if we should do anything to hide Ironic-isms. For example, I'm looking at driver, management_interface, power_interface, and vendor_interface. Maybe that could be simplified with a single option like managementType: ipmi. We could define additional management types over time that map to a set of options on the backend.

Do we need those other interface values at all? The baremetal operator is only setting the driver.

I'm not sure why they were added - I think they were added by @derekhiggins when we enabled idrac support in openshift-metal3/dev-scripts@bad8da7#diff-589e5ae70d225e720b2445d3447fe986.

@stbenjam
Copy link
Member

stbenjam commented May 2, 2019

We use the name nodes, which matches Ironic terminology, but is confusing for OpenShift, where a Node is something else. I suggest we use hosts or even bareMetalHosts, aligning with the Kubernetes objects that this info relates to.

This sounds good to me. How about we make the baremetal platform config for the installer just look like the CRD?
https://github.com/metal3-io/baremetal-operator/blob/master/docs/api.md.

There is information about a given host listed in different sections. For example, openshift-master-0 is listed 4 times. Could we restructure this so that all information about openshift-master-0 is together under a single heading?

I believe we can use much more structured data now that @yrobla fixed the issues with using nested data structures, so we can probably use whatever structure we want.

@russellb
Copy link
Member Author

We use the name nodes, which matches Ironic terminology, but is confusing for OpenShift, where a Node is something else. I suggest we use hosts or even bareMetalHosts, aligning with the Kubernetes objects that this info relates to.

This sounds good to me. How about we make the baremetal platform config for the installer just look like the CRD?
https://github.com/metal3-io/baremetal-operator/blob/master/docs/api.md.

Yes! Big +1 to using the CRD here.

@russellb
Copy link
Member Author

russellb commented May 22, 2019

Here's a draft alternative install-config.yaml based on the contents included in the original post. This is to get some feedback on the structure before implementing anything.

Here are the key changes:

  1. Re-use existing data structures for BareMetalHosts and Secrets (for BMC credentials).

  2. Drop a number of Ironic specific configuration items, instead following the lead of the BareMetalHost API and however these things will be exposed there. I imagine we'll have to build in knowledge of a set of hardware profiles into the installer. These could come from a vendored copy of baremetal-operator. We may want to vendor it anyway to get the BareMetalHost data structure.

  3. Drop the master_configuration section. This seems like something the installer should be handling in some other automatic way. I'd probably address it separately from the re-work of the hosts data, though.

  4. I'd like to have the ability to specify the details of more than just the first 3 master hosts. We should be able to specify as many hosts as we know about, and the installer will pick 3 to start with, and register the rest with the cluster. I added a convention of using labels to give hints to the installer about which ones we'd like it to choose as masters, if needed.

  5. (Edit) Updated the original version of this post to include an example of a 4th baremetal host that would get provisioned automatically as a worker.

apiVersion: v1beta4
baseDomain: test.metalkube.org
metadata:
  name: ostest
compute:
- name: worker
  replicas: 1
controlPlane:
  name: master
  replicas: 3
pullSecret: ...
sshKey: ...
platform:
  baremetal:
    secrets:
      - apiVersion: v1
        metadata:
          name: openshift-master-0-bmc-secret
        data:
          username: admin
          password: password
      - apiVersion: v1
        metadata:
          name: openshift-master-1-bmc-secret
        data:
          username: admin
          password: password
      - apiVersion: v1
        metadata:
          name: openshift-master-2-bmc-secret
        data:
          username: admin
          password: password
      - apiVersion: v1
        metadata:
          name: openshift-worker-0-bmc-secret
        data:
          username: admin
          password: password
    bareMetalHosts:
      - apiVersion: metal3.io/v1alpha1
        kind: BareMetalHost
        metadata:
          name: openshift-master-0
          labels:
            role: master
        spec:
          bmc:
            address: ipmi://192.168.111.1:6230
            credentialsName: openshift-master-0-bmc-secret
          bootMACAddress: 00:bd:3e:96:4f:3d
          hardwareProfile: default
      - apiVersion: metal3.io/v1alpha1
        kind: BareMetalHost
        metadata:
          name: openshift-master-1
          labels:
            role: master
        spec:
          bmc:
            address: ipmi://192.168.111.1:6231
            credentialsName: openshift-master-1-bmc-secret
          bootMACAddress: 00:bd:3e:96:4f:41
          hardwareProfile: default
      - apiVersion: metal3.io/v1alpha1
        kind: BareMetalHost
        metadata:
          name: openshift-master-2
          labels:
            role: master
        spec:
          bmc:
            address: ipmi://192.168.111.1:6232
            credentialsName: openshift-master-2-bmc-secret
          bootMACAddress: 00:bd:3e:96:4f:45
          hardwareProfile: default
      - apiVersion: metal3.io/v1alpha1
        kind: BareMetalHost
        metadata:
          name: openshift-worker-0
        spec:
          bmc:
            address: ipmi://192.168.111.1:6233
            credentialsName: openshift-worker-0-bmc-secret
          bootMACAddress: 00:bd:3e:96:4f:48
          hardwareProfile: default

@stbenjam
Copy link
Member

My first impression of this is that it's very verbose and repetitive. Credentials separate from the hosts is kind of awkward to handle. It will require a bunch of code in the installer to parse this structure, although maybe there's some we could re-use. From my perspective, the benefit to using something like the CRD is we would have consistent terminology everywhere. What about something flatter?
Just an idea.

platform:
  baremetal:
    hosts:
      - name: openshift-master-0
        role: master
        spec:
          bmc:
            username: admin
            password: password
            address: ipmi://192.168.111.1:6230
          bootMACAddress: 00:bd:3e:96:4f:3d
          hardwareProfile: default
      - name: openshift-master-1
        role: master
        spec:
          bmc:
            username: admin
            password: password
            address: ipmi://192.168.111.1:6231
          bootMACAddress: 00:bd:3e:96:4f:41
          hardwareProfile: default
      - name: openshift-master-2
        role: master
        spec:
          bmc:
            username: admin
            password: password
            address: ipmi://192.168.111.1:6230
          bootMACAddress: 00:bd:3e:96:4f:45
          hardwareProfile: default
      - name: openshift-worker-0
         spec:
          bmc:
            username: admin
            password: password
            address: ipmi://192.168.111.1:6233
          bootMACAddress: 00:bd:3e:96:4f:48
          hardwareProfile: default

@stbenjam
Copy link
Member

stbenjam commented May 22, 2019

If we prefer it to be exact to the CRD specification, could we check with the installer team to get their thoughts? They obviously have the most experience in how the platforms gather their data. Of course by far we have the most information to gather, cloud providers essentially just need a couple of things like credentials/regions.

@russellb
Copy link
Member Author

@stbenjam Thanks for the feedback. Your flattened version is certainly nicer. I think this will come down to deciding on the value of matching the kube objects exactly, or just being similar. I'm thinking we should just start trying to implement it and see if we find using the exact resources makes life easier enough or not. I'm certainly leaning toward your flattened version, though.

@russellb
Copy link
Member Author

For the flattened version, I guess we don't really need spec either.

platform:
  baremetal:
    hosts:
      - name: openshift-master-0
        role: master
        bmc:
          username: admin
          password: password
          address: ipmi://192.168.111.1:6230
        bootMACAddress: 00:bd:3e:96:4f:3d
        hardwareProfile: default
      - name: openshift-master-1
        role: master
        bmc:
          username: admin
          password: password
          address: ipmi://192.168.111.1:6231
        bootMACAddress: 00:bd:3e:96:4f:41
        hardwareProfile: default
      - name: openshift-master-2
        role: master
        bmc:
          username: admin
          password: password
          address: ipmi://192.168.111.1:6230
        bootMACAddress: 00:bd:3e:96:4f:45
        hardwareProfile: default
      - name: openshift-worker-0
        bmc:
          username: admin
          password: password
          address: ipmi://192.168.111.1:6233
        bootMACAddress: 00:bd:3e:96:4f:48
        hardwareProfile: default

@dtantsur
Copy link
Member

  1. Finally, I think for the purposes of this config, Ironic should be considered an implementation detail. I wonder if we should do anything to hide Ironic-isms. For example, I'm looking at driver, management_interface, power_interface, and vendor_interface. Maybe that could be simplified with a single option like managementType: ipmi. We could define additional management types over time that map to a set of options on the backend.

Do we need those other interface values at all? The baremetal operator is only setting the driver.

No, we don't need them if the defaults are good (they are).

@russellb
Copy link
Member Author

A couple minor notes on the last draft flattened version:

  • name should be optional. If specified, it could be used as the name of the corresponding BareMetalHost object that gets created. Otherwise, we could generate a name (perhaps based on the bootMACAddress since that's a unique ID already provided.

  • hardwareProfile should also be optional, as we can default to something.

@stbenjam
Copy link
Member

I think that makes sense. I played around with today, and I'm able to reuse a lot of things from baremetal-operator, like the BMC parsing and it's hardware profiles. Works well.

BUT terraform 0.12 still has weird quirks with nested data structures: you can finally have maps of maps 🎉 , but not maps with maps and strings 😞 like we have here:

      - name: openshift-master-0
        role: master
        bmc:
          username: admin
          password: password
          address: ipmi://192.168.111.1:6230

It might be possible using HCL2 but haven't looked at it yet. I wonder for the first pass if maybe we just use the current structure -- possibly renaming the word node to hosts to avoid confusion, and dealing with the new format later? I'm thinking other than this, we're pretty close to opening a first PR to openshift/installer, not sure if that's too ambitious.

@russellb
Copy link
Member Author

I don't think I understand the limitation. Would something like this be sufficient?

      - name: openshift-master-0
        role: master
        bmc_username: admin
        bmc_password: password
        bmc_address: ipmi://192.168.111.1:6230

We could open a WIP PR without this perhaps, but I really feel like this needs to get done before this is included in a release.

@russellb
Copy link
Member Author

I also don't understand how the install-config.yaml format is limited in any way by what terraform can do. This is input into the installer, not terraform. Is the issue figuring out how we'll pass the data into terraform templates?

@stbenjam
Copy link
Member

I also don't understand how the install-config.yaml format is limited in any way by what terraform can do. This is input into the installer, not terraform. Is the issue figuring out how we'll pass the data into terraform templates?

Er, yes, sorry, you're right. I just had it stuck in my head that with 0.12 we could make it look nicer, and have minimal transformation from the platform config to terraform variables. But, we can do more work in the go side of things and turn it into a structure terraform is happy with (essentially the same flatter layout we have today). It's not really pretty but no one ever sees it. I'll get a PR together with those changes.

@stbenjam stbenjam mentioned this issue May 30, 2019
3 tasks
@stbenjam
Copy link
Member

Have some WIP PR's up, they need the 4.2 rebase first.

Dev-scripts openshift-metal3/dev-scripts#578
Installer #93

This is the install config:

apiVersion: v1beta4
baseDomain: test.metalkube.org
metadata:
  name: ostest
compute:
- name: worker
  replicas: 1
controlPlane:
  name: master
  replicas: 3
platform:
  baremetal:
    api_vip: 192.168.111.5
    hosts:
      - name: openshift-master-0
        role: master
        bmc:
          address: ipmi://192.168.111.1:6230
          username: admin
          password: password
        bootMACAddress: 00:ba:98:e9:69:24
        hardwareProfile: unknown
        deployKernel: http://172.22.0.1/images/ironic-python-agent.kernel
        deployRamdisk: http://172.22.0.1/images/ironic-python-agent.initramfs
      - name: openshift-master-1
        role: master
        bmc:
          address: ipmi://192.168.111.1:6231
          username: admin
          password: password
        bootMACAddress: 00:ba:98:e9:69:28
        hardwareProfile: unknown
        deployKernel: http://172.22.0.1/images/ironic-python-agent.kernel
        deployRamdisk: http://172.22.0.1/images/ironic-python-agent.initramfs
      - name: openshift-master-2
        role: master
        bmc:
          address: ipmi://192.168.111.1:6232
          username: admin
          password: password
        bootMACAddress: 00:ba:98:e9:69:2c
        hardwareProfile: unknown
        deployKernel: http://172.22.0.1/images/ironic-python-agent.kernel
        deployRamdisk: http://172.22.0.1/images/ironic-python-agent.initramfs
    image:
      source: "http://172.22.0.1/images/rhcos-ootpa-latest.qcow2"
      checksum: 2b3b1e19e18627d89da400b63430d5bb
pullSecret: |
    <redacted>

We still need to deal with getting rid of the image URL's

@stbenjam
Copy link
Member

Also, hardware platform is "unknown" as that's the default in BMO: https://github.com/metal3-io/baremetal-operator/blob/master/pkg/hardware/profile.go#L49

Our VM's seem to be using /dev/sda, not vda, so I couldn't use the libvirt profile.

@russellb
Copy link
Member Author

unknown means it's using default. Here we should just default to default.

This looks like a great first step and makes sense as a first PR. I do think we should work out how to drop the URL options too, though.

@stbenjam
Copy link
Member

stbenjam commented May 30, 2019

I guess, is there a reason to just not call it default instead of unknown in BMO? cc: @dhellmann
https://github.com/metal3-io/baremetal-operator/blob/master/pkg/hardware/profile.go#L10

@stbenjam
Copy link
Member

I do think we should work out how to drop the URL options too, though.

I know there was work about some container that would download the images automatically and serve them up, is that something we could run on the bootstrap node? It would be easy for the installer to just use a known path hosted on an http server at $API_VIP.

@dhellmann
Copy link
Member

I guess, is there a reason to just not call it default instead of unknown in BMO? cc: @dhellmann
https://github.com/metal3-io/baremetal-operator/blob/master/pkg/hardware/profile.go#L10

I wanted to be able to tell the difference between the cases where it tried to figure out the profile and couldn't and it used the default profile because that matched. When we get the profile matching doing more useful stuff, I expect we'll revisit that UX.

@dhellmann
Copy link
Member

I do think we should work out how to drop the URL options too, though.

I know there was work about some container that would download the images automatically and serve them up, is that something we could run on the bootstrap node? It would be easy for the installer to just use a known path hosted on an http server at $API_VIP.

The installer shouldn't set the image URLs on the host objects at all. For the control plane we want to register them as "externally provisioned" to avoid driving them through the provisioning process again. For workers we want to let the actuator deal with provisioning them based on the number of worker nodes requested.

Something does need to set the URL on the MachineSet and MachineDeployment. I know @sadasu is working on that in another PR for the machine-api-operator integration, and we weren't sure where that URL would come from.

@stbenjam
Copy link
Member

I do think we should work out how to drop the URL options too, though.

I know there was work about some container that would download the images automatically and serve them up, is that something we could run on the bootstrap node? It would be easy for the installer to just use a known path hosted on an http server at $API_VIP.

The installer shouldn't set the image URLs on the host objects at all. For the control plane we want to register them as "externally provisioned" to avoid driving them through the provisioning process again. For workers we want to let the actuator deal with provisioning them based on the number of worker nodes requested.

Something does need to set the URL on the MachineSet and MachineDeployment. I know @sadasu is working on that in another PR for the machine-api-operator integration, and we weren't sure where that URL would come from.

Got it, we won't pass that info through. The URL's I'm looking at here are the ones currently defined in the install-config that we give to terraform to provision the nodes. We'd like the installer to derive them from information it knows already, without the user having to tell us where they are.

@russellb
Copy link
Member Author

I do think we should work out how to drop the URL options too, though.

I know there was work about some container that would download the images automatically and serve them up, is that something we could run on the bootstrap node? It would be easy for the installer to just use a known path hosted on an http server at $API_VIP.

The installer shouldn't set the image URLs on the host objects at all. For the control plane we want to register them as "externally provisioned" to avoid driving them through the provisioning process again. For workers we want to let the actuator deal with provisioning them based on the number of worker nodes requested.
Something does need to set the URL on the MachineSet and MachineDeployment. I know @sadasu is working on that in another PR for the machine-api-operator integration, and we weren't sure where that URL would come from.

Got it, we won't pass that info through. The URL's I'm looking at here are the ones currently defined in the install-config that we give to terraform to provision the nodes. We'd like the installer to derive them from information it knows already, without the user having to tell us where they are.

It seems we have a couple of options. Either the images need to be served from the bootstrap VM, or from the provisioning host running the installer. In either case, the installer should be able to download the correct images (both RHCOS and IPA) on its own.

@russellb
Copy link
Member Author

russellb commented Jun 4, 2019

Follow-ups after the main refactoring that was already completed. Two more things to be removed:

#58 tracks removing the kernel and ramdisk URLs from the install config.

#37 for getting rid of the RHCOS image config in install-config

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants