Pulumi failing to detect IP although successfully configured by cloud-init. #182

Open
mkennedy85 opened this issue Sep 29, 2021 · 3 comments
Labels
kind/bug Some behavior is incorrect or out of spec

Comments

@mkennedy85

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

When cloning a VM and configuring it through the cloud-init DataSourceVMwareGuestInfo datasource, Pulumi fails to detect that the IP has been set and the network is up, so the create never completes and eventually returns this error:

  vsphere:index:VirtualMachine (app01-harvest-dev):
    error: 1 error occurred:
    	* creating urn:pulumi:lab-deploy::vsphere-pulumi::vsphere:index/virtualMachine:VirtualMachine::app01-harvest-dev: 1 error occurred:
    	* timeout waiting for an available IP address

This does not happen with OVF customization: in that case Pulumi reports that the VM has been configured and the update succeeds.

Steps to reproduce

  1. Create a VirtualMachine resource (in this case it is RHEL) and pass the guestinfo for cloud-init to detect and configure:
userdata = f'''#cloud-config
disable_root: false
ssh_pwauth: True
growpart:
  mode: auto
  devices: ['/dev/sda']
write_files:
- path: /root/CLOUD_INIT_WAS_HERE
  content: |
    test
'''

userdata_bytes = userdata.encode('ascii')
userdata_base64 = base64.b64encode(userdata_bytes)
userdata_message = userdata_base64.decode('ascii')

metadata = f'''# cloud-init
instance-id: app01-harvest-dev
local-hostname: app01-harvest-dev
network:
  version: 2
  ethernets:
    ens192:
      match:
        name: en*
      dhcp4: false
      addresses: [10.179.177.164/32]
      gateway4: 10.179.176.1
      nameservers:
          search: [apple.com, retailtech.apple.com]
          addresses: [10.7.7.7,10.8.8.8]
'''

metadata_bytes = metadata.encode('ascii')
metadata_base64 = base64.b64encode(metadata_bytes)
metadata_message = metadata_base64.decode('ascii')

pulumi_vsphere.VirtualMachine("app01-harvest-dev",
        name="app01-harvest-dev",
        resource_pool_id=resource_pool.resource_pool_id,
        datastore_id=datastore.id,
        num_cpus=2,
        memory=4096,
        guest_id=template.guest_id,
        scsi_type = template.scsi_type,
        extra_config = {
            'guestinfo.userdata': userdata_message,
            'guestinfo.userdata.encoding': 'base64',
            'guestinfo.metadata': metadata_message,
            'guestinfo.metadata.encoding': 'base64',
        },
        firmware=template.firmware,
        network_interfaces=[
            {
                'networkId': network.id,
                'adapter_type': template.network_interface_types[0],
            }
        ],
        disks=[
            {
                'label': 'disk0',
                'size': template.disks[0]['size'],
                'thin_provisioned': template.disks[0]['thin_provisioned'],
                'eagerly_scrub': template.disks[0]['eagerly_scrub'],
            }
        ],
        annotation="Configured by Pulumi",
        clone={
            'templateUuid': template.id,
        },
    )
  2. Run pulumi up -y
  3. Pulumi runs and results in:
Previewing update (lab-deploy)

     Type                             Name                           Plan
 +   pulumi:pulumi:Stack              vsphere-pulumi-lab-deploy  create
 +   └─ vsphere:index:VirtualMachine  app01-harvest-dev   create

Resources:
    + 5 to create

Updating (lab-deploy)

     Type                             Name                           Status                  Info
 +   pulumi:pulumi:Stack              vsphere-pulumi-lab-deploy  **creating failed**     1 error
 +   └─ vsphere:index:VirtualMachine  app01-harvest-dev   **creating failed**     1 error

Diagnostics:
  pulumi:pulumi:Stack (vsphere-pulumi-lab-deploy):
    error: update failed

  vsphere:index:VirtualMachine (app01-harvest-dev):
    error: 1 error occurred:
    	* creating urn:pulumi:lab-deploy::vsphere-pulumi::vsphere:index/virtualMachine:VirtualMachine::app01-harvest-dev: 1 error occurred:
    	* timeout waiting for an available IP address

Resources:
    + 4 created

Duration: 7m20s
  4. Confirm that in vSphere the VM has an IP address. (It does have an IP and the network is up and reachable despite Pulumi reporting otherwise)

Expected: Pulumi to complete successfully and provide the IP as an output.
Actual: The VM is configured as expected, but Pulumi never returns successfully because it does not see the IP that vmware-tools reports. The vSphere UI shows the IP address even before Pulumi fails, so vmware-tools does appear to be getting an IP; it is just not being communicated back to Pulumi.

@mkennedy85 mkennedy85 added the kind/bug Some behavior is incorrect or out of spec label Sep 29, 2021
@clstokes

clstokes commented Oct 4, 2021

@mkennedy85 per hashicorp/terraform-provider-vsphere#718 (comment) (from the upstream provider), it seems that different OSes populate the IP address into different paths on the VM: guest.ipAddress vs summary.guest.ipAddress vs guest.net.{nic}.ipConfig.ipAddress. Which path are you seeing populated with the IP?

Depending on that, it might be necessary to set waitForGuestIpTimeout=0 on your VirtualMachine to tell the provider to skip this check/wait. Can you try that?
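One way to answer the "which path" question is to dump the VM's properties to JSON and check each candidate in turn. The sketch below assumes a nested dict that mirrors the three property paths named above; the function name `populated_ip_paths`, the dict layout, and the sample data are all hypothetical and would need adapting to whatever your tooling actually emits.

```python
def populated_ip_paths(vm):
    """Return the property paths in `vm` that carry at least one IP address.

    `vm` is assumed to be a nested dict mirroring the vSphere property
    names (hypothetical layout; adapt it to your own JSON dump).
    """
    found = {}

    guest = vm.get("guest") or {}
    if guest.get("ipAddress"):
        found["guest.ipAddress"] = [guest["ipAddress"]]

    summary_guest = (vm.get("summary") or {}).get("guest") or {}
    if summary_guest.get("ipAddress"):
        found["summary.guest.ipAddress"] = [summary_guest["ipAddress"]]

    # Per-NIC addresses live in a list under each guest.net entry.
    for index, nic in enumerate(guest.get("net") or []):
        ip_config = nic.get("ipConfig") or {}
        addresses = [
            entry["ipAddress"]
            for entry in ip_config.get("ipAddress") or []
            if entry.get("ipAddress")
        ]
        if addresses:
            found[f"guest.net.{index}.ipConfig.ipAddress"] = addresses

    return found


# Illustrative sample: only the per-NIC path is populated.
sample = {
    "guest": {
        "ipAddress": None,
        "net": [{"ipConfig": {"ipAddress": [{"ipAddress": "10.179.177.164"}]}}],
    },
    "summary": {"guest": {"ipAddress": None}},
}
print(populated_ip_paths(sample))
```

If only the per-NIC path shows up, that would be consistent with the provider's waiter not seeing the address even though the vSphere UI does.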

@mkennedy85
Author

@clstokes I did actually go ahead and configure it with wait_for_guest_net_timeout=0 so that this check is skipped. Since I am configuring the IP manually, it is not a problem to handle it this way. I appreciate you taking a look at it.

One thing to note is that it works if you use the default OVF datasource for guest customization, but passing user-data to cloud-init through guestinfo seems to behave differently. I suspect that, depending on the customization, cloud-init processes take over and VMware Tools no longer sends info back to the process that started the clone operation. I may be wrong, but this is a thought.
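For anyone landing here, a minimal sketch of that workaround in Python follows. It assumes the Python SDK's snake_case argument names `wait_for_guest_net_timeout` and `wait_for_guest_ip_timeout`; the helper `guest_wait_overrides` is hypothetical and only builds keyword arguments, so it runs without a vSphere connection.

```python
def guest_wait_overrides(skip_net_wait=True, skip_ip_wait=False):
    """Build VirtualMachine keyword arguments that disable the guest waiters.

    Assumption: a timeout of 0 tells the provider to skip the wait.
    """
    overrides = {}
    if skip_net_wait:
        overrides["wait_for_guest_net_timeout"] = 0
    if skip_ip_wait:
        overrides["wait_for_guest_ip_timeout"] = 0
    return overrides


# Merge the overrides into the rest of the VM arguments.
vm_args = {"num_cpus": 2, "memory": 4096, **guest_wait_overrides()}

# In a Pulumi program this would be passed along the lines of:
#   pulumi_vsphere.VirtualMachine("app01-harvest-dev", **vm_args)
print(vm_args)
```

With the waiter disabled, Pulumi will not report the guest IP as an output, so a statically configured address has to come from your own configuration.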

@nhi-vanye

nhi-vanye commented Nov 5, 2022

I'm not sure this is a Pulumi issue; I've had all sorts of problems doing this with Terraform.

You can solve it for a specific OS/template configuration with careful guestinfo settings, but I have not managed to build a generic VM provisioning system yet.

It seems that a lot of the issues I had were because I wanted static IP addresses for my created VMs.

Here's my workflow just in case anyone finds this useful - there's obviously a bunch of underlying utility code not included here (for example the actual config data which is created using mako templates (python))

Import a standard Ubuntu OVA and enable guestinfo

#! /usr/bin/env python3

import logging
import subprocess

import pulumi

def import_ova():

    args = [
            "govc",
            "import.ova",
            "-ds=VMs",
            "-k",
            "-name=ubuntu-20.04-server-cloudimg",
            "https://cloud-images.ubuntu.com/releases/focal/release/ubuntu-20.04-server-cloudimg-amd64.ova"
        ]


    logging.getLogger("").error( f'Executing {" ".join( args )} ...' )

    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

    stdout, stderr = process.communicate()
    return stdout, stderr

def enable_guestinfo():

    args = [
            "govc",
            "vm.change",
            "-k",
            # -e takes a key=value pair; the missing comma previously merged it with -vm
            "-e", "guestinfo.dummy=true",
            "-vm=ubuntu-20.04-server-cloudimg"
        ]


    logging.getLogger("").error( f'Executing {" ".join( args )} ...' )

    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

    stdout, stderr = process.communicate()
    return stdout, stderr

def mark_as_template():

    args = [
            "govc",
            "vm.markastemplate",
            "-k",
            "ubuntu-20.04-server-cloudimg"
        ]


    logging.getLogger("").error( f'Executing {" ".join( args )} ...' )

    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

    stdout, stderr = process.communicate()
    return stdout, stderr



def main():

    #import_ova()

    enable_guestinfo()

    mark_as_template()


if __name__ == "__main__":

    main()
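The three functions above repeat the same Popen boilerplate and never check the exit status, so a failed govc call would go unnoticed. A minimal sketch of a shared helper is below; `run_command` is a hypothetical name, and the demo runs the Python interpreter rather than govc so it works without vSphere access.

```python
import logging
import subprocess
import sys


def run_command(args):
    """Log and run `args`; raise CalledProcessError on a non-zero exit."""
    logging.getLogger(__name__).info("Executing %s ...", " ".join(args))
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # fail loudly instead of silently
    )
    return completed.stdout, completed.stderr


# Harmless demo command; replace with the govc argument lists above.
stdout, stderr = run_command([sys.executable, "-c", "print('ok')"])
print(stdout.strip())
```

Each of `import_ova`, `enable_guestinfo`, and `mark_as_template` could then reduce to building its argument list and calling the helper.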

Provision VMs from the above template

import logging

import pulumi
import pulumi_vsphere as vsphere
import pulumi_cloudinit as cloudinit

import randomname


config = pulumi.Config()

def random_vm_name():

    return randomname.get_name()

def get_vmware_env():
    env = {}

    env["dc"] = vsphere.get_datacenter(name=config.require("datacenter"))

    env["vsphere_host"]  = vsphere.get_host(name=config.require("host"),datacenter_id=env['dc'].id)

    env['vm_datastore'] = vsphere.get_datastore(name=config.require("vmDatastore"), datacenter_id=env['dc'].id)

    env['iso_datastore'] = None

    if config.get('isoDatastore'):
        env['iso_datastore'] = vsphere.get_datastore(name=config.get('isoDatastore'), datacenter_id=env['dc'].id)

    env['pool'] = vsphere.get_resource_pool( name=f"{config.require('cluster')}/Resources", datacenter_id=env['dc'].id)

    env['port_group'] = vsphere.get_network(name=config.require('portGroup'), datacenter_id=env['dc'].id)

    env['template'] = vsphere.get_virtual_machine(name=config.require('template'), datacenter_id=env['dc'].id)

    return env



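def create(this_node=None, all_nodes=None, config_data=None):

    # Avoid mutable default arguments; fall back to empty containers.
    this_node = this_node or {}
    config_data = config_data or {}

    env = get_vmware_env()

    vm_name = this_node.get('name', random_vm_name())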

    extra_config={}

    if "userdata" in config_data:
        extra_config['guestinfo.userdata'] = config_data['userdata']
        extra_config['guestinfo.userdata.encoding'] = "gzip+base64"

    if "metadata" in config_data:
        extra_config['guestinfo.metadata'] = config_data['metadata']
        extra_config['guestinfo.metadata.encoding'] = "gzip+base64"

    vm = vsphere.VirtualMachine(
            vm_name,
            None,
            num_cpus = this_node.get('cpu', 1),
            memory = this_node.get('ram', 4096),
            resource_pool_id = env['pool'].id,
            datastore_id = env['vm_datastore'].id,
            guest_id = env['template'].guest_id,
            network_interfaces = [
                {
                    "network_id": env['port_group'].id,
                    "adapter_type": "vmxnet3"
                }
            ],
            disks = [
                {
                    "label": "disk0",
                    "size": this_node.get('disk', 16),
                    "eagerly_scrub": False,
                    "thin_provisioned": True,
                }
            ],
            clone = {
                "templateUuid" : env['template'].id
            },
            # we need a CDROM device to access cloud-config
            cdrom = {"client_device": True},
            extra_config = extra_config,
            vapp = vsphere.VirtualMachineVappArgs(
                properties={ }
            )
        )
    return vm
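The `config_data` values above are expected to already be in the "gzip+base64" encoding, and the code that produces them is not shown. A minimal sketch of that encoding step, using only the standard library, might look like this; `encode_guestinfo` and `decode_guestinfo` are hypothetical helper names, and the sample user-data is illustrative.

```python
import base64
import gzip


def encode_guestinfo(text):
    """Gzip-compress `text`, then base64-encode it as an ASCII string."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_guestinfo(value):
    """Reverse encode_guestinfo; handy for checking what the guest will see."""
    return gzip.decompress(base64.b64decode(value)).decode("utf-8")


userdata = (
    "#cloud-config\n"
    "write_files:\n"
    "- path: /root/CLOUD_INIT_WAS_HERE\n"
    "  content: test\n"
)
encoded = encode_guestinfo(userdata)

# The encoded string is what would go into extra_config['guestinfo.userdata'].
print(decode_guestinfo(encoded) == userdata)
```

Compressing first keeps large cloud-config documents small in the VM's extra config, at the cost of being harder to read back by eye.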

This eventually bubbles up into a top-level Pulumi project that creates the VMs for a k8s cluster:

"""A Python Pulumi program"""
import logging

import pulumi

from infra import vsphere as PLATFORM

from k8s.config import nodes as NODES
from k8s.config import get_cloud_configs


for node in NODES.nodes:
    logging.getLogger("k8s").info(f"{node['name']} ({node['type']})")

    configs = get_cloud_configs(this_node=node, all_nodes=NODES.nodes)

    PLATFORM.vm.create( this_node=node, all_nodes=NODES.nodes, config_data=configs )
