
Libvirt pre install step (2/4) #353

Open · wants to merge 11 commits into base: master
179 changes: 179 additions & 0 deletions docs/PROVISIONING_LIBVIRT.adoc
@@ -0,0 +1,179 @@
= OpenShift on Libvirt using CASL
:MYWORKDIR: ~/src
// FIXME: how to get variables rendered in code blocks?

== Introduction

The aim of this setup is to get a flexible OpenShift installation that is as unintrusive on the host as possible, under the assumption that a Libvirt installation will mostly be used on laptops or workstations, which also need to keep working well for other purposes.

CAUTION: THIS DOCUMENT AND THE ARTEFACTS PERTAINING TO IT ARE STILL UNDER _HEAVY_ DEVELOPMENT!!!

== Control Host Setup (one time, only)

NOTE: These steps are a canned set of steps serving as an example, and may be different in your environment.

Before getting started following this guide, you'll need the following:

FIXME:: address docker installation and usage at a later stage.

* Docker installed
** RHEL/CentOS: `yum install -y docker`
** Fedora: `dnf install -y docker`
** **NOTE:** If you plan to run docker as yourself (non-root), your username must be added to the `docker` user group.

* Ansible 2.7 or later installed
** link:https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html[See Installation Guide]
* python3-libvirt and/or python2-libvirt
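
For example, the Python bindings can usually be installed from the distribution repositories (the package names below are the common ones on Fedora and RHEL/CentOS 7 and may differ on other releases):

[source,bash]
----
# Fedora: Python 3 bindings
sudo dnf install -y python3-libvirt
# RHEL/CentOS 7: Python 2 bindings
sudo yum install -y libvirt-python
----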

* Clone the `casl-ansible` repository into your working directory:

[source,bash]
----
cd {MYWORKDIR}/
git clone https://github.com/redhat-cop/casl-ansible.git
----

* Run `ansible-galaxy` to pull in the necessary requirements for the CASL provisioning of OpenShift:

NOTE: The target directory ( `galaxy` ) is **important** as the playbooks know to source roles and playbooks from that location.

[source,bash]
----
cd {MYWORKDIR}/casl-ansible
ansible-galaxy install -r casl-requirements.yml -p galaxy
----

== Libvirt setup

The following needs to be set up on your Libvirt server before you can start:

=== Set up a local dnsmasq

Create a dummy network interface:

------------------------------------------------------------------------
sudo modprobe dummy
sudo ip link add dummy0 type dummy
sudo ip address add 192.168.123.254 dev dummy0 # <1>
sudo ip address show dev dummy0 up
------------------------------------------------------------------------
<1> the IP address must be the one you've entered as the forwarder for the apps wildcard DNS in your network XML.
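
The dummy interface is not persistent across reboots; to remove it manually again later:

------------------------------------------------------------------------
sudo ip link delete dummy0
------------------------------------------------------------------------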

Start dnsmasq against this interface, defining the wildcard DNS domain `*.apps.local`:

------------------------------------------------------------------------
sudo dnsmasq --interface=dummy0 --no-daemon --log-queries=extra \
--bind-interfaces --clear-on-reload \
--address=/apps.local/192.168.123.123 # <1>
------------------------------------------------------------------------
<1> the IP address must be that of the VM on which the OpenShift router will run, and the domain must of course be the one configured as the apps wildcard.
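
To check that the wildcard resolution works, the dnsmasq instance can be queried directly (assuming the dummy interface address from above):

------------------------------------------------------------------------
dig @192.168.123.254 +short something.apps.local   # should return 192.168.123.123
------------------------------------------------------------------------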

NOTE: the dnsmasq therefore only runs on demand, but since that's the case for my OpenShift cluster as well, it's no big deal.

CAUTION: I had presumably already opened the firewall accordingly and integrated Satellite 6 with my libvirtd beforehand (e.g. `LIBVIRTD_ARGS="--listen"` in `/etc/sysconfig/libvirtd`), so there might be more to it than the steps above.

=== Create a separate network

Call `sudo virsh net-create --file libvirt-network-definition.xml`

CAUTION: the network definition is deliberately not persistent (for a start) and needs to be re-created before each start.
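
If you prefer a persistent network instead, a possible sketch (the network name `ocp-network` comes from the sample XML definition at the end of this change):

[source,bash]
----
sudo virsh net-define --file libvirt-network-definition.xml
sudo virsh net-start ocp-network
sudo virsh net-autostart ocp-network
----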

TODO:: continue description !!!

Cool! Now you're ready to provision OpenShift clusters on Libvirt.

== Provision an OpenShift Cluster

As an example, we'll provision the `sample.libvirt.example.com` cluster defined in the `{MYWORKDIR}/casl-ansible/inventory` directory.

NOTE: Unless you already have a working inventory, it is recommended that you make a copy of the above-mentioned sample inventory and keep it somewhere outside of the casl-ansible directory. This allows you to update/remove/change your casl-ansible source directory without losing your inventory. Also note that it may take some effort to get the inventory just right, so it is very beneficial to keep it around for future use without having to redo everything.
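
For example (the target path is only an illustration):

[source,bash]
----
mkdir -p ~/openshift-inventories
cp -a {MYWORKDIR}/casl-ansible/inventory/sample.libvirt.example.com.d \
      ~/openshift-inventories/my.libvirt.example.com.d
----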

FIXME:: for now the instructions are written step by step and are run locally on the Libvirt host as root. This might/should change in the future, but it is the current state of the implementation. Each sub-section is named after the corresponding step of the end-to-end playbook.


=== provision-instances

- make sure `/dev/loopN` isn't mounted on `/var/www/html/installXXX`, and remove it from your `/etc/fstab` if you have retried multiple times after errors (something to FIXME).
- copy and adapt the sample directory with files and inventory:
* adapt the Libvirt specific parameters to make them compatible with your setup (especially the network)
- export the variable `LIBVIRT_INV_VM_FILTER` to fit the libvirt names defined for your cluster's VMs, e.g. `export LIBVIRT_INV_VM_FILTER=^ocp_`.
- if your network isn't persistent create it (see above).
- make sure that `/tmp/authorized_keys` exists. FIXME: not sure yet for which use cases it is required; for now I just copy my own authorized keys.
- call `ansible-playbook -i ../../inventory/sample.libvirt.example.com.d/inventory libvirt/provision.yml`.
+
IMPORTANT: `virt-install` only runs synchronously because a virt-viewer UI pops up. Close each virt-viewer shortly after the corresponding installation has completed.
+
- identify the IP address of the infrastructure VM on which the router will run, and start the separate dnsmasq responsible for the wildcard DNS resolution accordingly (see above).
- log in to one of the new VMs and validate that DNS is working correctly (see also the example below):
* `dig master.local` gives the correct IP address (same for all VMs)
* `dig -x <master-ip>` works as well
* `dig xxx.apps.local` gives the IP of the router/infra node.
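
For example, assuming the master VM received the (hypothetical) address 192.168.123.10:

[source,bash]
----
dig +short master.local          # expected: 192.168.123.10
dig +short -x 192.168.123.10     # expected: master.local.
dig +short foo.apps.local        # expected: the IP of the router/infra node
----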

NOTE: up to now, I've worked as root to avoid complications. From here on, I'm working as a normal user on the control host again.

=== pre-install

IMPORTANT: you need to have SSH-ed to each node at least once to make sure that its host key is already in your `known_hosts` file.
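
Instead of logging in to each node manually, the host keys can also be collected with `ssh-keyscan` (a sketch using the hostnames from the sample kickstart files; adjust to names or IPs that resolve from your control host):

[source,bash]
----
for h in master.local infranode.local appnode.local; do
    ssh-keyscan -H "$h" >> ~/.ssh/known_hosts
done
----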

Things to consider:

- make sure the above preparations are still active (network, DNS, environment variables)
- define the environment variables `RHSM_USER` and `RHSM_PASSWD` or use an activation key (TODO describe activation key / Satellite 6 approach).
+
CAUTION: because there is no trace of OpenShift on the system yet, auto-attach is quite likely to fail. Hence make sure `rhsm_pool` or `rhsm_pool_ids` is defined in the inventory (or on the command line).

Then call `ansible-playbook -i ../../inventory/sample.libvirt.example.com.d/inventory/ pre-install.yml -e rhsm_pool='^{POOL_NAME}$'`.
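
A complete invocation could look like this (the credentials and pool name are purely illustrative):

[source,bash]
----
export RHSM_USER='my-rhsm-user'
read -rs -p 'RHSM password: ' RHSM_PASSWD && export RHSM_PASSWD
ansible-playbook -i ../../inventory/sample.libvirt.example.com.d/inventory/ \
    pre-install.yml -e rhsm_pool='^Red Hat OpenShift Container Platform.*$'
----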

TODO:: continue to adapt / complete the following lines for Libvirt

Run the `end-to-end` provisioning playbook via our link:../images/casl-ansible/[??? installer container image].

[source,bash]
----
docker run -u `id -u` \
-v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z \
-v $HOME/src/:/tmp/src:Z \
-e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
-e INVENTORY_DIR=/tmp/src/casl-ansible/inventory/sample.libvirt.example.com.d/inventory \
-e PLAYBOOK_FILE=/tmp/src/casl-ansible/playbooks/openshift/end-to-end.yml \
-e OPTS="-e libvirt_key_name=my-key-name" -t \
quay.io/redhat-cop/casl-ansible
----

NOTE: The above bind-mounts will map files and source directories to the correct locations within the control host container. Update the local paths per your environment for a successful run.

NOTE: Depending on the SELinux configuration on your OS, you may or may not need the `:Z` at the end of the volume mounts.

Done! Wait until the provisioning completes and you should have an operational OpenShift cluster. If something fails along the way, either update your inventory and re-run the above `end-to-end.yml` playbook, or it may be better to link:https://github.com/redhat-cop/casl-ansible#deleting-a-cluster[delete the cluster] and start over.

== Updating a Cluster

Once provisioned, a cluster may be adjusted/reconfigured as needed by updating the inventory and re-running the `end-to-end.yml` playbook.

== Scaling Up and Down

A cluster's Infra and App nodes may be scaled up and down by editing the following parameters in the `all.yml` file and then re-running the `end-to-end.yml` playbook as shown above.

[source,yaml]
----
appnodes:
  count: <REPLACE WITH NUMBER OF INSTANCES TO CREATE>
infranodes:
  count: <REPLACE WITH NUMBER OF INSTANCES TO CREATE>
----

== Deleting a Cluster

A cluster can be decommissioned/deleted by re-using the same inventory with the `delete-cluster.yml` playbook found alongside the `end-to-end.yml` playbook.

[source,bash]
----
docker run -it -u `id -u` \
-v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z \
-v $HOME/src/:/tmp/src:Z \
-e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
-e INVENTORY_DIR=/tmp/src/casl-ansible/inventory/sample.libvirt.example.com.d/inventory \
-e PLAYBOOK_FILE=/tmp/src/casl-ansible/playbooks/openshift/delete-cluster.yml \
-e OPTS="-e libvirt_key_name=my-key-name" -t \
quay.io/redhat-cop/casl-ansible
----
74 changes: 74 additions & 0 deletions docs/TODO_LIBVIRT.adoc
@@ -0,0 +1,74 @@
= Improvements and corrections for Libvirt as infra

The following isn't a committed to-do list, just a rather random collection of things that might (or might not) end up in a future version of the Libvirt integration. It is just easier to document these things as I work on the topic than to create individual issues that might never be addressed.

== Make the VM provisioning purely local

It seems we can avoid spinning up a web server to provide the ISO content by using a command line like the following:

------------------------------------------------------------------------
virt-install \
--name rhel7.5-Ceph-test-singlenode \
--os-variant rhel7 \
--initrd-inject=/home/pcfe/work/git/HouseNet/kickstart/RHEL75-Ceph-singlenode-ks.cfg \
--location /mnt/ISO_images/rhel-server-7.5-x86_64-dvd.iso \
--extra-args "ks=file:/RHEL75-Ceph-singlenode-ks.cfg console=ttyS0,115200" \
--ram 2048 \
--disk pool=default,boot_order=1,format=qcow2,bus=virtio,discard=unmap,sparse=yes,size=10 \
--disk pool=default,boot_order=2,format=qcow2,bus=virtio,discard=unmap,sparse=yes,size=5 \
--disk pool=default,boot_order=3,format=qcow2,bus=virtio,discard=unmap,sparse=yes,size=5 \
--disk pool=default,boot_order=4,format=qcow2,bus=virtio,discard=unmap,sparse=yes,size=5 \
--controller scsi,model=virtio-scsi \
--rng /dev/random \
--boot useserial=on \
--vcpus 1 \
--cpu host \
--nographics \
--accelerate \
--network network=default,model=virtio
------------------------------------------------------------------------

This would probably be a less intrusive and less resource-intensive approach.

== Make the dynamic inventory script more flexible

It could be useful to make it easier to add/remove nodes without having to configure multiple places in the (static) inventory. Two ideas:

- either add metadata to the created VMs (but how exactly would that work?),
- or use the description field to pack information, e.g. in an INI or JSON format (quoting might be an issue here); see the sketch below.
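
A sketch of the second idea, packing JSON into the VM description and reading it back with `virsh desc` (the VM name and keys here are hypothetical):

------------------------------------------------------------------------
sudo virsh desc ocp_master1 --new-desc '{"casl_group": "masters", "cluster": "sample"}'
sudo virsh desc ocp_master1 \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["casl_group"])'
------------------------------------------------------------------------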

== Split the inventory in infra and cluster inventories

I'm thinking that it would be much more flexible and easier to maintain to split the inventory into a "cluster type" and an "infra type" inventory and to combine them with multiple `-i` options, e.g. `-i libvirt_inv/ -i 3_nodes_cluster_inv/`.

Just an idea at this stage, and I'm not sure it's easily possible to get the expected flexibility, but with the right dynamic script it might be feasible.

== Make the Libvirt inventory more robust

. the inventory script shouldn't fail if the title or description of a VM is missing

== Improve the playbooks / roles using ideas from others

The following sources have been identified and could be used:

- https://github.com/nmajorov/libvirt-okd
- https://docs.google.com/document/d/1Mbd2v6j3AQlbiY_zbZF5fWWwXvenQf8EDw7oW5907Hs/edit?usp=drivesdk

== Improve the CONTRIBUTE_PROVISIONER.md

Just taking notes for future improvements as I learn my way through CASL. Feel free to review them already; I'll add them to the existing document once I'm finished here:

NOTE: `{provisioner}` and `{PROVISIONER}` stand for the name of your provisioner (e.g. libvirt), in lower case and in capitals respectively.

- create the following directories in the `casl-ansible` repository (see the layout sketch after this list):
* `playbooks/openshift/{provisioner}`
* `inventory/sample.{provisioner}.example.com.d/inventory` (and optionally `files` or others)
- create a playbook that creates the VMs on your provisioner as `playbooks/openshift/{provisioner}/provision.yml`, and make sure it is called from `playbooks/openshift/provision-instances.yml` when the variable `hosting_infrastructure` is set to your provisioner.
* re-use as many roles as possible from the `infra-ansible` repository, or add new generic roles there that support your infrastructure provider independently of `casl-ansible`.
- create a sample inventory respecting the following requirements:
* it respects the usual OpenShift inventory settings and makes sure that the nodes created during the provisioning phase end up in the right groups.
* this most probably requires a dynamic inventory script that pulls the information about the VMs from the provisioner. This script is created in `inventory/scripts/` and linked into `inventory/sample.{provisioner}.example.com.d/inventory` (FIXME: why this complexity?). Your script must in particular make sure that `ansible_host` is defined, so that the Ansible connection doesn't rely on the name in the inventory, which should remain independent of the provisioning.
* it defines the following variables required during the next steps of the end-to-end process:
** `hosting_infrastructure` set to `{provisioner}`.
** `docker_storage_block_device`, e.g. `/dev/vdb` (implying that each created VM has two disks, one being reserved for Docker).
- create a document `docs/PROVISIONING_{PROVISIONER}.adoc` explaining how to adapt and use your provisioner. Adding some notes about the considerations you made during the implementation is surely not a bad idea.
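
Putting the layout items above together, here is a sketch of the resulting skeleton for the libvirt case (the dynamic inventory script name is hypothetical):

------------------------------------------------------------------------
cd casl-ansible
mkdir -p playbooks/openshift/libvirt
mkdir -p inventory/sample.libvirt.example.com.d/inventory
ln -s ../../scripts/libvirt.py \
      inventory/sample.libvirt.example.com.d/inventory/libvirt.py
------------------------------------------------------------------------
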
37 changes: 37 additions & 0 deletions inventory/sample.libvirt.example.com.d/files/ks/appnode.ks
@@ -0,0 +1,37 @@
install
lang en_US.UTF-8
keyboard --vckeymap=us --xlayouts='us'
firstboot --enable
auth --enableshadow --passalgo=sha512
services --enabled=chronyd
eula --agreed
reboot

# network
network --bootproto=dhcp --device=eth0 --noipv6 --activate --hostname=appnode.local

# System timezone
timezone US/Eastern --isUtc

# Disks
bootloader --location=mbr --boot-drive=vda
ignoredisk --only-use=vda
zerombr
clearpart --all --initlabel --drives=vda
part /boot/efi --fstype="vfat" --size=200 --ondisk=vda
part /boot --fstype="ext2" --size=512 --ondisk=vda --asprimary
part pv.10 --fstype="lvmpv" --size=1 --grow --ondisk=vda

# LVMs
volgroup vg1 pv.10
logvol / --fstype=xfs --name=root --vgname=vg1 --size=1 --grow
logvol swap --fstype=swap --size=2048 --vgname=vg1

rootpw --plaintext redhat

%packages
@base
net-tools
wget

%end
37 changes: 37 additions & 0 deletions inventory/sample.libvirt.example.com.d/files/ks/infranode.ks
@@ -0,0 +1,37 @@
install
lang en_US.UTF-8
keyboard --vckeymap=us --xlayouts='us'
firstboot --enable
auth --enableshadow --passalgo=sha512
services --enabled=chronyd
eula --agreed
reboot

# network
network --bootproto=dhcp --device=eth0 --noipv6 --activate --hostname=infranode.local

# System timezone
timezone US/Eastern --isUtc

# Disks
bootloader --location=mbr --boot-drive=vda
ignoredisk --only-use=vda
zerombr
clearpart --all --initlabel --drives=vda
part /boot/efi --fstype="vfat" --size=200 --ondisk=vda
part /boot --fstype="ext2" --size=512 --ondisk=vda --asprimary
part pv.10 --fstype="lvmpv" --size=1 --grow --ondisk=vda

# LVMs
volgroup vg1 pv.10
logvol / --fstype=xfs --name=root --vgname=vg1 --size=1 --grow
logvol swap --fstype=swap --size=2048 --vgname=vg1

rootpw --plaintext redhat

%packages
@base
net-tools
wget

%end
37 changes: 37 additions & 0 deletions inventory/sample.libvirt.example.com.d/files/ks/master.ks
@@ -0,0 +1,37 @@
install
lang en_US.UTF-8
keyboard --vckeymap=us --xlayouts='us'
firstboot --enable
auth --enableshadow --passalgo=sha512
services --enabled=chronyd
eula --agreed
reboot

# network
network --bootproto=dhcp --device=eth0 --noipv6 --activate --hostname=master.local

# System timezone
timezone US/Eastern --isUtc

# Disks
bootloader --location=mbr --boot-drive=vda
ignoredisk --only-use=vda
zerombr
clearpart --all --initlabel --drives=vda
part /boot/efi --fstype="vfat" --size=200 --ondisk=vda
part /boot --fstype="ext2" --size=512 --ondisk=vda --asprimary
part pv.10 --fstype="lvmpv" --size=1 --grow --ondisk=vda

# LVMs
volgroup vg1 pv.10
logvol / --fstype=xfs --name=root --vgname=vg1 --size=1 --grow
logvol swap --fstype=swap --size=2048 --vgname=vg1

rootpw --plaintext redhat

%packages
@base
net-tools
wget

%end
@@ -0,0 +1,14 @@
<network ipv6='no'>
  <name>ocp-network</name>
  <forward mode='nat'/>
  <bridge name='virbr3' stp='on' delay='0'/>
  <domain name='local' localOnly='no'/>
  <dns>
    <forwarder domain='apps.local' addr='192.168.123.254'/>
  </dns>
  <ip address='192.168.123.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.123.1' end='192.168.123.250'/>
    </dhcp>
  </ip>
</network>