This monorepo contains everything I use to set up and run the devices in my home. It is based on the awesome cluster-template.
It is fully managed following GitOps practices, using tools like Ansible, Kubernetes, Flux, Renovate, and GitHub Actions.
| Device | Count | Storage | Purpose |
|---|---|---|---|
| Protectli FW4B clone | 1 | 120GB | OPNsense router |
| Synology NAS | 1 | 12TB RAID 5 + 2TB RAID 1 | Main storage |
| Intel NUC8i5BEH | 3 | 120GB SSD + 500GB NVMe | Kubernetes control planes + storage |
| Intel NUC8i3BEH | 2 | 120GB SSD | Kubernetes workers |
| Raspberry Pi 3 | 2 | 16GB SD | Unifi Controller / 3D Printer with OctoPrint |
Intel NUC BIOS updates can now be found on ASUS support.
Configuration on top of Defaults (F9):

- Devices > Onboard Devices > Onboard Device Configuration
  - Uncheck `WLAN`
  - Uncheck `Bluetooth`
- Cooling > CPU Fan Header
  - Uncheck `Fan off capability`
- Power > Secondary Power Settings
  - Set `After Power Failure` to `Last State`
- Boot > Boot Configuration > Boot Display Config
  - Check `Display F12 for Network Boot`
In addition, to install Talos Linux with Secure Boot, we need to allow enrolling other keys. Enrolling new keys is done by booting the ISO and selecting the appropriate option.
- Boot > Secure Boot > Secure Boot Config
  - Check `Clear Secure Boot Data`
There is a boot menu that can be helpful in case of boot failures:
Press and hold the power button for three seconds, then release it before the 4-second shutdown override kicks in. The Power Button Menu displays (options on the menu can vary depending on the Intel NUC model). Press F7 to start the BIOS update.
The fans on the Intel NUC are known to wear out. In case of overheating, this is the likely culprit. Amazon and YouTube are your best friends.
The CMOS battery can die and need replacing. The main symptom is the NUC not powering on at all.
In addition to the regular things like a firewall, my router runs other useful stuff.
I have Talos configured with a Virtual IP to provide HA for the control plane nodes' API server, but I also use HAProxy as a load balancer.
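The Talos side of this is a small machine-config fragment on the control plane nodes. A minimal sketch — the interface name (`eth0`) and the shared VIP address (`10.0.3.3`) are assumptions, since neither is stated here:

```sh
# Sketch of a Talos machine-config patch enabling the shared Virtual IP.
# Interface name (eth0) and VIP (10.0.3.3) are assumptions, not repo values.
cat > vip-patch.yaml <<'EOF'
machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: true
        vip:
          ip: 10.0.3.3
EOF
# Apply to each control plane node, e.g.:
# talosctl patch machineconfig --nodes <node-ip> --patch @vip-patch.yaml
```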
First, create a Virtual IP to listen on:
- Interfaces > Virtual IPs > Settings > Add
  - `Mode` = `IP Alias`
  - `Interface` = `SERVER` (my VLAN for k8s nodes)
  - `Network / Address` = `10.0.3.2/32`
  - `Description` = `k8s-apiserver`
Then, create the HAProxy configuration:
- Services > HAProxy | Real Servers (for each master node)
  - `Enabled` = `true`
  - `Name or Prefix` = `k8s-node-x-apiserver`
  - `FQDN or IP` = `k8s-node-x`
  - `Port` = `6443`
  - `Verify SSL Certificate` = `false`
- Services > HAProxy | Rules & Checks > Health Monitors
  - `Name` = `k8s-apiserver`
  - `SSL preferences` = `Force SSL for health checks`
  - `Port to check` = `6443`
  - `HTTP method` = `GET`
  - `Request URI` = `/healthz`
  - `HTTP version` = `HTTP/1.1`
- Services > HAProxy | Virtual Services > Backend Pools
  - `Enabled` = `true`
  - `Name` = `k8s-apiserver`
  - `Mode` = `TCP (Layer 4)`
  - `Servers` = `k8s-node-x-apiserver` (add one for each real server you created)
  - `Enable Health Checking` = `true`
  - `Health Monitor` = `k8s-apiserver`
- Services > HAProxy | Virtual Services > Public Services
  - `Enabled` = `true`
  - `Name` = `k8s-apiserver`
  - `Listen Addresses` = `10.0.3.2:6443` (the Virtual IP created above; alternatively, the router IP)
  - `Type` = `TCP`
  - `Default Backend Pool` = `k8s-apiserver`
- Services > HAProxy | Settings > Service
  - `Enable HAProxy` = `true`
Note that Health Monitors require `anonymous-auth` to be enabled on Talos; otherwise, we need to rely on TCP health checks instead.
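With everything in place, the whole chain can be checked from any machine on the network; `-k` is needed because the API server certificate will not cover the Virtual IP (addresses from the configuration above):

```sh
# Query the API server health endpoint through HAProxy on the Virtual IP
curl -k https://10.0.3.2:6443/healthz
```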
Cilium is configured to advertise load balancer IPs directly over BGP. Coupled with ECMP, this spreads traffic across the nodes in my cluster.
- Routing > BGP | General
  - `enable` = `true`
  - `BGP AS Number` = `64512`
  - `Network` = `10.0.3.0/24` (subnet of Kubernetes nodes)
  - Save
- Routing > BGP | Neighbors
  - Add a neighbor for each Kubernetes node:
    - `Enabled` = `true`
    - `Peer-IP` = `10.0.3.x` (Kubernetes node IP)
    - `Remote AS` = `64512`
    - `Update-Source Interface` = `SERVER` (VLAN of Kubernetes nodes)
    - Save
  - Continue adding neighbors until all your nodes are present
- Routing > General
  - `Enable` = `true`
  - Save
- System > Settings > Tunables
  - Add `net.route.multipath` and set the value to `1`
  - Save
- Reboot
- Verify at Routing > Diagnostics > BGP | Summary
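On the cluster side, the matching peering can be expressed with a `CiliumBGPPeeringPolicy`. A sketch using the ASN from above — the peer address (`10.0.3.1`), resource name, and service selector are illustrative assumptions:

```sh
# Sketch of a Cilium BGP peering policy matching the OPNsense neighbors above.
# Peer address (10.0.3.1) and resource name are assumptions, not repo values.
cat > cilium-bgp.yaml <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: opnsense
spec:
  virtualRouters:
    - localASN: 64512
      exportPodCIDR: false
      neighbors:
        - peerAddress: "10.0.3.1/32"
          peerASN: 64512
      serviceSelector:
        matchExpressions:
          # A common "match all services" idiom for selectors
          - { key: somekey, operator: NotIn, values: ["never-used"] }
EOF
# kubectl apply -f cilium-bgp.yaml
```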
To be able to send emails from my local devices easily without authentication, I run the Postfix plugin with the following configuration:
- System > Services > Postfix > General
  - `Enable` = `true`
  - `Trusted Networks` += `10.0.0.0/8`
  - `TLS Wrapper Mode` = `true`
  - `SMTP Client Security` = `encrypt`
  - `Smart Host` = `[smtp.purelymail.com]:465`
  - `Enable SMTP Authentication` = `true`
  - `Authentication Username` = `admin@<email-domain>`
  - `Authentication Password` = `<app-password>`
  - `Permit SASL Authenticated` = `false`
  - Save
- System > Services > Postfix > Domains
  - Add new domain:
    - `Domainname` = `<email-domain>`
    - `Destination` = `[smtp.purelymail.com]:465`
  - Save
  - Apply
- System > Services > Postfix > Senders
  - Add new sender:
    - `Enabled` = `true`
    - `Sender Address` = `admin@<email-domain>`
  - Save
  - Apply
- Verify:

```sh
swaks --server 10.0.3.1 --port 25 --to <email-address> --from <email-address>
```
It can happen that an instance becomes corrupted and fails to start; destroying it will trigger the creation of a new one. Run this command, replacing {x} with the instance to destroy:

```sh
kubectl cnpg -n database destroy postgres16 {x}
```
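To identify which instance is failing before destroying anything, the plugin's status command can be used (same cluster and namespace as above):

```sh
# Show cluster topology and per-instance health for the postgres16 cluster
kubectl cnpg -n database status postgres16
```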
It's sometimes useful to make edits inside a PVC or change file permissions:

```sh
task kubernetes:browse-pvc ns=media claim=jellystat
```
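The task presumably boils down to running a throwaway pod with the claim mounted, which you can then `exec` into. A rough manifest equivalent — the image and pod name are assumptions:

```sh
# Manifest for a throwaway pod mounting the jellystat PVC at /data.
# Image (alpine) and pod name are illustrative assumptions.
cat > pvc-browser.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: pvc-browser
  namespace: media
spec:
  containers:
    - name: browser
      image: alpine
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: jellystat
EOF
# kubectl apply -f pvc-browser.yaml
# kubectl -n media exec -it pvc-browser -- sh
```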
I learned a lot from the people who have shared their clusters over at kubesearch and from the Home Operations Discord Community.
Want to get started? I highly recommend that you take a look at the cluster-template repository!