-
Hi all, I'm testing a very basic clone of this playbook with a few defaults changed. The error I'm seeing is this; it seems the Jinja templating is breaking at
I can confirm that the kube-vip instance is running and that the script fails due to the issue above.
-
Dug a bit deeper and the issue is elsewhere; this is on one of the master nodes:
-
Hi, can you please fill out the issue template that was supplied when you created the issue? Thank you!
-
Expected Behavior
According to the YouTube video, at least, the master nodes join the main node which runs

Current Behavior
This does not happen; instead the 2nd and 3rd master nodes are unable to connect to the main (primary) master node because the CA certs are missing.

Steps to Reproduce
Run the playbook with the defaults; this error should occur.

Context (variables)
Operating system: Debian 11
Hardware: VM: 16GB RAM / 2 vCPU / 40GB disk

Variables Used
k3s_version: v1.24.10+k3s1
ansible_user: NA
systemd_dir: /etc/systemd/system
# interface which will be used for flannel
flannel_iface: "eth0"
# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "10.0.3.85"
k3s_token: "NA"
# these arguments are recommended for servers as well as agents:
extra_args: >-
--flannel-iface={{ flannel_iface }}
--node-ip={{ k3s_node_ip }}
# change these to your liking, the only required are: --disable servicelb, --tls-san {{ apiserver_endpoint }}
extra_server_args: >-
{{ extra_args }}
{{ '--node-taint node-role.kubernetes.io/master=true:NoSchedule' if k3s_master_taint else '' }}
--tls-san {{ apiserver_endpoint }}
--disable servicelb
--disable traefik
extra_agent_args: >-
{{ extra_args }}
# image tag for kube-vip
kube_vip_tag_version: "v0.5.7"
# image tag for metal lb
metal_lb_frr_tag_version: "v7.5.1"
metal_lb_speaker_tag_version: "v0.13.7"
metal_lb_controller_tag_version: "v0.13.7"
# metallb ip range for load balancer
metal_lb_ip_range: "10.0.3.90-10.0.3.100"

Hosts
[master]
10.0.3.79
10.0.3.80
10.0.3.81
[node]
10.0.3.82
10.0.3.83
# only required if proxmox_lxc_configure: true
# must contain all proxmox instances that have a master or worker node
# [proxmox]
# 192.168.30.43
[k3s_cluster:children]
master
node

Possible Solution
I was planning on setting up self-signed certs to see if that would work, but I'm just confused as to why this wasn't experienced when you made the video :). Thanks Tim!

Observations
FYI, I also noticed another error and fixed it by running
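For reference, a quick way I plan to confirm whether the CA material actually exists on the first master is a small ad-hoc play like the one below. This is only a sketch: it assumes the stock k3s layout where the server TLS files live under /var/lib/rancher/k3s/server/tls/, and check-ca.yml is just a placeholder name.

# check-ca.yml - ad-hoc helper, not part of the playbook
- hosts: master
  become: true
  tasks:
    - name: Check whether the k3s server CA cert exists on this master
      ansible.builtin.stat:
        path: /var/lib/rancher/k3s/server/tls/server-ca.crt
      register: server_ca

    - name: Report the result per host
      ansible.builtin.debug:
        msg: "server-ca.crt present: {{ server_ca.stat.exists }}"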
-
Removing the 2nd/3rd masters and trying this now. This passed the initial failure point.
However, it is now failing at
I find it strange that it is trying to fetch the CA cert (which doesn't exist anyway, as far as I'm aware) from the localhost address - ideas?
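While waiting for ideas, a minimal reachability check I'm thinking of running is below. It's only a sketch: it assumes the apiserver_endpoint variable from the config above and that kube-vip should be answering the API port (6443) on that address, and check-vip.yml is just a placeholder name.

# check-vip.yml - ad-hoc helper, not part of the playbook
- hosts: master
  tasks:
    - name: Confirm the kube-vip virtual IP answers on the API port
      ansible.builtin.wait_for:
        host: "{{ apiserver_endpoint }}"
        port: 6443
        timeout: 10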
-
In my case I had the same failure point; the steps that helped me:
make sure each host has a unique hostname,
make sure the hosts do not have any firewall rules blocking traffic (on any port).
A sketch of how I check the hostname part is below.
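It's only a sketch: it assumes the k3s_cluster group from the inventory above and that fact gathering is enabled, and unique-hostnames.yml is just a placeholder name. The firewall part I usually verify by hand on each node (ufw status, iptables -L) before running the playbook.

# unique-hostnames.yml - ad-hoc check, not part of the playbook
- hosts: k3s_cluster
  gather_facts: true
  tasks:
    - name: Fail if two hosts in the cluster report the same hostname
      run_once: true
      ansible.builtin.assert:
        that:
          - groups['k3s_cluster'] | map('extract', hostvars, 'ansible_hostname') | list | unique | length == groups['k3s_cluster'] | length
        fail_msg: At least two hosts share a hostname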
-
Thanks @bornav, let me double-check the local firewall.
-
@bornav your tip on checking the local firewall was spot on. I have another Ansible script I run on all my VMs as a "metal prep" type playbook that adds basic security and config; it enables UFW and the base config locks down all ports other than SSH. The test cluster boots up just fine now, thanks!
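For anyone else who hits this with a similar UFW lockdown, this is roughly what I added to the prep playbook. It's only a sketch: the port list is taken from the k3s networking requirements, it assumes the community.general collection is installed, and you will probably want to restrict each rule to the other nodes' addresses rather than allowing from anywhere.

# added to the "metal prep" playbook - opens the ports k3s needs
- hosts: k3s_cluster
  become: true
  tasks:
    - name: Allow the ports k3s needs between cluster nodes
      community.general.ufw:
        rule: allow
        port: "{{ item.port }}"
        proto: "{{ item.proto }}"
      loop:
        - { port: "6443", proto: "tcp" }       # Kubernetes API server
        - { port: "2379:2380", proto: "tcp" }  # embedded etcd (HA masters)
        - { port: "10250", proto: "tcp" }      # kubelet metrics
        - { port: "8472", proto: "udp" }       # flannel VXLAN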