Replies: 14 comments
-
A few updates from further testing today.
-
I booted using the following script and VirtualBox:
#!/bin/bash
IMG=quay.io/kairos/core-rockylinux:v2.3.1
docker run --rm -ti --net host quay.io/kairos/auroraboot \
  --set "container_image=$IMG"
I got the same rsync error as in the description. I copied the latest I get tons of rsync output which ends in this:
-
Something that may or may not be relevant. I also can't scp (or ssh) to the VM:
If the network stack of the VM is not working for whatever reason, that could explain why rsync is failing. Again, this may be irrelevant but good to keep in mind.
-
Simply changing to
The config I used:
Something seems to be wrong with the network in general in rockylinux.
-
@mudler suggested that SELinux might be causing the issues. To test, we can just disable it, either manually in the cmdline (through grub editing mode) or through AuroraBoot:
Docs: https://kairos.io/docs/reference/auroraboot/#configuration
Let's give this a try and see.
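For anyone trying the manual route, here is a minimal sketch for confirming from a shell on the booted node that SELinux really was switched off on the kernel command line. It assumes the standard `selinux=0` kernel parameter; the helper name `cmdline_disables_selinux` is made up for illustration, and this does not show the AuroraBoot setting itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kairos or AuroraBoot): succeeds when the
# given kernel command line contains selinux=0 as a whole parameter.
cmdline_disables_selinux() {
    case " $1 " in
        *" selinux=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a running system; skipped where /proc/cmdline is not available.
if [ -r /proc/cmdline ]; then
    if cmdline_disables_selinux "$(cat /proc/cmdline)"; then
        echo "selinux=0 is present on the kernel command line"
    else
        echo "selinux=0 is not present on the kernel command line"
    fi
fi
```

If the parameter is missing after a netboot, the setting most likely never reached the boot entry.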
-
I had a test and without the
-
This should fix booting not only rockylinux but anything else that has SELinux enabled by default: kairos-io/AuroraBoot#36
-
FYI, this looks good to me. The only thing missing from it is removing /etc/machine-id during building, so you can deploy this to several nodes without issues. The machine-id is used to identify the machine on DHCP, so it should be unique. We generate one on boot if there is none and back it up on persistent so machines get the same IP across reboots.
-
That is a stripped-down base used for new VMs. Each new VM needs a new ID so they don't conflict. It's probably unique to our use case.
-
Thank you! I super appreciate you guys hopping on this and getting it figured out. To help me plan the next day or two, when should I expect a fixed version of AuroraBoot to be available? Looks like the PR failed when the ARM disk image creation failed. :) (I know your pain.) Once that is fixed, will a new nightly get kicked out I can play with?
-
As soon as I can get the test fixed and it's merged, we will cut a new AuroraBoot version :) So expect it today.
-
@sarg3nt if you are having issues with this, you can work around it in the meantime by passing this flag to AuroraBoot:
That disables SELinux from the iPXE boot.
-
I would clean it up directly in the Dockerfile. There is already a config in the default yip configs to generate a new one and store it in persistent on boot, so removing it should not cause problems. In fact, we remove it completely from our base images: https://github.com/kairos-io/kairos/blob/master/Earthfile#L393
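As a rough sketch of that cleanup step, assuming the image build can run a shell command against the image root (in a Dockerfile this would typically be a single `RUN rm -f /etc/machine-id` step). The helper name `clear_machine_id` and the temporary demo directory are made up for illustration; this is not the code from the Earthfile linked above.

```shell
#!/bin/sh
# Hypothetical helper: remove etc/machine-id under the given image root
# (defaults to /). rm -f makes it a no-op when the file is already gone.
clear_machine_id() {
    root="${1:-/}"
    rm -f "${root%/}/etc/machine-id"
}

# Demo against a throwaway directory instead of the real root filesystem.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/etc"
echo "0123456789abcdef0123456789abcdef" > "$demo_root/etc/machine-id"
clear_machine_id "$demo_root"
if [ ! -e "$demo_root/etc/machine-id" ]; then
    echo "machine-id removed"
fi
rm -rf "$demo_root"
```

Since a new ID is generated on boot when none exists, as described above, each node deployed from the image should end up with its own.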
-
It seems that the original issue was fixed with
-
Kairos version:
core-rockylinux:v2.3.x
CPU architecture, OS, and Version:
AMD 64, Rocky Linux Core
Describe the bug
cloud_init.yaml file but works fine when an ISO is created with AuroraBoot and a VM uses said ISO to install. This is regardless of the cloud_init.yaml file that is provided, assuming said file is valid. If you try to deploy this image with AuroraBoot you will receive the error shown in the following screenshot, where it complains that rsync finished with errors: exit status 23 and Failed to start the Kairos installer.
cloud_init.yaml file provided. Many valid cloud_init.yaml options will not work, however, and it is VERY temperamental. Again, an ISO generated with AuroraBoot and uploaded manually to vSphere, then a VM created with said ISO, works fine every time. When netbooting with AuroraBoot does not work you will get the error shown in the following screenshot, where it complains that it cannot unmount /run/cos/active and /run/cos/state and the Kairos installer fails.
I need to test some sanitized cloud_init.yaml configs that work and break it before I can provide examples of this, but will get those to you soon.
cloud_init.yaml file with the core-opensuse-leap:v2.3.1 image via netboot with AuroraBoot worked fine. However, during this testing I ran into an issue where it would not deploy, as copying of assets was taking forever with DRACUT messages showing slow progress, and it eventually timed out. Undeploying and redeploying the AuroraBoot image fixed it. This is probably unrelated to the core issue here though.
I've added a lot more resources to the AuroraBoot VM, including drive space, memory and CPUs, and they had no effect (4 CPU cores, 32 GB of RAM and an 80 GB boot drive).
Note: The AuroraBoot image is based on a Rocky Linux OS built with Kairos with Docker installed.
cloud_init.yaml file lately.
Note: This defect has been discussed in the Slack channel in this thread
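For reading the rsync error above: in rsync's documented exit values, 23 means "partial transfer due to error", i.e. some files or attributes could not be transferred while the rest went through. A small sketch that maps a few of those documented codes to text; the function name `describe_rsync_exit` is made up for illustration and only a handful of codes are covered.

```shell
#!/bin/sh
# Hypothetical helper: translate a few documented rsync exit values into text.
describe_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        30) echo "timeout in data send/receive" ;;
        *)  echo "other rsync error (see EXIT VALUES in the rsync man page)" ;;
    esac
}

describe_rsync_exit 23   # prints: partial transfer due to error
```

The code alone does not say which files failed; the rsync output just before the exit message is where to look for that.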