Use an alternative machine bootstrap flag probing strategy (no SSH) #230
Comments
Since this issue is a huge blocker for us, I'd be really happy to discuss this topic and actively work with you to find any possible alternative or optional strategy to the Sentinel File Check. |
This comment seems to show that the authors are aware of the limitation posed by the SSH strategy:
I'm working now on a proposal to (first) allow CAPK to optionally check a generic HTTP endpoint on every VM (using TLS 1.3 / a PSK key for each VM, replicating the same model used for SSH keys), and to define a new (optional) TLS contract between CAPK and the VM. We'll (first) skip the server-side implementation & injection in the VM and only focus on a CAPK check feature, leaving the server side up to integration teams & final users. |
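For illustration only, here is a minimal Go sketch of what such a controller-side check could look like. It assumes a hypothetical read-only endpoint (https://&lt;vm-ip&gt;:8443/bootstrap-status) served by each VM, and it uses mutual TLS with per-VM client certificates instead of the TLS 1.3 / PSK scheme mentioned above, because Go's standard crypto/tls package does not expose external PSKs. The endpoint path, port, file names and secret layout are all assumptions, not an existing CAPK contract.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"os"
	"time"
)

// checkBootstrapEndpoint polls a hypothetical read-only HTTPS endpoint exposed
// by the VM and treats a 200 OK answer as "bootstrap succeeded", standing in
// for the sentinel file check. Endpoint, port and file names are assumptions.
func checkBootstrapEndpoint(vmAddr string, caPEM, certPEM, keyPEM []byte) (bool, error) {
	// Per-VM client certificate, mirroring the "one secret per VM" model used
	// today for SSH keys (an assumption, not an existing CAPK contract).
	clientCert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return false, fmt.Errorf("loading client cert: %w", err)
	}
	caPool := x509.NewCertPool()
	if !caPool.AppendCertsFromPEM(caPEM) {
		return false, fmt.Errorf("invalid CA bundle")
	}

	client := &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				MinVersion:   tls.VersionTLS13,
				RootCAs:      caPool,
				Certificates: []tls.Certificate{clientCert},
			},
		},
	}

	// Hypothetical contract: the VM serves its bootstrap status read-only here.
	resp, err := client.Get(fmt.Sprintf("https://%s:8443/bootstrap-status", vmAddr))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

func main() {
	ca, _ := os.ReadFile("ca.pem")
	cert, _ := os.ReadFile("vm-client.pem")
	key, _ := os.ReadFile("vm-client-key.pem")
	ok, err := checkBootstrapEndpoint("10.0.0.12", ca, cert, key)
	fmt.Println(ok, err)
}
```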
Hey, I'm back after a (not so) long silence :)! We spent some time investigating more ways to poll the Sentinel file status, and I think we now have a really elegant candidate to propose. No SSH involved, no HTTP either, nor any other remote call to the VM to make it work! If you dive deep enough into KubeVirt you'll see that it provides some sort of direct probing into the VM: guest-agent ping & exec probes. It relies on the presence of qemu-guest-agent in the VM (kv guest agent). With this feature, the virt-launcher pod wraps & relays the probe execution up to the VM. My proposal is now to have the CAPK controller ask the KubeVirt apiserver to execute the exact same kind of check to probe the Sentinel File right inside the VM. Everything is handled by the apiserver with nothing else involved. This is really simple & elegant IMO. You can already validate the feasibility by running:
bash-5.1$ virt-probe --command cat /run/cluster-api/bootstrap-success.complete --domainName kaas_capiovn-cp-2dnzx
success
And it also works perfectly directly with:
kubectl exec -n kaas virt-launcher-capiovn-cp-2dnzx-hp9zj -- virt-probe --command cat /run/cluster-api/bootstrap-success.complete --domainName kaas_capiovn-cp-2dnzx
success
Do you think this approach is relevant? Milestones to be achieved to make this proposal OK:
|
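As a rough sketch of the approach described in the comment above, the same check can be replayed from a Go controller by calling the pod exec subresource on the virt-launcher pod, exactly like the kubectl exec command shown. The namespace, pod and domain names are copied from that example and purely illustrative; virt-probe is an internal KubeVirt tool whose flags may change, and a real CAPK implementation might prefer a dedicated KubeVirt API over pod exec.

```go
package main

import (
	"bytes"
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/remotecommand"
)

// probeSentinelFile replays the `kubectl exec ... virt-probe ...` call shown
// above: it execs virt-probe inside the virt-launcher pod so that the guest
// agent reads the CAPI sentinel file inside the VM.
func probeSentinelFile(ctx context.Context, cfg *rest.Config, cs kubernetes.Interface,
	namespace, launcherPod, domain string) (string, error) {

	req := cs.CoreV1().RESTClient().Post().
		Resource("pods").
		Namespace(namespace).
		Name(launcherPod).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Command: []string{
				"virt-probe",
				"--command", "cat", "/run/cluster-api/bootstrap-success.complete",
				"--domainName", domain,
			},
			Stdout: true,
			Stderr: true,
		}, scheme.ParameterCodec)

	exec, err := remotecommand.NewSPDYExecutor(cfg, "POST", req.URL())
	if err != nil {
		return "", err
	}
	var stdout, stderr bytes.Buffer
	if err := exec.StreamWithContext(ctx, remotecommand.StreamOptions{
		Stdout: &stdout,
		Stderr: &stderr,
	}); err != nil {
		return "", fmt.Errorf("exec failed: %v (stderr: %s)", err, stderr.String())
	}
	return stdout.String(), nil
}

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)

	// Names below are copied from the example in the comment and are illustrative.
	out, err := probeSentinelFile(context.Background(), cfg, cs,
		"kaas", "virt-launcher-capiovn-cp-2dnzx-hp9zj", "kaas_capiovn-cp-2dnzx")
	fmt.Println(out, err)
}
```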
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
Another use case: we are interested in using the Talos bootstrap/control-plane providers with this infrastructure provider. Since Talos does not use SSH, any dependency on SSH would be a hurdle for this idea. |
maybe it's time to go forward on this topic and finally remove any ssh requirement in capk ? |
For now I am setting |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/remove-lifecycle rotten |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
Please send feedback to sig-contributor-experience at kubernetes/community. /close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What steps did you take and what happened:
At our company, we are building a highly secure infrastructure with several constraints imposed by the French/European sovereign cloud label. These constraints led us to place the CAPI management cluster in a dedicated network and the managed clusters in other ones.
In line with these rules, we recently blocked all traffic between the CAPI cluster network and the target managed cluster networks (and also disabled the SSH daemons in all our VMs).
To schedule and manage cluster lifecycles, we expected CAPI/CAPK to reach each managed cluster's apiserver only through its exposed load-balanced endpoint, which is open to the rest of the network via the underlying KubeVirt LB capabilities.
In fact, we discovered (here) that CAPK requires direct SSH access to the VM IP in order to validate CAPI Machine bootstrap success (using the CAPI sentinel file convention).
This also seems to be the only SSH command I've found in the whole CAPK source code.
With this restriction in place, CAPK is never able to correctly provision a single KubeVirt VM, because the VM bootstrap is never acknowledged.
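For readers who have not seen it, the check being discussed has roughly this shape: open an SSH session to the VM and read the CAPI sentinel file. The Go snippet below is a simplified illustration of that pattern, not the actual CAPK code; the SSH user name and the relaxed host key handling are assumptions made to keep the sketch short.

```go
package main

import (
	"fmt"
	"os"
	"time"

	"golang.org/x/crypto/ssh"
)

// sentinelFileExists shows the general shape of an SSH-based check: dial the
// VM, run one command, and treat success as "bootstrap done". This is a
// simplified illustration, not the actual CAPK implementation.
func sentinelFileExists(vmIP string, signer ssh.Signer) (bool, error) {
	cfg := &ssh.ClientConfig{
		User:            "capk", // assumption: the real user name may differ
		Auth:            []ssh.AuthMethod{ssh.PublicKeys(signer)},
		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // simplification for the sketch
		Timeout:         5 * time.Second,
	}
	client, err := ssh.Dial("tcp", vmIP+":22", cfg)
	if err != nil {
		return false, err
	}
	defer client.Close()

	session, err := client.NewSession()
	if err != nil {
		return false, err
	}
	defer session.Close()

	// The CAPI sentinel file written on the node when bootstrap succeeds.
	if _, err := session.CombinedOutput("cat /run/cluster-api/bootstrap-success.complete"); err != nil {
		return false, nil // file missing or unreadable: bootstrap not acknowledged yet
	}
	return true, nil
}

func main() {
	keyPEM, err := os.ReadFile("vm-ssh-key.pem") // illustrative key location
	if err != nil {
		panic(err)
	}
	signer, err := ssh.ParsePrivateKey(keyPEM)
	if err != nil {
		panic(err)
	}
	ok, err := sentinelFileExists("10.0.0.12", signer)
	fmt.Println(ok, err)
}
```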
The CAPI specification leaves the infrastructure providers free to choose the sentinel file verification methodology.
So I'd like to start a discussion and try to find solutions that avoid such SSH connections, which are a very sensitive topic for us.
In the end, I'd love to have a more "read-only" & auditable/secure way to check Machine bootstrap status.
Possible answers could be:
What did you expect to happen:
In order to comply with "government-tier" security rules, we'd expect CAPK not to use any SSH remote access to check machine bootstrap success. Allowing a single component to hold SSH keys and reach every VM of the Kubernetes infrastructure breaks our required legal security compliance.
We think that retrieving the sentinel file status should rather be done with a read-only remote strategy, using a less privileged and less interactive protocol than SSH.
Environment:
- Kubernetes version (use kubectl version): 1.26.2
- OS (e.g. from /etc/os-release): Ubuntu 22.04.2 LTS
/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt/labels?q=area for the list of labels]