Enable Simulation of automatically provisioned ReadWriteMany PVs #1487
Comments
NFS from an overlayfs requires a 4.15+ kernel IIRC. I don't think we want to start imposing any kernel requirement yet, or the overhead of running & managing NFS by default. kind of course supports installing additional drivers, preferably with CSI. IMHO it makes more sense to run this as an addon. cc @msau42 @pohly. We can discuss other Read* modes upstream in the rancher project. |
Ah, I hadn't had a need for RWM. https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes even I don't think rancher / local storage is going to do read across nodes 😅 probably the best solution here is to document some yaml to apply for getting an NFS provisioner installed on top of a standard kind cluster. |
@BenTheElder ah, didn't know NFS required a newer kernel in this instance. Would it be possible to do something similar with docker volumes instead? The following docker-compose example should back the containers with a shared volume that is consistent-ish:
version: "2.3"
services:
  control-plane0:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc
  worker0:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc
  worker1:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc
volumes:
  rwmpvc:
The output of
Using an approach like this would not require any NFS server to be run internally in the containers. The PV provisioner just needs to consistently derive the host path in a similar way like |
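A minimal sketch of the path-derivation idea, assuming the shared volume is mounted at /rwmpvc in every node container as in the compose example above. The function name and the `namespace_claim` directory layout are hypothetical; nothing in kind or in any existing provisioner is known to work this way.

```shell
# Hypothetical helper: derive the same backing directory on every node
# from the PVC's namespace and name, under the shared mount point.
rwx_host_path() {
  namespace="$1"
  claim="$2"
  root="${3:-/rwmpvc}"   # assumed shared mount point from the compose example
  printf '%s/%s_%s\n' "$root" "$namespace" "$claim"
}
```

Because the result depends only on the namespace and claim name, every node would resolve a given PVC to the same directory, e.g. `rwx_host_path default test-claim` gives `/rwmpvc/default_test-claim`.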
This will work in backends where the nodes are all on a single machine (which we may not guarantee in the future) IF we write a custom provisioner. IMHO it's better to just provide an opt-in NFS solution you can deploy and document it. It should just be a kubectl apply away from installing an NFS provisioner as long as you have an updated kernel. |
Agree, I think an opt-in NFS tutorial would be the best option here for users that need it. We don't have any great options from a sig-storage perspective; most solutions already assume you have an NFS server set up somewhere.
|
I don't know if this is possible but would there be some way to abstract this from the end user using some method of packaging and enabling "addons" similar to minikube? I don't know about the long term goals of I don't know if addons are a clean way of accomplishing this goal but I think the utility of Interested in your thoughts. |
Hi, regarding addons: we're not bundling addons at this time. That approach tends to be problematic for users as it couples the lifecycle of the addons to the version of the cluster tool. SIG Cluster Lifecycle seems to agree and the future of addon work there seems to be the cluster addons project, which involves a generic system on top of any cluster. We're tracking that work and happy to integrate when it's ready #253 In the meantime addons tend to not be any different from any other cluster workload, they can be managed with kubectl, helm, kustomize, kpt, etc. For an example of a more involved "addon" that isn't actually bundled with kind config dependencies see https://kind.sigs.k8s.io/docs/user/ingress/
This gives a rough idea where our priorities are at, which do include supporting this more or less
We have a KubeCon talk about this: https://kind.sigs.k8s.io/docs/user/resources/#testing-your-k8s-apps-with-kind--benjamin-elder--james-munnelly
Clusters have a standard API in
For these you'll want to provide your own wrapper of some sort to ensure that the kind cluster matches your prod more closely (e.g. mimicking the custom storage classes from your prod cluster, trying to run a similar or the same ingress..) |
|
(also confirmed that it works, the kubernetes NFS e2e tests pass) |
requires 4.16 kernel https://www.phoronix.com/scan.php?page=news_item&px=OverlayFS-NFS-Export-Linux-4.16 |
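For anyone unsure whether their host qualifies, here is a rough sketch of a kernel check. It assumes the 4.16 threshold from the link above (an earlier comment recalled 4.15) and only compares major.minor; the function name is made up for illustration.

```shell
# Return success if a kernel release (default: the running kernel) is at
# least the given major.minor version. Only major and minor are compared.
kernel_at_least() {
  required="$1"
  release="${2:-$(uname -r)}"
  req_major="${required%%.*}"
  req_minor="${required#*.}"
  req_minor="${req_minor%%.*}"
  rel_major="${release%%.*}"
  rel_rest="${release#*.}"
  rel_minor="${rel_rest%%[!0-9]*}"   # strip everything after the minor number
  if [ "$rel_major" -gt "$req_major" ]; then return 0; fi
  if [ "$rel_major" -lt "$req_major" ]; then return 1; fi
  [ "${rel_minor:-0}" -ge "$req_minor" ]
}
```

Usage would be something like `kernel_at_least 4.16 || echo "kernel too old for NFS export from overlayfs"`.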
Just did a verification of this feature. I first made sure kubernetes was cloned to
I then built my own node-image using the latest base-image with nfs-common via the following (takes a while!)
kind build node-image --image kindest/node:master --base-image kindest/base:v20200610-99eb0617 --kube-root "${GOPATH}/src/k8s.io/kubernetes"
Next I created a cluster using the new node-image via
kind create cluster --config kind-config.yaml
Using the following kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:master
I then pulled and loaded the nfs-provisioner image to prepare for installation
docker pull quay.io/kubernetes_incubator/nfs-provisioner
kind load docker-image quay.io/kubernetes_incubator/nfs-provisioner
The provisioner could then be installed via Helm (Helm was installed separately).
helm repo add stable https://kubernetes-charts.storage.googleapis.com/
helm install nfs-provisioner stable/nfs-server-provisioner
And I was then finally able to provision an NFS volume via the following PVC
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-dynamic-volume-claim
spec:
  storageClassName: "nfs"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
Everything worked like a charm - looking forward to the next Kind release :) |
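To actually exercise the ReadWriteMany part, one option is to mount the claim from several pods at once. The sketch below prints a Deployment that does this for the PVC above; the `rwx-check` name, the busybox image, the /data mount path and the function name are all illustrative assumptions, not part of the verification described in the comment.

```shell
# Print a Deployment whose replicas all mount the same PVC and each
# append to their own log file on the shared volume.
rwx_check_manifest() {
  claim="${1:-test-dynamic-volume-claim}"
  replicas="${2:-2}"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rwx-check
spec:
  replicas: ${replicas}
  selector:
    matchLabels:
      app: rwx-check
  template:
    metadata:
      labels:
        app: rwx-check
    spec:
      containers:
      - name: writer
        image: busybox
        command: ["sh", "-c", "while true; do date >> /data/\$(hostname).log; sleep 5; done"]
        volumeMounts:
        - name: shared
          mountPath: /data
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: ${claim}
EOF
}
```

It could be applied with `rwx_check_manifest | kubectl apply -f -`. If all replicas reach Running and each sees the other pods' log files under /data, the volume is being shared. On a single-node cluster this only shows multi-pod access, not cross-node access.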
Nice ! I am currently looking for this. When this will be released? |
0.9.0
delayed for various reasons. we'll re-evaluate and set a new target date
soon.
|
@BenTheElder any updates on the new target date? Trying to determine whether to base some internal setup on our own build of kind or whether there will be a release in the near future we can use instead. |
Sorry I missed this comment (sweeping issues now), v0.9.0 was re-scheduled to match k8s v1.19 but some last minute fixes are still pending so we didn't cut the release today (k8s did). I expect to have those merged by tomorrow. |
This is a side note, but it might be useful for someone. When I updated the node image from 18.8 to 19.1, the NFS Helm chart stopped working properly: memory fills up in a few seconds. I investigated the problem and it seems rpc.statd from the nfs-utils package is outdated and is leaking memory. |
that's unfortunate. we're shipping the latest available in the distro at the moment (ubuntu 20.10), if it's fixed in ubuntu we'll pick it up in a future kind image. |
@BenTheElder Now I think it might be something different. This is how I reproduce the issue:
The issue is present when I use the most recent node images:
List of node images that work without problems:
Note: I tried building the latest node image from kind:v0.9.0 sources and it works fine 😕 |
1.19.0 isn't a latest image (please see the kind release notes as usual) and all of the images that are current were built with the same version; there were no changes to the base image or node image build process between those builds and tagging the release. |
@BenTheElder Sorry, I pasted corrected digest
So issue is present for latest node image. I am trying to track down what was changed during latest node images update. |
I'm almost sure it is because of this, but I keep thinking that it is an NFS bug 😄 @koxu1996 you should limit the file descriptors at the OS level |
@aojea Indeed, I bisected KinD commits and this is the culprit: 2f17d25. I use Arch BTW 😆 and the kernel limit on file descriptors is really high:
To work around the NFS issue you can change kernel-level limits, e.g.
or you could use a custom node image. Edit: I asked the nfs-utils maintainer about this bug and got the following reply:
|
looks like libtirpc is not packaged yet. I'm not sure how we want to proceed here |
I think we should try to make sure libtirpc is updated and document how to setup an NFS provisioner, I'm not sure if this is in scope to have in the default setup, but it's certainly in scope to put a guide on the site. |
This fails with an error saying,
|
well first of all you should not need to build new node images, we've had multiple releases since #1487 (comment), they already contain all of the changes. ... and the reason that's failing is the base image specified in the command in that comment is very outdated versus current kind. you can skip all the image building steps, NFS should just work now, we run NFS tests in CI. There's no changes to kind needed, just the cluster objects installed at runtime for your NFS service / PVs. |
Hey @BenTheElder, thanks for the comment, but when I tried using the storage class nfs, the PVC went into the Pending state; describing the PVC showed that the storage class "nfs" doesn't exist. I understand that you have suggested running an NFS server somewhere, but my question is: in the current version of kind, can we (once I have my NFS server) create PVCs with access mode ReadWriteMany? I went through the issues in order to find something on this but was not able to. Any help or suggestions would be wonderful |
yes, we don't have the storage class because that has to refer to a specific NFS setup, and that's something you can choose and install at runtime
yes, #1487 (comment) starting from "I then pulled and loaded the nfs-provisioner image to prepare for installation" is still relevant as one approach. The part before that with the custom image is not.
Yes, in any version nfs has readwritemany, it's just that NFS could not work in a nested container environment when the project was started (issues in the linux kernel actually, not in kind itself). It can now. (see also: #1806) I don't specifically work with this, but NFS in kind is not special (versus another cluster tool) anymore. We just need someone to document doing this. |
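Since the Pending PVC above came down to the "nfs" storage class not existing yet, a quick pre-check before creating the claim may help. This is only a sketch: the function name is made up, and it assumes the usual `storageclass.storage.k8s.io/<name>` lines printed by `kubectl get storageclass -o name`.

```shell
# Succeeds if the named StorageClass appears in the output of
# `kubectl get storageclass -o name`, which is read from stdin.
has_storage_class() {
  grep -qxF "storageclass.storage.k8s.io/$1"
}
```

For example: `kubectl get storageclass -o name | has_storage_class nfs || echo "install an NFS provisioner first"`. Reading from stdin keeps the check independent of any particular cluster or kubeconfig.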
Thanks I am still in a phase of understanding and learning about K8, and many thanks to the devs and contributors of kind , I will see how to do it thanks, 😃 |
Hi @BenTheElder, thanks for all the guidance and ideas. I was able to deploy an NFS server with RWM mode on a Linux system using the steps that you and other devs indicated. But now that I am trying to move the same setup to a Mac (Docker Desktop), I can see that the pod for the NFS provisioner is failing with (upon describing)
And then it eventually gets into a crash loop. I found this answer suggesting some change, but I would like to understand what exactly has changed between the two systems. Could it be because of the resources? On the Linux system the kind cluster was flying with 24 GB of RAM, but here on the Mac it has 6 CPUs, 4 GB of memory, 2 GB of swap and a 200 GB HDD. Thanks |
You should also consider running fewer nodes. kind tries to be as light as possible, but kubeadm recommends something like 2 GB per node for a more typical cluster IIRC 😅 Kubernetes does not yet use swap effectively, and actually officially requires it to be off, though we set an option to make it run anyhow. node.kubernetes.io/not-ready is not a taint you should have to remove, and kind in general should not require you to manually remove taints ever; this means the nodes are not healthy (which is a very general symptom). EDIT: If you need more help with that, please file a different issue for your case since it's not related to RWM PVs, so folks monitoring this can avoid being notified spuriously, and so we can track your issue directly. We can cross-link them for reference. The new issue template also requests useful information for debugging. |
On the off chance anyone is still watching this, local-path-provisioner has supported RWX volumes for a few releases now, and with v0.0.27 now supports multiple storage classes with a single deployment. Unless I've overlooked something, I think it should be reasonable to automatically create a RWX storage class for single-node clusters. To support multi-node clusters, that could be accomplished by mounting the same host volume to the same location in each node container, and that could be provided by a new field in the configuration. This would even support future multi-host setups if the user is made responsible for mounting network storage at that location on each host out-of-band. I would be happy to start work on a PR for this if the idea isn't rejected out of hand. |
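As a rough illustration of the multi-node idea, the same host directory can already be mounted into every node container with kind's existing `extraMounts` field (not the new configuration field proposed above). The sketch below generates such a config; the function name and the example paths are assumptions, and it only helps while all nodes run on one host.

```shell
# Print a kind config with one control-plane node and N workers, each
# mounting the same host directory at the same path inside the node.
kind_shared_mount_config() {
  host_dir="$1"
  node_dir="$2"
  workers="${3:-2}"
  echo "kind: Cluster"
  echo "apiVersion: kind.x-k8s.io/v1alpha4"
  echo "nodes:"
  i=0
  while [ "$i" -le "$workers" ]; do
    if [ "$i" -eq 0 ]; then role="control-plane"; else role="worker"; fi
    echo "- role: ${role}"
    echo "  extraMounts:"
    echo "  - hostPath: ${host_dir}"
    echo "    containerPath: ${node_dir}"
    i=$((i + 1))
  done
}
```

Something like `kind_shared_mount_config "$HOME/kind-rwx" /var/local-path-provisioner-rwx 2 > kind-config.yaml` followed by `kind create cluster --config kind-config.yaml` would then give every node the same directory at the same location for a provisioner to use.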
@meln5674 I created a workaround in my environment for this:
As long as you use only one node configuration this works totally fine. See https://github.com/rancher/local-path-provisioner?tab=readme-ov-file#definition for details |
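The original workaround is not shown here, so the following is only a guess at its general shape, based on my reading of the linked README: local-path-provisioner's config.json appears to accept a `sharedFileSystemPath` entry in place of `nodePathMap`, which is what enables access modes beyond ReadWriteOnce. Treat the field name, the default path and the function name as assumptions and check them against the README before relying on them.

```shell
# Print a candidate config.json for local-path-provisioner that points
# it at a directory assumed to be visible from every node.
local_path_shared_config() {
  shared_dir="${1:-/var/local-path-provisioner-rwx}"   # hypothetical default
  cat <<EOF
{
  "sharedFileSystemPath": "${shared_dir}"
}
EOF
}
```

The output would go into the provisioner's ConfigMap as `config.json`. On a single-node cluster any directory on the node would do, which matches the remark above that this works fine with only one node.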
What would you like to be added: A method to provide automatically provisioned ReadWriteMany PVs that are available on all workers.
Currently the storage provisioner that is being used can only provision
Why is this needed: The current volume provisioner only supports creating ReadWriteOnce volumes. This is because kind is using the rancher local-path-provisioner, and they hard-code their provisioner to disallow any PVCs with an access mode other than ReadWriteOnce. Many managed Kubernetes providers supply some type of distributed file system. I'm currently using Azure Storage File (which is SMB/CIFS under the hood) for this use case in production. Google's Kubernetes Engine offers ReadOnlyMany out of the box.
Possible solutions: Could we have the control plane node start up an NFS container backed by a ReadWriteOnce volume?
Thanks for your time!