Conversation
A segment that is not contiguous with the previous one is a different extent.
fix bug in FibmapExtents function
Force-pushed from cd1ff14 to 03d1d79.
Force-pushed from 3f20187 to 3b83db4.
We cannot always vendor some upstream component. If we can't, then we can fork it by copying the files into "third-party", either with plain "cp" or "git subtree", and then add our own changes on top of it.
…d92e31ef976 Forked because upstream author seems inactive (last message is an apology for being inactive: frostschutz/go-fibmap#1 (comment)), so we need to maintain this ourselves.
git-subtree-dir: third-party/go-fibmap
git-subtree-mainline: c451fa6
git-subtree-split: b32c231
The resulting file can be used as backing store for a QEMU nvdimm device. This is based on the approach used for the Kata Container rootfs (https://github.com/kata-containers/osbuilder/blob/dbbf16082da3de37d89af0783e023269210b2c91/image-builder/image_builder.sh) and reuses some of the same code, but also differs from it in some regards:
- The start of the partition is aligned to a multiple of the 2MiB huge page size (kata-containers/runtime#2262 (comment)).
- The size of the QEMU object is the same as the nominal size of the file. In Kata Containers the size is a fixed 128MiB (kata-containers/osbuilder#391 (comment)).
The test needs files that are prepared as part of cluster creation, without running in that cluster itself.
Same change as inside the PMEM-CSI driver itself: we have to ensure that reflink is off because it is incompatible with "-o dax".
The implementation already worked like that, it just wasn't documented and thus it was unknown whether reusing the directory also for other local state (like the upcoming extra volume mounts) is okay.
@devimc: this PR now has testing against, and instructions for, Kata Containers 1.11.0-rc0. Can you perhaps check the changes in …? More feedback is of course also welcome 😀 I'm pushing this while local tests are still running, but hopefully the new tests work now.
kata-deploy fails in our CI (https://cloudnative-k8sci.southcentralus.cloudapp.azure.com/view/pmem-csi/job/pmem-csi/view/change-requests/job/PR-500/29/console):
I'll check tomorrow why it fails there. It worked for me locally.
Same error also for other nodes. It looks like we don't have nested virtualization enabled in the Azure VM. Let me see whether I can change that...
thanks @pohly - lgtm
docs/design.md (outdated)
This gets solved as follows:
- PMEM-CSI creates a volume as usual, either in direct mode or LVM mode.
- Inside that volume it sets up an ext4 filesystem.
jfyi - xfs is also supported
Fixed.
Progress (?): without the …
@devimc explained on IRC that triple nesting of VMs makes the inner QEMU so slow that the kubelet -> CRI communication times out. This means we cannot test with Kata Containers in the current Azure CI. I'll make the tests optional. Until we have automatic testing on real hardware (BMaaS!), we'll simply have to test them manually from time to time on real hardware to detect regressions.
Force-pushed from 2060858 to a6e0943.
Tests are clean now after disabling the Kata Containers tests in our Azure CI. @avalluri: okay to merge?
reclaimPolicy: Delete
volumeBindingMode: Immediate
As per the [app yaml](https://github.com//pull/500/files#diff-17203e3a5882efb0c1944558564a9c52R10-R11), the application is expected to run only on nodes with katacontainers.io/kata-runtime: "true". So, if we use immediate binding mode, we might end up creating the volume on the wrong node?
True. I'd better switch this to "late binding".
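In Kubernetes terms, "late binding" means `volumeBindingMode: WaitForFirstConsumer`, which delays provisioning until a pod using the claim is scheduled, so the volume lands on the node the pod was placed on. A sketch of such a storage class (the class name is illustrative; `pmem-csi.intel.com` is assumed to be the driver's provisioner name):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pmem-csi-sc-late-binding   # illustrative name
provisioner: pmem-csi.intel.com
reclaimPolicy: Delete
# Late binding: provision only after a consuming pod is scheduled,
# so node selectors like katacontainers.io/kata-runtime are honored.
volumeBindingMode: WaitForFirstConsumer
```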
docs/design.md (outdated)

space available in the volume.
- That partition is bound to a `/dev/loop` device and the formatted
  with the requested filesystem type for the volume.
- When an applications needs access to the volume, PMEM-CSI mounts
typo: an applications -> an application
@pohly looks good to me. Just go ahead and merge after fixing the storage classes and the documentation nit. |
A persistent or ephemeral volume can be prepared either for DAX-enabled applications that don't run under Kata Containers (the default) or for DAX-enabled applications that run under Kata Containers. In both cases the volume can be used with and without Kata Containers; it's just that DAX only works either inside or outside of Kata Containers.

The Kata Container runtime must be able to access the image file while it is still mounted, therefore we cannot use something inside the target dir as mount point, because then the image file would be shadowed by the mounted filesystem. We already have a local state dir for .json files. Putting something else inside it might confuse the state code, so instead we create a second directory with ".mount" appended to the directory name and use that for mount points. We also have to enable bi-directional mount propagation for it, because otherwise the mounted fs with the image file is only visible inside the container.
It can happen that Kubernetes comes up, but something else (like Kata Containers) doesn't. In that case "kubectl get all" may provide some hint.
Two minutes was enough locally, but not for the CI.
"make start" in an empty _work failed with: tar zxf _work/govm_0.9-alpha_Linux_amd64.tar.gz -C _work/bin/ tar: _work/bin: Cannot open: No such file or directory
Due to a race condition (?), kata-deploy fails in the CI because /etc/crio/crio.conf didn't exist at the time that it ran:

$ kubectl logs -n kube-system kata-deploy-2dh2f
copying kata artifacts onto host
Add Kata Containers as a supported runtime for CRIO:
cp: cannot stat '/etc/crio/crio.conf': No such file or directory

Somehow it worked locally.
Even with the "-vmx" override in the Jenkinsfile removed, nested virtualization with three levels (Azure HyperV -> QEMU (govm) -> QEMU (Kata Containers)) was not working well enough for Kata Containers: because the inner VM runs very slowly, there are timeouts in the communication between kubelet and Kata Containers ("container create failed: Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing"). That means that testing with Kata Containers has to be limited to bare metal. To achieve that, it's turned off by default (and thus in the CI, which only runs on Azure) and has to be enabled with TEST_KATA_CONTAINERS_VERSION=1.11.0-rc0 or by invoking test/setup-kata-containers.sh manually.
Contains support for creating image files, installing Kata Containers and running E2E tests with that. However, those tests need to be run manually on bare-metal hosts because triple-nested virtualization (like we would have to do in our Azure CI) is too slow.