This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

Kata Container support #500

Merged
merged 31 commits into from
May 14, 2020
b48fa8d
fibmap package
frostschutz Feb 12, 2014
a187192
fibmap as dedicated C file
frostschutz Feb 12, 2014
f41b08f
figetbsz()
frostschutz Feb 13, 2014
229abe3
fibmap()
frostschutz Feb 13, 2014
480b3ee
fiemap()
frostschutz Feb 14, 2014
26f71ce
Datahole()
frostschutz Feb 15, 2014
db66097
delete C file, stick with native (unsafe) Go implementation for now
frostschutz Feb 16, 2014
d324314
FibmapFile type, to keep track of extents and offsets in the future
frostschutz Feb 17, 2014
316ee4d
FibmapExtents() emulate FIEMAP with FIBMAP
frostschutz Feb 18, 2014
27496d1
Fallocate()/PunchHole()
frostschutz Feb 19, 2014
7b72b12
MixedCaps for local constants
frostschutz Feb 21, 2014
28cd1bf
initialize repository
frostschutz Apr 25, 2014
77fdb38
MIT license
frostschutz Apr 25, 2014
dccaece
Update fibmap.go
chenzhongtao Aug 11, 2016
b32c231
Merge pull request #2 from chenzhongtao/master
frostschutz Aug 25, 2016
be6fe65
third-party: support forking of upstream components
pohly Dec 17, 2019
18b96aa
third-party: add go-fibmap from commit 'b32c231bfe6a911d413c4a240f560…
pohly Mar 31, 2020
997a767
imagefile: embed namespace and filesystem inside a file
pohly Dec 16, 2019
063bb1e
CI: integrate imagefile testing
pohly Dec 18, 2019
f26ee12
imagefile: disable reflink for XFS
pohly Jan 10, 2020
9589d62
imagefile: remove second MBR
pohly Jan 10, 2020
24d8a58
imagefile: allow allocating file with maximum size
pohly Jan 14, 2020
16409d1
Merge branch 'fix-kustomize' into HEAD
pohly May 8, 2020
65804f6
check-imagefile.sh: retry SSH
pohly May 8, 2020
5fe60ce
pmem state: document that only .json files matter
pohly Feb 26, 2020
477a33b
support Kata Containers
pohly Jan 13, 2020
d9f6adc
test: dump Kubernetes objects after setup failure
pohly May 10, 2020
9e7ff93
test: increase timeout for Kata Containers
pohly May 10, 2020
2001229
test: create target directories
pohly May 11, 2020
9bc0b3b
test: fix kata-deploy on Clear
pohly May 11, 2020
4dcfcf5
test: disable Kata Containers by default
pohly May 11, 2020
1 change: 1 addition & 0 deletions .dockerignore
Expand Up @@ -7,6 +7,7 @@
! pkg/**
! test/test-config.d/**
! test/test-config.sh
! third-party/**
! hack/**
! vendor/**
! go.mod
3 changes: 2 additions & 1 deletion Jenkinsfile
Expand Up @@ -161,7 +161,8 @@ pipeline {
// Install additional tools:
// - ssh client for govm
// - python3 for Sphinx (i.e. make html)
sh "docker exec ${env.BUILD_CONTAINER} swupd bundle-add openssh-client python3-basic"
// - parted, xfsprogs, os-cloudguest-aws (contains mkfs.ext4) for ImageFile test
sh "docker exec ${env.BUILD_CONTAINER} swupd bundle-add openssh-client python3-basic parted xfsprogs os-cloudguest-aws"

// Now commit those changes to ensure that the result of "swupd bundle add" gets cached.
sh "docker commit ${env.BUILD_CONTAINER} ${env.BUILD_IMAGE}"
4 changes: 3 additions & 1 deletion Makefile
Expand Up @@ -176,7 +176,7 @@ KUSTOMIZE_OUTPUT += deploy/common/pmem-storageclass-cache.yaml
KUSTOMIZATION_deploy/common/pmem-storageclass-cache.yaml = deploy/kustomize/storageclass-cache
KUSTOMIZE_OUTPUT += deploy/common/pmem-storageclass-late-binding.yaml
KUSTOMIZATION_deploy/common/pmem-storageclass-late-binding.yaml = deploy/kustomize/storageclass-late-binding
kustomize: $(KUSTOMIZE_OUTPUT)
kustomize: clean_kustomize_output $(KUSTOMIZE_OUTPUT)
$(KUSTOMIZE_OUTPUT): _work/kustomize $(KUSTOMIZE_INPUT)
$< build --load_restrictor none $(KUSTOMIZATION_$@) >$@
if echo "$@" | grep -q '/pmem-csi-'; then \
Expand All @@ -185,6 +185,8 @@ $(KUSTOMIZE_OUTPUT): _work/kustomize $(KUSTOMIZE_INPUT)
cp $@ $$dir/pmem-csi.yaml && \
echo 'resources: [ pmem-csi.yaml ]' > $$dir/kustomization.yaml; \
fi
clean_kustomize_output:
rm -f $(KUSTOMIZE_OUTPUT)

# Always re-generate the output files because "git rebase" might have
# left us with an inconsistent state.
26 changes: 26 additions & 0 deletions deploy/common/pmem-kata-app-ephemeral.yaml
@@ -0,0 +1,26 @@
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app-inline-volume
  labels:
    io.katacontainers.config.hypervisor.memory_offset: "2147483648" # 2Gi, must be at least as large as the PMEM volume
spec:
  # see https://github.com/kata-containers/packaging/tree/1.11.0-rc0/kata-deploy#run-a-sample-workload
  runtimeClassName: kata-qemu
  nodeSelector:
    katacontainers.io/kata-runtime: "true"
  containers:
  - name: my-frontend
    image: intel/pmem-csi-driver-test:canary
    command: [ "sleep", "100000" ]
    volumeMounts:
    - mountPath: "/data"
      name: my-csi-volume
  volumes:
  - name: my-csi-volume
    csi:
      driver: pmem-csi.intel.com
      fsType: "xfs"
      volumeAttributes:
        size: "2Gi"
        kataContainers: "true"
22 changes: 22 additions & 0 deletions deploy/common/pmem-kata-app.yaml
@@ -0,0 +1,22 @@
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-kata-app
  labels:
    io.katacontainers.config.hypervisor.memory_offset: "2147483648" # 2Gi, must be at least as large as the PMEM volume
spec:
  # see https://github.com/kata-containers/packaging/tree/1.11.0-rc0/kata-deploy#run-a-sample-workload
  runtimeClassName: kata-qemu
  nodeSelector:
    katacontainers.io/kata-runtime: "true"
  containers:
  - name: my-frontend
    image: intel/pmem-csi-driver-test:canary
    command: [ "sleep", "100000" ]
    volumeMounts:
    - mountPath: "/data"
      name: my-csi-volume
  volumes:
  - name: my-csi-volume
    persistentVolumeClaim:
      claimName: pmem-csi-pvc-kata # see pmem-kata-pvc.yaml
11 changes: 11 additions & 0 deletions deploy/common/pmem-kata-pvc.yaml
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pmem-csi-pvc-kata
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: pmem-csi-sc-ext4-kata # defined in pmem-storageclass-ext4-kata.yaml
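
The comment on the `memory_offset` label in the pod manifests above says the offset must be at least as large as the PMEM volume. The following is a minimal sketch of that arithmetic, not code from this pull request. The helper names `quantity_to_bytes` and `memory_offset_covers` are hypothetical, and only plain integers and binary suffixes are handled, which is a simplification of the full Kubernetes quantity syntax.

```python
# Binary-suffix multipliers used by Kubernetes resource quantities.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as "2Gi" or "2147483648" to bytes.

    Assumption: only plain integers and the binary suffixes above occur.
    """
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def memory_offset_covers(memory_offset: str, volume_size: str) -> bool:
    """Hypothetical check: is the label value at least the volume size?"""
    return int(memory_offset) >= quantity_to_bytes(volume_size)


if __name__ == "__main__":
    print(quantity_to_bytes("2Gi"))                    # 2147483648
    print(memory_offset_covers("2147483648", "2Gi"))   # True
    print(memory_offset_covers("2147483648", "4Gi"))   # False
```

Read against that rule, the 2Gi inline volume in `pmem-kata-app-ephemeral.yaml` is covered by its label, while the 4Gi claim in `pmem-kata-pvc.yaml` would call for an offset of at least 4294967296. Whether the smaller value in `pmem-kata-app.yaml` causes problems in practice is not something this diff settles.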
13 changes: 13 additions & 0 deletions deploy/common/pmem-storageclass-ext4-kata.yaml
@@ -0,0 +1,13 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pmem-csi-sc-ext4-kata
parameters:
  csi.storage.k8s.io/fstype: ext4
  eraseafter: "true"
  kataContainers: "true"
provisioner: pmem-csi.intel.com
reclaimPolicy: Delete
# Kata Containers might not be available on all nodes, wait for pod scheduling
# and then create volume on the chosen node(s).
volumeBindingMode: WaitForFirstConsumer
13 changes: 13 additions & 0 deletions deploy/common/pmem-storageclass-xfs-kata.yaml
@@ -0,0 +1,13 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pmem-csi-sc-xfs-kata
parameters:
  csi.storage.k8s.io/fstype: xfs
  eraseafter: "true"
  kataContainers: "true"
provisioner: pmem-csi.intel.com
reclaimPolicy: Delete
# Kata Containers might not be available on all nodes, wait for pod scheduling
# and then create volume on the chosen node(s).
volumeBindingMode: WaitForFirstConsumer
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/direct/pmem-csi.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/direct/testing/pmem-csi.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/lvm/pmem-csi.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- args:
- -v=3
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/lvm/testing/pmem-csi.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /var/lib/pmem-csi-coverage
name: coverage-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-csi-direct-testing.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-csi-direct.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-csi-lvm-testing.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /var/lib/pmem-csi-coverage
name: coverage-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-csi-lvm.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- args:
- -v=3
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-storageclass-ext4-kata.yaml
1 change: 1 addition & 0 deletions deploy/kubernetes-1.15/pmem-storageclass-xfs-kata.yaml
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/direct/pmem-csi.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/direct/testing/pmem-csi.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/lvm/pmem-csi.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- args:
- -v=3
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/lvm/testing/pmem-csi.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /var/lib/pmem-csi-coverage
name: coverage-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-csi-direct-testing.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-csi-direct.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /sys
name: sys-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-csi-lvm-testing.yaml
Expand Up @@ -351,6 +351,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- mountPath: /var/lib/pmem-csi-coverage
name: coverage-dir
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-csi-lvm.yaml
Expand Up @@ -314,6 +314,7 @@ spec:
- mountPath: /dev
name: dev-dir
- mountPath: /var/lib/pmem-csi.intel.com
mountPropagation: Bidirectional
name: pmem-state-dir
- args:
- -v=3
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-storageclass-ext4-kata.yaml
1 change: 1 addition & 0 deletions deploy/kubernetes-1.16/pmem-storageclass-xfs-kata.yaml
5 changes: 5 additions & 0 deletions deploy/kustomize/driver/pmem-csi.yaml
Expand Up @@ -162,6 +162,11 @@ spec:
mountPath: /dev
- name: pmem-state-dir
mountPath: /var/lib/pmem-csi.intel.com
# Needed for Kata Containers: we mount the PMEM volume inside our
# state dir and want that to be visible also on the host, because
# the host will need access to the image file that we create inside
# that mounted fs.
mountPropagation: Bidirectional
- name: driver-registrar
imagePullPolicy: Always
image: quay.io/k8scsi/csi-node-driver-registrar:v1.X.Y
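
The comment in the hunk above explains why the state directory needs `mountPropagation: Bidirectional`: mounts created inside the container must become visible on the host. In Kubernetes, `Bidirectional` corresponds to `rshared` propagation, which shows up in `/proc/self/mountinfo` as a `shared:N` tag among the optional fields. The following is a minimal sketch for checking one such line; the function names and the sample lines are illustrative assumptions, not part of this pull request.

```python
from typing import List


def optional_fields(mountinfo_line: str) -> List[str]:
    """Return the optional fields of one /proc/<pid>/mountinfo line.

    Per proc(5), the optional fields sit between the sixth field
    (mount options) and the "-" separator.
    """
    fields = mountinfo_line.split()
    separator = fields.index("-")
    return fields[6:separator]


def is_shared(mountinfo_line: str) -> bool:
    """True if the mount is in a shared peer group (tagged "shared:N")."""
    return any(f.startswith("shared:") for f in optional_fields(mountinfo_line))


if __name__ == "__main__":
    # Made-up sample lines in mountinfo format, for illustration only.
    shared = "42 25 259:1 / /var/lib/pmem-csi.intel.com rw,relatime shared:17 - ext4 /dev/pmem0 rw"
    private = "43 25 259:2 / /mnt/other rw,relatime - ext4 /dev/pmem1 rw"
    print(is_shared(shared))   # True
    print(is_shared(private))  # False
```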
44 changes: 43 additions & 1 deletion docs/design.md
Expand Up @@ -4,6 +4,7 @@
- [Architecture and Operation](#architecture-and-operation)
- [LVM device mode](#lvm-device-mode)
- [Direct device mode](#direct-device-mode)
- [Kata Containers support](#kata-containers-support)
- [Driver modes](#driver-modes)
- [Driver Components](#driver-components)
- [Communication between components](#communication-between-components)
Expand Down Expand Up @@ -126,6 +127,47 @@ In direct device mode, the driver does not attempt to limit space
use. It also does not mark "own" namespaces. The _Name_ field of a
namespace gets value of the VolumeID.

## Kata Containers support

[Kata Containers](https://katacontainers.io) runs applications inside a
virtual machine. This poses a problem for App Direct mode, because
access to the filesystem prepared by PMEM-CSI is provided inside the
virtual machine by the 9p or virtio-fs filesystems. Neither supports
App Direct mode:
- 9p does not support `mmap` at all.
- virtio-fs only supports it when not using `MAP_SYNC`, i.e. without dax
  semantics.

This gets solved as follows:
- PMEM-CSI creates a volume as usual, either in direct mode or LVM mode.
- Inside that volume it sets up an ext4 or xfs filesystem.
- Inside that filesystem it creates a `pmem-csi-vm.img` file that contains
  partition tables, dax metadata and a partition that takes up most of the
  space available in the volume.
- That partition is bound to a `/dev/loop` device and then formatted
  with the filesystem type requested for the volume.
- When an application needs access to the volume, PMEM-CSI mounts
  that `/dev/loop` device.
- An application not running under Kata Containers then uses
  that filesystem normally, *but* due to limitations in the Linux
  kernel, mounting might have to be done without `-odax` and thus
  App Direct access does not work.
- When the Kata Containers runtime is asked to provide access to that
  filesystem, it will instead pass the underlying `pmem-csi-vm.img`
  file into QEMU as an [nvdimm
  device](https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt)
  and inside the VM mount the `/dev/pmem0p1` partition that the
  Linux kernel sets up based on the dax metadata that was placed in the
  file by PMEM-CSI. Inside the VM, App Direct semantics are fully
  supported.

Such volumes can be used with full dax semantics *only* inside Kata
Containers. They are still usable with other runtimes, just not
with dax semantics. Because of that and the additional space overhead,
Kata Containers support has to be enabled explicitly via a [storage
class parameter, and Kata Containers must be set up
appropriately](install.md#kata-containers-support).
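
The loop-device steps described in this section can be pictured with standard Linux tools. The following is a sketch only: it builds the command lines without running them, it does not show how the partition table and dax metadata get written into `pmem-csi-vm.img`, and it is not meant to describe how PMEM-CSI itself implements these steps. The function name, the offset and size parameters, and the example paths are hypothetical.

```python
from typing import List


def loop_mount_plan(
    image: str,
    partition_offset: int,
    partition_size: int,
    fs_type: str,
    loop_device: str,
    target: str,
) -> List[List[str]]:
    """Return the commands (not executed) for binding, formatting and mounting.

    partition_offset and partition_size are byte values describing where the
    partition lives inside the image file; both are assumed inputs here.
    """
    if fs_type not in ("ext4", "xfs"):
        raise ValueError(f"unsupported filesystem type: {fs_type}")
    return [
        # Bind only the partition's byte range inside the image to the loop device.
        ["losetup", "--offset", str(partition_offset),
         "--sizelimit", str(partition_size), loop_device, image],
        # Format the loop device with the filesystem requested for the volume.
        [f"mkfs.{fs_type}", loop_device],
        # Mount it where the application expects the volume.
        ["mount", loop_device, target],
    ]


if __name__ == "__main__":
    plan = loop_mount_plan(
        image="/example/state-dir/volume/pmem-csi-vm.img",  # hypothetical path
        partition_offset=4 * 1024 * 1024,                   # hypothetical value
        partition_size=1024 * 1024 * 1024,                  # hypothetical value
        fs_type="ext4",
        loop_device="/dev/loop0",
        target="/example/target",
    )
    for command in plan:
        print(" ".join(command))
```

Under Kata Containers the last step differs: per the description above, the runtime passes the image file to QEMU instead, and the guest kernel exposes the partition as `/dev/pmem0p1`.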

## Driver modes

The PMEM-CSI driver supports running in different modes, which can be
Expand Down Expand Up @@ -388,4 +430,4 @@ that don't use PMEM-CSI at all.

Users must take care to create PVCs first, then the pods if they want
to use the webhook. In practice, that is often already done because it
is more natural, so it is not a big limitation.
is more natural, so it is not a big limitation.