docker start fails with Kata and ACRN hypervisor #2026
Comments
/cc @jodh-intel @amshinde @devimc @jcvenegas @WeiZhang555 @bergwolf any thoughts on the issue?
/cc @mcastelino.
@vijaydhanraj If the state is being overwritten, then this is a bug.
Any update on this issue? This is one of the gating issues.
With the latest changes, I am not even able to run a simple Docker container. The change below fixes the issue.
@vijaydhanraj Can you add one more line to the configuration file (it might be "/opt/kata/share/defaults/kata-containers/configuration.toml" if you installed from the release tarball) and try again? This can make the [...]
I think your modification could impact other hypervisor implementations, but I am not sure. Please file a PR and let's check the Jenkins test results.
Hi @WeiZhang555, I tried adding 'newstore' to configuration.toml, but my latest PR changes (#2075) fail with the newstore option. I am trying to create a global store that can be accessed by any kata-runtime process. With the newstore option, kata-runtime doesn't seem to use the already created store (in my case, the UUID store). Is this the expected behavior?
Since the SandboxState is not loaded, we see corruption of the sandbox state. The following log shows this:
Oct 06 22:41:32 ACRNSOS kata-runtime[1620]: time="2019-10-06T22:41:32.03749759-07:00" level=info msg="Sandbox State before LOAD= {State: BlockIndex:0 GuestMemoryBlockSizeMB:0 GuestMemoryHotplugProbe:false CgroupPath: PersistVersion:0}" arch=amd64 command=state container=dcebcea58d6666a538a321726cc724f3ca42f84962eb2ab688f6bdabe32e3286 name=kata-runtime pid=1620 sandbox=dcebcea58d6666a538a321726cc724f3ca42f84962eb2ab688f6bdabe32e3286 source=virtcontainers subsystem=sandbox
With the below code changes, things seem to work fine.
The hypervisor.createSandbox may need to access the state. For eg, ACRN today needs to access the block index to assign it to the root image of the VM. Hence load this early on. Fixes kata-containers#2026 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
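The ordering described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the `sandboxState`, `sandbox`, `hypervisor`, and `fakeACRN` types and the map-based `persisted` store are hypothetical stand-ins, and only the idea of restoring the state before calling the hypervisor's createSandbox comes from the message above.

```go
package main

import "fmt"

// sandboxState is a hypothetical, trimmed-down sandbox state.
type sandboxState struct {
	State      string
	BlockIndex int
}

// sandbox is a hypothetical stand-in for the virtcontainers Sandbox.
type sandbox struct {
	id    string
	state sandboxState
}

// hypervisor is a hypothetical minimal interface; the real one is larger.
type hypervisor interface {
	createSandbox(s *sandbox) error
}

// fakeACRN records the block index it reads for the VM root image.
type fakeACRN struct{ rootImageIndex int }

func (a *fakeACRN) createSandbox(s *sandbox) error {
	a.rootImageIndex = s.state.BlockIndex // needs the state to be loaded already
	s.state.BlockIndex++
	return nil
}

// newSandbox restores any persisted state before the hypervisor is set up,
// so that createSandbox sees the stored block index and state.
func newSandbox(id string, persisted map[string]sandboxState, h hypervisor) (*sandbox, error) {
	s := &sandbox{id: id}
	if st, ok := persisted[id]; ok {
		s.state = st // load early
	}
	if err := h.createSandbox(s); err != nil {
		return nil, err
	}
	return s, nil
}

func main() {
	persisted := map[string]sandboxState{"sb1": {State: "stopped", BlockIndex: 1}}
	a := &fakeACRN{}
	s, err := newSandbox("sb1", persisted, a)
	if err != nil {
		panic(err)
	}
	fmt.Printf("state=%q rootImageIndex=%d\n", s.state.State, a.rootImageIndex)
}
```

In this sketch the sandbox keeps its "stopped" state because the hypervisor only ever sees state that was restored first.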
Description of problem
docker start fails with Kata and ACRN hypervisor. Followed the steps below,
Debug Status:
Initial debugging showed that docker stop was failing with:
Sep 03 15:27:53 ACRNSOS kata-runtime[2304]: time="2019-09-03T15:27:53.844312685-07:00" level=error msg="deleteSandbox failed with err:StopSandbox failed with err:failed to Statfs "/var/run/netns/cni-49777597-ff42-8d5c-4fd7-801f95190c7e": no such file or directory" arch=amd64 command=delete container=a3348e8ed06e360bfec62282abccb9e441419e7ae22a508966f2829263c6b1d9 name=kata-runtime pid=2304 sandbox=a3348e8ed06e360bfec62282abccb9e441419e7ae22a508966f2829263c6b1d9 source=runtime.
Because the sandbox state was not saved properly, the delete OCI command tried to remove a file that had already been deleted (the kill OCI command deletes this file). The sandbox state was expected to be stopped, but in this case it was empty ("").
I traced the empty sandbox state to GetAndSetSandboxBlockIndex(), which overwrites the sandbox state from stopped to empty. ACRN calls this function to increase the block device index (rootfs for the container). Since this logic is specific to ACRN, the issue is not reproducible with other hypervisors such as QEMU or Firecracker.
Is there any reason why GetAndSetSandboxBlockIndex() should store the sandbox state? If this is needed, can a check be added to ensure that the state being stored is valid, instead of simply overwriting the existing state?
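To make the failure mode and the requested check concrete, here is a minimal sketch. It assumes a simplified model: `sandboxState`, `memStore`, `bumpBlockIndexUnsafe`, and `bumpBlockIndexChecked` are hypothetical names invented for illustration and are not the virtcontainers API. The first function stores whatever state is in memory, as described above; the second refuses to replace a non-empty persisted state with an empty one.

```go
package main

import (
	"errors"
	"fmt"
)

// sandboxState is a hypothetical, trimmed-down sandbox state.
type sandboxState struct {
	State      string // "" means the state was never set or loaded
	BlockIndex int
}

// memStore is a hypothetical store: one persisted state per sandbox ID.
type memStore map[string]sandboxState

// bumpBlockIndexUnsafe persists the in-memory state as-is, even when it was
// never loaded from the store, which wipes the persisted State field.
func bumpBlockIndexUnsafe(st memStore, id string, inMem *sandboxState) int {
	idx := inMem.BlockIndex
	inMem.BlockIndex++
	st[id] = *inMem
	return idx
}

var errEmptyState = errors.New("refusing to overwrite a persisted state with an empty one")

// bumpBlockIndexChecked adds a validity check before storing.
func bumpBlockIndexChecked(st memStore, id string, inMem *sandboxState) (int, error) {
	if prev, ok := st[id]; ok && prev.State != "" && inMem.State == "" {
		return -1, errEmptyState
	}
	idx := inMem.BlockIndex
	inMem.BlockIndex++
	st[id] = *inMem
	return idx, nil
}

func main() {
	st := memStore{"sb1": {State: "stopped", BlockIndex: 1}}

	var fresh sandboxState // a new process that has not loaded the state
	bumpBlockIndexUnsafe(st, "sb1", &fresh)
	fmt.Printf("unsafe: persisted state = %q\n", st["sb1"].State)

	st["sb1"] = sandboxState{State: "stopped", BlockIndex: 1}
	fresh = sandboxState{}
	_, err := bumpBlockIndexChecked(st, "sb1", &fresh)
	fmt.Printf("checked: err = %v, persisted state = %q\n", err, st["sb1"].State)
}
```

A check like this would only surface the problem as an error; the in-memory state would still have to be loaded before it is modified for the operation to succeed.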
Please find the stack trace below,
/usr/local/go/src/runtime/debug/stack.go:24 +0x9f
github.com/kata-containers/runtime/virtcontainers/store.(*filesystem).store(0xc0000b0640, 0xc0002bd301, 0x557019ce6da0, 0xc0000b1080, 0x0, 0x0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/store/filesystem_backend.go:234 +0x52f
github.com/kata-containers/runtime/virtcontainers/store.(*Store).Store(0xc000512f00, 0x557019ce6d01, 0x557019ce6da0, 0xc0000b1080, 0x0, 0x0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/store/manager.go:241 +0x2ed
github.com/kata-containers/runtime/virtcontainers/store.(*VCStore).Store(0xc0004edbb0, 0xc0000f2d01, 0x557019ce6da0, 0xc0000b1080, 0x2, 0x3)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/store/vc.go:90 +0x52
github.com/kata-containers/runtime/virtcontainers.(*Sandbox).getAndSetSandboxBlockIndex(0xc0000f2c60, 0xc00053f100, 0x40, 0xc0000f2c60)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/sandbox.go:1626 +0xa7
github.com/kata-containers/runtime/virtcontainers.(*Sandbox).GetAndSetSandboxBlockIndex(...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/sandbox.go:1779
github.com/kata-containers/runtime/virtcontainers.(*acrn).appendImage(0xc000199340, 0xc0000b1040, 0x3, 0x4, 0xc0000d8c80, 0x4b, 0xc0000b1040, 0x3, 0x4, 0x3, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/acrn.go:173 +0x177
github.com/kata-containers/runtime/virtcontainers.(*acrn).buildDevices(0xc000199340, 0xc0000d8c80, 0x4b, 0xc0000d8c80, 0x4b, 0x0, 0x0, 0xc0002bd788)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/acrn.go:204 +0x290
github.com/kata-containers/runtime/virtcontainers.(*acrn).createSandbox(0xc000199340, 0x557019d90740, 0xc00024ab70, 0xc00053f100, 0x40, 0xc00053f140, 0x37, 0x0, 0x0, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/acrn.go:325 +0x299
github.com/kata-containers/runtime/virtcontainers.newSandbox(0x557019d90740, 0xc00024ab70, 0xc00053f100, 0x40, 0xc00051ff80, 0xc, 0xc00051ff78, 0x4, 0x100000001, 0x100000800, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/sandbox.go:583 +0xb84
github.com/kata-containers/runtime/virtcontainers.createSandbox(0x557019d90740, 0xc00024a900, 0xc00053f100, 0x40, 0xc00051ff80, 0xc, 0xc00051ff78, 0x4, 0x100000001, 0x100000800, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/sandbox.go:462 +0x16f
github.com/kata-containers/runtime/virtcontainers.fetchSandbox(0x557019d90740, 0xc0005436b0, 0xc0002e611d, 0x40, 0xc0002e611d, 0x40, 0xc000038ff0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/sandbox.go:697 +0x2a4
github.com/kata-containers/runtime/virtcontainers.StatusContainer(0x557019d90740, 0xc0005436b0, 0xc0002e611d, 0x40, 0x7ffee76b6eab, 0x40, 0x0, 0x0, 0x0, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/api.go:561 +0x1f6
github.com/kata-containers/runtime/virtcontainers.(*VCImpl).StatusContainer(0x55701a56f130, 0x557019d90740, 0xc000543350, 0xc0002e611d, 0x40, 0x7ffee76b6eab, 0x40, 0x0, 0x0, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/virtcontainers/implementation.go:117 +0xb0
main.getContainerInfo(0x557019d90740, 0xc000543350, 0x7ffee76b6eab, 0x40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/oci.go:52 +0x201
main.getExistingContainerInfo(0x557019d90740, 0xc000543350, 0x7ffee76b6eab, 0x40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/oci.go:61 +0x96
main.state(0x557019d90740, 0xc000543350, 0x7ffee76b6eab, 0x40, 0x0, 0x0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/state.go:53 +0x29a
main.glob..func16(0xc0000f29a0, 0x0, 0xc0000f29a0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/state.go:39 +0x234
github.com/kata-containers/runtime/vendor/github.com/urfave/cli.HandleAction(0x557019bef560, 0x557019d58ae8, 0xc0000f29a0, 0xc0000ae600, 0x0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/vendor/github.com/urfave/cli/app.go:490 +0xca
github.com/kata-containers/runtime/vendor/github.com/urfave/cli.Command.Run(0x55701960805c, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x557019624914, 0x1f, 0x0, ...)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/vendor/github.com/urfave/cli/command.go:210 +0x998
github.com/kata-containers/runtime/vendor/github.com/urfave/cli.(*App).Run(0xc00030e000, 0xc0000b2000, 0x9, 0x9, 0x0, 0x0)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/vendor/github.com/urfave/cli/app.go:255 +0x6b1
main.createRuntimeApp(0x557019d906c0, 0xc0000a8010, 0xc0000b2000, 0x9, 0x9, 0x0, 0xc0002bff58)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/main.go:478 +0x22d
main.createRuntime(0x557019d906c0, 0xc0000a8010)
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/main.go:521 +0x7c
main.main()
	/home/vdhanraj/go/src/github.com/kata-containers/runtime/cli/main.go:553 +0x9e
Expected result
root@ACRNSOS~ # sudo docker start CL_bash
CL_bash
Actual result
root@ACRNSOS~ # sudo docker start CL_bash
Error response from daemon: id already in use
Error: failed to start containers: CL_bash
kata-collect-data.sh details
Meta details
Running kata-collect-data.sh version 1.8.0-alpha0 (commit 21c1e374c8ade0531eb862c0e0462f39d6331aca-dirty) at 2019-09-03.15:28:17.540225264-0700.
Runtime is /usr/local/bin/kata-runtime.
kata-env
Output of "/usr/local/bin/kata-runtime kata-env":
Runtime config files
Runtime default config files
Runtime config file contents
Output of "cat "/etc/kata-containers/configuration.toml"":
Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":
KSM throttler
version
Output of " --version":
systemd service
Image details
Initrd details
No initrd
Logfiles
Runtime logs
Recent runtime problems found in system journal:
Proxy logs
Recent proxy problems found in system journal:
Shim logs
Recent shim problems found in system journal:
Throttler logs
No recent throttler problems found in system journal.
Container manager details
Have docker
Docker
Output of "docker version":
Output of "docker info":
Output of "systemctl show docker":
No kubectl
Have crio
crio
Output of "crio --version":
Output of "systemctl show crio":
Output of "cat /etc/crio/crio.conf":
Have containerd
containerd
Output of "containerd --version":
Output of "systemctl show containerd":
Output of "cat /etc/containerd/config.toml":
Packages
No dpkg
Have rpm
Output of "rpm -qa|egrep "(cc-oci-runtimecc-runtimerunv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"":