Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

oci: Fix running of OCI hooks #798

Merged
merged 1 commit into from
Jul 31, 2020
Merged

Conversation

bpradipt
Copy link
Contributor

@bpradipt bpradipt commented Jun 18, 2020

OCI hooks fails to run since the code was writing the config.json
to the read-only path. This patch fixes it

Fixes: #2763

Signed-off-by: Pradipta Kumar bpradipt@in.ibm.com

Copy link

@devimc devimc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @bpradipt - I have some questions

oci.go Outdated Show resolved Hide resolved
oci.go Outdated Show resolved Hide resolved
@bpradipt bpradipt force-pushed the hook-fix branch 2 times, most recently from da11c77 to eb36f82 Compare June 18, 2020 18:26
@bpradipt bpradipt changed the title oci: Fix running of OCI hooks [WIP]oci: Fix running of OCI hooks Jun 18, 2020
@bpradipt bpradipt force-pushed the hook-fix branch 2 times, most recently from 9611a94 to 0d2b88f Compare June 20, 2020 06:53
@codecov
Copy link

codecov bot commented Jun 20, 2020

Codecov Report

Merging #798 into master will increase coverage by 0.73%.
The diff coverage is 46.15%.

@@            Coverage Diff             @@
##           master     #798      +/-   ##
==========================================
+ Coverage   60.20%   60.94%   +0.73%     
==========================================
  Files          17       17              
  Lines        2656     2993     +337     
==========================================
+ Hits         1599     1824     +225     
- Misses        896     1003     +107     
- Partials      161      166       +5     

@bpradipt
Copy link
Contributor Author

@devimc @bergwolf can you please check this and advise

oci.go Outdated Show resolved Hide resolved
@bpradipt bpradipt changed the title [WIP]oci: Fix running of OCI hooks oci: Fix running of OCI hooks Jun 22, 2020
@bpradipt
Copy link
Contributor Author

/test

oci.go Show resolved Hide resolved
Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

However, I'd not merge till we also have @devimc's input (as Julio was part of the first loop(s)).

// writeSpecToFile writes the container's OCI spec to "/run/libcontainer/<container-id>/config.json"
// Note that the OCI bundle (rootfs) is at a different path
func writeSpecToFile(spec *specs.Spec, containerId string) error {
configJsonDir := filepath.Join(ociConfigBasePath, containerId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the container terminated, was the file be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm not sure. However given that there could be poststop hooks which will run after the container has stopped, the config json might not be removed by the runtime. @bergwolf would you have any idea ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do the cleanup in function: RemoveContainer https://github.com/kata-containers/agent/blob/master/grpc.go#L1252

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the /run/ folder is a tmpfs mount. The config.json should get removed automatically along with other files that are there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's only when the pod is stopped. But if the pod is still live and users create/stop many containers, then there would be much unused spec files left there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the scenario now. Thanks @lifupan. So in this case should we also handle cleanup of state.json that is kept under /run/libcontainer/<id>? And if yes, does it make sense to handle cleanup in a separate patch or club with this one ?
Also the impact of removing config.json on container poststop hook needs to be checked. Let me know your thoughts @lifupan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if libcontainer removes /run/libocntainer/<id> entirely upon container removal? If so, there is nothing to be done in the agent code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpradipt Lets make sure we clean up correctly in this patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bpradipt @bergwolf The dir /run/libocntainer/ was created by this patch in agent, I don't think libcontainer would do cleanup of this dir. BTW, the libcontainer would run he poststophooks in container's Destroy function at https://github.com/kata-containers/agent/blob/master/grpc.go#L1264 or https://github.com/kata-containers/agent/blob/master/grpc.go#L1279, so you can just do the cleanup behind that destroy.

Copy link

@devimc devimc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @bpradipt

@bpradipt
Copy link
Contributor Author

bpradipt commented Jul 4, 2020

/test-ubuntu

@bpradipt
Copy link
Contributor Author

bpradipt commented Jul 4, 2020

/test-wip-check

@bpradipt bpradipt added do-not-merge PR has problems or depends on another and removed do-not-merge PR has problems or depends on another labels Jul 4, 2020
@bpradipt
Copy link
Contributor Author

I have made the changes to remove the directory post container stop. However I noticed one oddity which I had missed earlier.

Semantically the hooks expect config.json to be in the bundlePath. However the bundlePath location is now read-only and we have to write the config.json to a different path. While the actual hooks code can definitely use the new path, but it means that the hooks that were working with standard runc containers will not work as-is with Kata containers.
I feel we'll need to think through this aspect since it affects end user experience.

@bergwolf @lifupan @fidencio @devimc let me know what you think and hope I'm not missing something fundamental here.

grpc.go Outdated
@@ -1273,6 +1273,9 @@ func (a *agentGRPC) RemoveContainer(ctx context.Context, req *pb.RemoveContainer
}
}
}
configJsonDir := filepath.Join("/run/libcontainer/", req.ContainerId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bpradipt
Here you just do the cleanup in case of the timeout == 0, how about
doing the cleanup at the bottom of this function or put it a delay function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lifupan yeah sure. I'll make the changes and test it out. Can you tell me what's the timeout is for? I didn't hit the else path during my local testing.

@amshinde
Copy link
Member

amshinde commented Jul 13, 2020

I have made the changes to remove the directory post container stop. However I noticed one oddity which I had missed earlier.

Semantically the hooks expect config.json to be in the bundlePath. However the bundlePath location is now read-only and we have to write the config.json to a different path. While the actual hooks code can definitely use the new path, but it means that the hooks that were working with standard runc containers will not work as-is with Kata containers.
I feel we'll need to think through this aspect since it affects end user experience.

@bpradipt Thanks for tackling this. This functionality was broken for a while since we moved to a read-only rootfs. We do not have CI today to catch this.
Your above concern looks valid to me. As I recall, OCI compliant hooks would expect bundlePath as the working directory during execution and the bundlePath is expected to contain both the container rootfs and config.json.
A OCI compliant hook may need to be rewritten, since the above expectation is not fulfilled with this PR.
What if we write the config.json file to the shared bundlePath from the host side, with the kata-runtime doing so.
@bergwolf @lifupan @devimc Do you see any issues with this approach?

(FYI : Discussion that we had on the original PR : #346 (comment) )

@amshinde
Copy link
Member

@bpradipt Would also like to see an integration test for this being added, this could just be a dummy oci compliant hooks binary, that reads the config.json to get the path to the rootfs and perform a simple mount in the rootfs.

@lifupan
Copy link
Member

lifupan commented Jul 14, 2020

I have made the changes to remove the directory post container stop. However I noticed one oddity which I had missed earlier.
Semantically the hooks expect config.json to be in the bundlePath. However the bundlePath location is now read-only and we have to write the config.json to a different path. While the actual hooks code can definitely use the new path, but it means that the hooks that were working with standard runc containers will not work as-is with Kata containers.
I feel we'll need to think through this aspect since it affects end user experience.

@bpradipt Thanks for tackling this. This functionality was broken for a while since we moved to a read-only rootfs. We do not have CI today to catch this.
Your above concern looks valid to me. As I recall, OCI compliant hooks would expect bundlePath as the working directory during execution and the bundlePath is expected to contain both the container rootfs and config.json.
A OCI compliant hook may need to be rewritten, since the above expectation is not fulfilled with this PR.
What if we write the config.json file to the shared bundlePath from the host side, with the kata-runtime doing so.
@bergwolf @lifupan @devimc Do you see any issues with this approach?

@amshinde There would be some changes to spec in the guest such as cpuset reset, if the hooks didn't care about this changes, I think writing from the host side is reasonable.

@bergwolf
Copy link
Member

@amshinde @bpradipt @lifupan the normal OCI container hooks are setup by containerd and kata runs them on the host side. Are you suggesting that kata should set it up and run in the guest instead?

@bpradipt
Copy link
Contributor Author

@amshinde @bpradipt @lifupan the normal OCI container hooks are setup by containerd and kata runs them on the host side. Are you suggesting that kata should set it up and run in the guest instead?

@bergwolf this is for OCI hooks running inside the Kata VM (https://github.com/kata-containers/runtime/blob/master/cli/config/configuration-qemu.toml.in#L288).

@amshinde
Copy link
Member

@amshinde @bpradipt @lifupan the normal OCI container hooks are setup by containerd and kata runs them on the host side. Are you suggesting that kata should set it up and run in the guest instead?

As @bpradipt This PR is for running the OCI hooks inside the guest. We run hooks passed in the OCI spec on the host, but we added this functionality to support use cases which require certain hooks to be run inside the guest, for eg in case of Nvidia GPU plugin. The hooks need to be part of the guest rootfs. We added this functionality as part of this PR : #393

But since we made the guest rootfs as read-only, the above functionality was broken.
If you look at this comment, #346 (comment), the config.json needs to be written in the bundle directory which was being done by the agent. This PR attempts to fix the read-only error caused by the agent attempting to write to the bundle directory in the guest by instead having the agent writing the config.json to a directory under /run. With this though the config.json no longer exists in the bundle directory which may break OCI hooks that depend on this.
My suggestion is to instead have the runtime write the config.json to the shared bundle directory before it is passed as read-only to the guest.

@dgibson
Copy link
Contributor

dgibson commented Jul 21, 2020

I have a whole batch of thoughts on this.

  1. I think always writing the config.json to the shared bundle directory would be really useful for debugging, regardless of its applicability to guest hooks

  2. It's not clear to me from the OCI spec whether hooks are allowed to alter the config.json. If so, that's obviously still a problem for guest hooks if it's in a read-only path.

  3. Treating guest hooks as kinda-OCI hooks seems bogus to me anyway. Unlike real OCI hooks they're controlled by the runtime's configuration, rather than the container's configuration. They also run in a different environment from that specified for OCI hooks. i.e. they're really not the same as OCI hooks.

Having guest side hooks is certainly useful for cases, but I think we should treat (and name) them as a kata specific thing, not pretend they have any real connection to OCI hooks. (The fact that they're implemented via OCI hooks executed on the "inner" container is an implementation detail that shouldn't be exposed in naming and configuration).

  1. Arguably the presence of guest hooks is a detail of the guest OS build, rather than something that should even appear in the runtime configuration.

  2. Regardless of how we manage the guest hooks, having some mechanism where we can alter the "inner" config.json from within the guest would be useful. We'll want something like this for the VFIO / SR-IOV stuff we're working on (to translate host side vfio devices into guest side vfio devices), and I can think of a number of other cases where we want to modify the inner config.json from the outer config.json due to things being renamed by the virtualization layer.

@dgibson
Copy link
Contributor

dgibson commented Jul 21, 2020

Of course all those thoughts are for the medium to long term. Short term is another question, which I also don't have any clear idea on.

@bpradipt
Copy link
Contributor Author

@amshinde @bpradipt @lifupan the normal OCI container hooks are setup by containerd and kata runs them on the host side. Are you suggesting that kata should set it up and run in the guest instead?

As @bpradipt This PR is for running the OCI hooks inside the guest. We run hooks passed in the OCI spec on the host, but we added this functionality to support use cases which require certain hooks to be run inside the guest, for eg in case of Nvidia GPU plugin. The hooks need to be part of the guest rootfs. We added this functionality as part of this PR : #393

But since we made the guest rootfs as read-only, the above functionality was broken.
If you look at this comment, #346 (comment), the config.json needs to be written in the bundle directory which was being done by the agent. This PR attempts to fix the read-only error caused by the agent attempting to write to the bundle directory in the guest by instead having the agent writing the config.json to a directory under /run. With this though the config.json no longer exists in the bundle directory which may break OCI hooks that depend on this.
My suggestion is to instead have the runtime write the config.json to the shared bundle directory before it is passed as read-only to the guest.

The agent code scans the guest hook directory to look for hooks and then add those to the config.json for execution by libcontainer. So writing the config.json by the runtime while possible will not solve the purpose since the runtime is not scanning the guest hook directory to detect the presence of hooks.

Can we take a short term approach to write the config.json to a read-write path (like in this PR) and document it for guest hooks to make required changes. At least we'll have the guest side hook working.
In the long term we should think about a proper solution in the context of Kata-2.0 considering the existing issues and also the comments from @dgibson ?

@bergwolf @amshinde @fidencio @devimc @jodh-intel @lifupan what are your thoughts ?

@dgibson
Copy link
Contributor

dgibson commented Jul 28, 2020

I tend to agree with @bpradipt. Applying his fix makes things better in the short term, and we can address this more thoroughly in Kata 2. Putting the config.json in a different (read-write) location isn't exactly "standard" - but really very little about how the guest hooks operate is "standard", even though they have the trappings of OCI hooks.

So, let's get it working, since it has real usefulness in certain cases.

@bergwolf
Copy link
Member

@bpradipt @dgibson Yes, I agree with you. We can fix it like in this PR as a short term fix and look for a long term fix in kata 2.0.

And IMO for a long term fix to work, we should

  1. create container bundle path in a tmpfs location
  2. bind mount all rootfs and volumes to directories inside the bundle
  3. write config.json in the bundle

This not only solves the readonly rootfs problem, but also allows us to move away from the currently misused sandbox shared path, which should not be used as container rootfs destinations when shared fs is not used.

@bpradipt bpradipt closed this Jul 29, 2020
@bpradipt bpradipt reopened this Jul 29, 2020
@jodh-intel
Copy link
Contributor

@bpradipt - I think the DCO check is failing because although you have the magic SOB line, you haven't specified your name, only an email address (which could be anything).

@jodh-intel
Copy link
Contributor

@bpradipt - and apologies if you'd missed that out due to looking at:

https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#general-format

That seems to have regressed as it's "lost" the contributors name. I've raised kata-containers/community#163 so we can resolve that...

@bpradipt
Copy link
Contributor Author

@bpradipt - I think the DCO check is failing because although you have the magic SOB line, you haven't specified your name, only an email address (which could be anything).

Thanks @jodh-intel . I will re-push.

@bpradipt bpradipt force-pushed the hook-fix branch 3 times, most recently from ebedf03 to 9982e5f Compare July 30, 2020 17:37
@bpradipt
Copy link
Contributor Author

@jodh-intel no luck :-( .. Tried both the formats name <email> and name email

OCI hooks fails to run since the code was writing the config.json
to the read-only path. This patch fixes it

Fixes: #2763

Signed-off-by: Pradipta Kr. Banerjee <pradipta.banerjee@gmail.com>
@bpradipt
Copy link
Contributor Author

Finally, the author and SOB details should match

@bpradipt
Copy link
Contributor Author

/test-ubuntu

@dgibson
Copy link
Contributor

dgibson commented Aug 4, 2020

@bpradipt @dgibson Yes, I agree with you. We can fix it like in this PR as a short term fix and look for a long term fix in kata 2.0.

And IMO for a long term fix to work, we should

1. create container bundle path in a tmpfs location

2. bind mount all rootfs and volumes to directories inside the bundle

3. write `config.json` in the bundle

This not only solves the readonly rootfs problem, but also allows us to move away from the currently misused sandbox shared path, which should not be used as container rootfs destinations when shared fs is not used.

@bergwolf, I've filed an issue at kata-containers/kata-containers#485 to track this longer term work.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants