thanks @bpradipt - I have some questions
da11c77 to eb36f82
9611a94 to 0d2b88f
Codecov Report
@@            Coverage Diff             @@
##           master     #798      +/-   ##
==========================================
+ Coverage   60.20%   60.94%   +0.73%
==========================================
  Files          17       17
  Lines        2656     2993     +337
==========================================
+ Hits         1599     1824     +225
- Misses        896     1003     +107
- Partials      161      166       +5
/test
lgtm!
However, I'd not merge till we also have @devimc's input (as Julio was part of the first loop(s)).
// writeSpecToFile writes the container's OCI spec to "/run/libcontainer/<container-id>/config.json"
// Note that the OCI bundle (rootfs) is at a different path
func writeSpecToFile(spec *specs.Spec, containerId string) error {
configJsonDir := filepath.Join(ociConfigBasePath, containerId)
Once the container has terminated, will the file be removed?
Hmm, I'm not sure. However, given that there could be poststop hooks, which run after the container has stopped, the config.json might not be removed by the runtime. @bergwolf would you have any idea?
I think we can do the cleanup in the RemoveContainer function: https://github.com/kata-containers/agent/blob/master/grpc.go#L1252
I see that the /run/ folder is a tmpfs mount. The config.json should get removed automatically along with the other files that are there.
Yes, but that happens only when the pod is stopped. If the pod is still live and users create and stop many containers, many unused spec files would be left there.
I understand the scenario now. Thanks @lifupan. So in this case, should we also handle cleanup of the state.json that is kept under /run/libcontainer/<id>? And if yes, does it make sense to handle the cleanup in a separate patch, or to combine it with this one?
Also, the impact of removing config.json on the container poststop hook needs to be checked. Let me know your thoughts @lifupan
Could you check whether libcontainer removes /run/libcontainer/<id> entirely upon container removal? If so, there is nothing to be done in the agent code.
@bpradipt Let's make sure we clean up correctly in this patch.
Hi @bpradipt @bergwolf The dir /run/libcontainer/ is created by this patch in the agent, so I don't think libcontainer would clean it up. BTW, libcontainer runs the poststop hooks in the container's Destroy function at https://github.com/kata-containers/agent/blob/master/grpc.go#L1264 or https://github.com/kata-containers/agent/blob/master/grpc.go#L1279, so you can just do the cleanup after that destroy.
thanks @bpradipt
/test-ubuntu
/test-wip-check
I have made the changes to remove the directory after the container stops. However, I noticed one oddity which I had missed earlier. Semantically the hooks expect
@bergwolf @lifupan @fidencio @devimc let me know what you think; I hope I'm not missing something fundamental here.
grpc.go
Outdated
@@ -1273,6 +1273,9 @@ func (a *agentGRPC) RemoveContainer(ctx context.Context, req *pb.RemoveContainer
}
}
}
configJsonDir := filepath.Join("/run/libcontainer/", req.ContainerId)
Hi @bpradipt
Here you only do the cleanup when timeout == 0. How about doing the cleanup at the bottom of this function, or putting it in a deferred function?
Hi @lifupan yeah sure. I'll make the changes and test it out. Can you tell me what the timeout is for? I didn't hit the else path during my local testing.
@bpradipt Thanks for tackling this. This functionality has been broken for a while, since we moved to a read-only rootfs. We do not have CI today to catch this. (FYI: the discussion that we had on the original PR: #346 (comment))
@bpradipt I would also like to see an integration test added for this. It could be just a dummy OCI-compliant hook binary that reads the config.json to get the path to the rootfs and performs a simple mount in the rootfs.
@amshinde There would be some changes to the spec in the guest, such as the cpuset reset. If the hooks don't care about these changes, I think writing from the host side is reasonable.
@bergwolf this is for OCI hooks running inside the Kata VM (https://github.com/kata-containers/runtime/blob/master/cli/config/configuration-qemu.toml.in#L288).
As @bpradipt said, this PR is for running the OCI hooks inside the guest. We run hooks passed in the OCI spec on the host, but we added this functionality to support use cases that require certain hooks to be run inside the guest, e.g. the Nvidia GPU plugin. The hooks need to be part of the guest rootfs. We added this functionality as part of this PR: #393. But since we made the guest rootfs read-only, the above functionality has been broken.
I have a whole batch of thoughts on this.
Having guest-side hooks is certainly useful in some cases, but I think we should treat (and name) them as a Kata-specific thing, not pretend they have any real connection to OCI hooks. (The fact that they're implemented via OCI hooks executed on the "inner" container is an implementation detail that shouldn't be exposed in naming and configuration.)
Of course, all those thoughts are for the medium to long term. The short term is another question, on which I also don't have any clear idea.
The agent code scans the guest hook directory for hooks and then adds them to the config.json for execution by libcontainer. So having the runtime write the config.json, while possible, will not serve the purpose, since the runtime does not scan the guest hook directory to detect the presence of hooks. Can we take a short-term approach: write the config.json to a read-write path (as in this PR) and document it so that guest hooks can make the required changes? At least we'll have the guest-side hooks working. @bergwolf @amshinde @fidencio @devimc @jodh-intel @lifupan what are your thoughts?
I tend to agree with @bpradipt. Applying his fix makes things better in the short term, and we can address this more thoroughly in Kata 2. Putting the config.json in a different (read-write) location isn't exactly "standard" - but really very little about how the guest hooks operate is "standard", even though they have the trappings of OCI hooks. So, let's get it working, since it has real usefulness in certain cases.
@bpradipt @dgibson Yes, I agree with you. We can fix it like in this PR as a short term fix and look for a long term fix in kata 2.0. And IMO for a long term fix to work, we should
This not only solves the read-only rootfs problem, but also allows us to move away from the currently misused sandbox shared path, which should not be used as a container rootfs destination when shared fs is not used.
@bpradipt - I think the DCO check is failing because although you have the magic SOB line, you haven't specified your name, only an email address (which could be anything).
@bpradipt - and apologies if you'd missed that out due to looking at: https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#general-format That seems to have regressed as it's "lost" the contributor's name. I've raised kata-containers/community#163 so we can resolve that...
Thanks @jodh-intel. I will re-push.
ebedf03 to 9982e5f
@jodh-intel no luck :-( .. Tried both the formats
OCI hooks fails to run since the code was writing the config.json to the read-only path. This patch fixes it
Fixes: #2763
Signed-off-by: Pradipta Kr. Banerjee <pradipta.banerjee@gmail.com>
Finally, the author and SOB details should match |
/test-ubuntu
@bergwolf, I've filed an issue at kata-containers/kata-containers#485 to track this longer term work.
OCI hooks fails to run since the code was writing the config.json
to the read-only path. This patch fixes it
Fixes: #2763
Signed-off-by: Pradipta Kumar bpradipt@in.ibm.com