-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix runc's poststart behaviour doesn't match the runtime-spec #4348
base: main
Are you sure you want to change the base?
Conversation
6ffc50a
to
307c2b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code should be moved to container.Exec
(keep it inside libcontainer
).
Please write a more detailed commit message. "fix poststart error" is not a good one. |
307c2b6
to
a78a083
Compare
done.thank you. please review my pr. @kolyshkin @cyphar |
c6c42bc
to
b766494
Compare
b766494
to
db988e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a test case to avoid the similar error in future.
You can see:
#4323 (comment)
return err | ||
} | ||
if c.config.Hooks != nil { | ||
if err := c.config.Hooks.Run(configs.Poststart, s); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the runtime-spec said:
The poststart hooks MUST be invoked by the runtime. If any poststart hook fails, the runtime MUST log a warning, but the remaining hooks and lifecycle continue as if the hook had succeeded.
So maybe this is another bug, we should write a new function to run poststart and poststop hooks.
@kolyshkin WDYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If yes, please help to open a new issue to track this bug, thanks.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the current code logs a warning (like the spec says) but also terminates container's init and returns an error (unlike the spec says).
I'm not sure whether we should change the spec or runc behavior. In theory spec should be left as is, and runc should be changed to follow the spec. In practice, though, changing runc might result in backward incompatibilities and complexities in upper runtime. So yes, we have a choice to make here.
It's complicated 😿
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I researched the history a little bit.
- The initial implementation of poststart hooks was done by @mrunalp in 2015 (commit 452e8a7, PR Add poststart hooks #392, merged Nov 2015). If a hook was returning an error, the start was terminated and the error was returned.
- The "start must continue even if poststart hook failed" clause was added to runtime-spec by @RenaudWasTaken in Add create-runtime, create-container, start-container hooks runtime-spec#1008 (merged Dec 30 2019).
- More hooks were added to runc by @RenaudWasTaken (commit ccdd757, PR Add CreateRuntime, CreateContainer and StartContainer Hooks #2229, merged Jun 2020). The same commit touched the poststart code but didn't change it to not return an error if a hook failed.
In my mind, this is the spec that needs to be changed. The reasons are:
- initial implementation predates the spec wording by a few years;
- the wording in the spec was never implemented in runc;
- returning an error (and stopping the container) seems like a more versatile approach (a hook can choose whether to return an error or not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to stop container by using SIGTERM ,SIGKILL or both use?
db988e0
to
0f17a69
Compare
0f17a69
to
3d9a118
Compare
already backport #4351 |
The commit description still doesn't say what it does and why. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR removes the part where the container is killed when a hook fails. This should have been caught by an integration test, alas, it doesn't check that.
Opened #4352
d2fac01
to
adcf828
Compare
5d066a8
to
df2c44a
Compare
df2c44a
to
5a2aeea
Compare
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
5a2aeea
to
9bfaca5
Compare
Moved the milestone to 1.3.0. Dealing with this is going to take a while, and we should just get 1.2.0 out (this is a pre-existing issue anyway). |
fix:#4347