Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/1.7 backport] runtime/v2/runc: handle early exits w/o big locks #8712

Conversation

corhere
Copy link
Contributor

@corhere corhere commented Jun 19, 2023

⚠️ ran into a conflict in service.handleProcessExit / service.checkProcesses, due to commit 6f34da5 not being in the 1.7 branch.


eventSendMu is causing severe lock contention when multiple processes start and exit concurrently. Replace it with a different scheme for maintaining causality w.r.t. start and exit events for a process which does not rely on big locks for synchronization.

Keep track of all processes for which a Task(Exec)Start event has been published and have not yet exited in a map, keyed by their PID. Processing exits then is as simple as looking up which process corresponds to the PID. If there are no started processes known with that PID, the PID must either belong to a process which was started by s.Start() and before the s.Start() call has added the process to the map of running processes, or a reparented process which we don't care about. Handle the former case by having each s.Start() call subscribe to exit events before starting the process. It checks if the PID has exited in the time between it starting the process and publishing the TaskStart event, handling the exit if it has. Exit events for reparented processes received when no s.Start() calls are in flight are immediately discarded, and events received during an s.Start() call are discarded when the s.Start() call returns.

(cherry picked from commit 5cd6210)

eventSendMu is causing severe lock contention when multiple processes
start and exit concurrently. Replace it with a different scheme for
maintaining causality w.r.t. start and exit events for a process which
does not rely on big locks for synchronization.

Keep track of all processes for which a Task(Exec)Start event has been
published and have not yet exited in a map, keyed by their PID.
Processing exits then is as simple as looking up which process
corresponds to the PID. If there are no started processes known with
that PID, the PID must either belong to a process which was started by
s.Start() and before the s.Start() call has added the process to the map
of running processes, or a reparented process which we don't care about.
Handle the former case by having each s.Start() call subscribe to exit
events before starting the process. It checks if the PID has exited in
the time between it starting the process and publishing the TaskStart
event, handling the exit if it has. Exit events for reparented processes
received when no s.Start() calls are in flight are immediately
discarded, and events received during an s.Start() call are discarded
when the s.Start() call returns.

Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
(cherry picked from commit 5cd6210)
Signed-off-by: Cory Snider <csnider@mirantis.com>
@k8s-ci-robot
Copy link

Hi @corhere. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@dcantah dcantah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM

@corhere
Copy link
Contributor Author

corhere commented Jun 29, 2023

[FAIL] [k8s.io] Container runtime should support basic operations on container [It] runtime should support listing stats for three created containers when filter is nil. [Conformance]
github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:671

The same test is also failing on the current head of the 1.7 branch, so is most likely unrelated to the changes in this PR.

Copy link
Member

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ❤️

@dcantah
Copy link
Member

dcantah commented Jun 29, 2023

[FAIL] [k8s.io] Container runtime should support basic operations on container [It] runtime should support listing stats for three created containers when filter is nil. [Conformance]

github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:671

The same test is also failing on the current head of the 1.7 branch, so is most likely unrelated to the changes in this PR.

Yea, it's unrelated

@estesp estesp merged commit 1bd9abd into containerd:release/1.7 Jul 3, 2023
@corhere corhere deleted the backport-1.7/reduce-exec-lock-contention branch July 6, 2023 12:43
Mengkzhaoyun pushed a commit to open-beagle/containerd that referenced this pull request Aug 11, 2023
containerd 1.7.3

Welcome to the v1.7.3 release of containerd!

The third patch release for containerd 1.7 contains various fixes and updates.

* **RunC: Update runc binary to v1.1.8** ([#8843](containerd/containerd#8843))
* **CRI: Fix additionalGids: it should fallback to imageConfig.User when securityContext.RunAsUser,RunAsUsername are empty** ([#8824](containerd/containerd#8824))
* **CRI: write generated CNI config atomically** ([#8825](containerd/containerd#8825))
* **Port-Forward: Correctly handle known errors** ([#8806](containerd/containerd#8806))
* **Resolve docker.NewResolver race condition** ([#8799](containerd/containerd#8799))
* **Fix net.ipv4.ping_group_range with userns** ([#8786](containerd/containerd#8786))
* **Runtime/V2/RunC: handle early exits w/o big locks** ([#8712](containerd/containerd#8712))
* **SecComp: always allow name_to_handle_at** ([#8753](containerd/containerd#8753))
* **CRI: Windows Pod Stats: Add a check to skip stats for containers that are not running** ([#8654](containerd/containerd#8654))
* **Task: don't `close()` io before `cancel()`** ([#8658](containerd/containerd#8658))
* **Remove CNI conf_template deprecation** ([#8638](containerd/containerd#8638))
* **Fix issue for HPC pod metrics** ([#8634](containerd/containerd#8634))

See the changelog for complete list of changes

Please try out the release binaries and report any issues at
https://github.com/containerd/containerd/issues.

* Akihiro Suda
* Phil Estes
* Sebastiaan van Stijn
* Wei Fu
* Derek McGowan
* Kazuyoshi Kato
* Austin Vazquez
* Samuel Karp
* Shingo Omura
* Jin Dong
* Maksym Pavlenko
* Aditi Sharma
* Danny Canter
* James Sturtevant
* Laura Brehm
* Rodrigo Campos
* Akhil Mohan
* Andrey Epifanov
* Bjorn Neergaard
* Cory Snider
* Madhav Jivrajani
* Mahamed Ali
* Priyanka Saggu
* Qasim Sarfraz
* wangxiang
* zounengren
<details><summary>63 commits</summary>
<p>

* [release/1.7] Prepare release notes for v1.7.3 ([#8871](containerd/containerd#8871))
  * [`4cb2f1515`](containerd/containerd@4cb2f15) [release/1.7] Add release notes for v1.7.3
* [release/1.7] cri: memory.memsw.limit_in_bytes: no such file or directory ([#8869](containerd/containerd#8869))
  * [`b461ecacf`](containerd/containerd@b461eca) cri: memory.memsw.limit_in_bytes: no such file or directory
* [release/1.7] migrate to community owned bucket for node e2e tests ([#8875](containerd/containerd#8875))
  * [`14328ae03`](containerd/containerd@14328ae) migrate to community owned bucket
* [release/1.7 backport] update runc binary to v1.1.8 ([#8843](containerd/containerd#8843))
  * [`b985f7ef1`](containerd/containerd@b985f7e) update runc binary to v1.1.8
* [release/1.7 backport] [CRI] fix additionalGids: it should fallback to imageConfig.User when securityContext.RunAsUser,RunAsUsername are empty ([#8824](containerd/containerd#8824))
  * [`083f57160`](containerd/containerd@083f571) capture desc variable in range variable just in case that it run in parallel mode
  * [`a9440ce6b`](containerd/containerd@a9440ce) Use t.TempDir instead of os.MkdirTemp
  * [`eea3440d8`](containerd/containerd@eea3440) use strings.Cut instead of strings.Split for parsing imageConfig.User
  * [`eace67180`](containerd/containerd@eace671) fix userstr for dditionalGids on Linux
* [release/1.7 backport] cri: write generated CNI config atomically ([#8825](containerd/containerd#8825))
  * [`7353c0286`](containerd/containerd@7353c02) ctr: update WritePidFile to use atomicfile
  * [`ae7021300`](containerd/containerd@ae70213) shim: WritePidFile & WriteAddress use atomicfile
  * [`186eb64b7`](containerd/containerd@186eb64) cri: write generated CNI config atomically on Unix
  * [`64c3dcd8e`](containerd/containerd@64c3dcd) atomicfile: new package for atomic file writes
* [release/1.7 backport] Move logrus setup code to log package ([#8831](containerd/containerd#8831))
  * [`f7a20e17c`](containerd/containerd@f7a20e1) Move logrus setup code to log package
* [release/1.7 backport] Cirrus CI: configure apt-get to wait for locks ([#8814](containerd/containerd#8814))
  * [`60a6db9c2`](containerd/containerd@60a6db9) Cirrus CI: configure apt-get to wait for locks
* [release/1.7 backport] Update Go to 1.20.6,1.19.11 ([#8815](containerd/containerd#8815))
  * [`973778193`](containerd/containerd@9737781) Update Go to 1.20.6,1.19.11
* [release/1.7 backport] update go to go1.20.5, go1.19.10 ([#8716](containerd/containerd#8716))
  * [`403033e52`](containerd/containerd@403033e) update go to go1.20.5, go1.19.10
* [release/1.7 backport] bugfix(port-forward): Correctly handle known errors ([#8806](containerd/containerd#8806))
  * [`6b6b0c828`](containerd/containerd@6b6b0c8) bugfix(port-forward): Correctly handle known errors
* [release/1.7] Resolve docker.NewResolver race condition ([#8799](containerd/containerd#8799))
  * [`898eca21e`](containerd/containerd@898eca2) Change http.Header copy to builtin Clone
  * [`fa2efc406`](containerd/containerd@fa2efc4) Resolve docker.NewResolver race condition
* [release/1.7] Fix net.ipv4.ping_group_range with userns ([#8786](containerd/containerd#8786))
  * [`241514815`](containerd/containerd@2415148) pkg/cri/server: Test net.ipv4.ping_group_range works with userns
  * [`801e8c806`](containerd/containerd@801e8c8) pkg/cri/server: Fix net.ipv4.ping_group_range with userns
* [release/1.7 backport] vendor: github.com/containerd/zfs v1.1.0 ([#8782](containerd/containerd#8782))
  * [`d5639a5a8`](containerd/containerd@d5639a5) vendor: github.com/containerd/zfs v1.1.0
* [release/1.7 backport] ci: remove libseccomp-dev installation for nightly ([#8772](containerd/containerd#8772))
  * [`15d65709e`](containerd/containerd@15d6570) ci: remove libseccomp-dev installation for nightly
* [release/1.7] go.mod: Update cgroups to 3.0.2 ([#8769](containerd/containerd#8769))
  * [`a08ae718c`](containerd/containerd@a08ae71) [release/1.7] go.mod: Update cgroups to 3.0.2
* [release/1.7 backport] runtime/v2/runc: handle early exits w/o big locks ([#8712](containerd/containerd#8712))
  * [`18c6503d9`](containerd/containerd@18c6503) runtime/v2/runc: handle early exits w/o big locks
* [release/1.7 backport] integration/client: add timeout to `TestShimOOMScore` ([#8750](containerd/containerd#8750))
  * [`3bf3996d9`](containerd/containerd@3bf3996) integration/client: add timeout to `TestShimOOMScore`
* [release/1.7 backport] Update ginkgo to match cri-tools' version ([#8760](containerd/containerd#8760))
  * [`c2c54af9d`](containerd/containerd@c2c54af) Update ginkgo to match cri-tools' version
* [release/1.7 backport] seccomp: always allow name_to_handle_at ([#8753](containerd/containerd#8753))
  * [`6281d46df`](containerd/containerd@6281d46) seccomp: always allow name_to_handle_at
* [release/1.7] Pinned image support ([#8718](containerd/containerd#8718))
  * [`699d6701a`](containerd/containerd@699d670) Pinned image support
* [release/1.7] cherry-pick: No more nondistributable layers in MS registry ([#8690](containerd/containerd#8690))
  * [`dafbeb5b1`](containerd/containerd@dafbeb5) No more nondistributable layers in MS registry
* [release/1.7] [cri] Windows Pod Stats: Add a check to skip stats for containers that are not running ([#8654](containerd/containerd#8654))
  * [`58b6b99cd`](containerd/containerd@58b6b99) Add a check to skip stats for containers that are not running
* [release/1.7 backport] task: don't `close()` io before `cancel()` ([#8658](containerd/containerd#8658))
  * [`e5b2a0131`](containerd/containerd@e5b2a01) task: don't `close()` io before `cancel()`
* [release/1.7 backport] move to CRI-TOOLS v1.27.0 ([#8656](containerd/containerd#8656))
  * [`a6a15afe3`](containerd/containerd@a6a15af) move to CRI-TOOLS v1.27.0
* [release/1.7] Remove cni conf_template deprecation ([#8638](containerd/containerd#8638))
  * [`0b2b96479`](containerd/containerd@0b2b964) RELEASES.md: de-deprecation of CNI conf_template will be v1.7.3
  * [`a24267b28`](containerd/containerd@a24267b) Remove cni conf_template deprecation
* [release/1.7] Fix issue for HPC pod metrics ([#8634](containerd/containerd#8634))
  * [`89415fe36`](containerd/containerd@89415fe) Fix issue for HPC pod metrics
</p>
</details>
<details><summary>49 commits</summary>
<p>

* gofumpt and update status badges ([#75](containerd/zfs#75))
  * [`5e3457b`](containerd/zfs@5e3457b) TestZFSUsage: use t.TempDir()
  * [`6e9c675`](containerd/zfs@6e9c675) README: update badges
  * [`ff17a79`](containerd/zfs@ff17a79) gofmt code
* go.mod: github.com/mistifyio/go-zfs/v3 v3.0.1 ([#73](containerd/zfs#73))
  * [`d3485b9`](containerd/zfs@d3485b9) go.mod: github.com/mistifyio/go-zfs/v3 v3.0.1
* gha: fix golangci-lint, and upgrade to v1.52.2 ([#74](containerd/zfs#74))
  * [`23c831a`](containerd/zfs@23c831a) remove pre-go1.17 build-tags, and fix missing build-tags in plugin
  * [`e5acd95`](containerd/zfs@e5acd95) gha: fix golangci-lint, upgrade to v1.52.2
* Bump github.com/containerd/containerd from 1.6.12 to 1.6.18 ([#72](containerd/zfs#72))
  * [`00b96c2`](containerd/zfs@00b96c2) Bump github.com/containerd/containerd from 1.6.12 to 1.6.18
* Bump github.com/containerd/containerd from 1.6.9 to 1.6.12 ([#69](containerd/zfs#69))
  * [`a099def`](containerd/zfs@a099def) Bump github.com/containerd/containerd from 1.6.9 to 1.6.12
* Add CodeQL analysis workflow ([#67](containerd/zfs#67))
  * [`fee1db7`](containerd/zfs@fee1db7) Add CodeQL analysis workflow
* Update GitHub actions CI workflow ([#66](containerd/zfs#66))
  * [`b8b7ab2`](containerd/zfs@b8b7ab2) Update GitHub actions CI workflow
* Upgrade compiler to Go 1.19 and update dependencies ([#68](containerd/zfs#68))
  * [`3e729b3`](containerd/zfs@3e729b3) Update dependencies
  * [`3c003f8`](containerd/zfs@3c003f8) Upgrade compiler to Go 1.19
* Remove references to io/ioutil package ([#65](containerd/zfs#65))
  * [`d700762`](containerd/zfs@d700762) Remove references to io/ioutil package
* Update go.mod and move to supported Go version ([#62](containerd/zfs#62))
  * [`f52906e`](containerd/zfs@f52906e) Update Go version to supported version
  * [`79ca2cb`](containerd/zfs@79ca2cb) Update containerd depedency to latest
* go.mod: github.com/mistifyio/go-zfs v3.0.0 ([#59](containerd/zfs#59))
  * [`2e3db29`](containerd/zfs@2e3db29) go.mod: github.com/mistifyio/go-zfs v3.0.0
* go.mod: github.com/mistifyio/go-zfs/v3 v3.0.0-20220217145925-d014733a5309 ([#58](containerd/zfs#58))
  * [`d904e63`](containerd/zfs@d904e63) go.mod: github.com/mistifyio/go-zfs/v3 v3.0.0-20220217145925-d014733a5309
* Update vendoring to containerd 1.6.x ([#57](containerd/zfs#57))
  * [`e021180`](containerd/zfs@e021180) Update vendoring to containerd 1.6.x
* Bump github.com/containerd/containerd from 1.5.8 to 1.5.9 ([#55](containerd/zfs#55))
  * [`fc0c9a9`](containerd/zfs@fc0c9a9) Bump github.com/containerd/containerd from 1.5.8 to 1.5.9
* Bump github.com/containerd/containerd from 1.5.5 to 1.5.8 ([#54](containerd/zfs#54))
  * [`5d2f28c`](containerd/zfs@5d2f28c) Bump github.com/containerd/containerd from 1.5.5 to 1.5.8
* follow-up-#52: fix the order of cause in fmt.Errorf ([#53](containerd/zfs#53))
  * [`b3f193d`](containerd/zfs@b3f193d) follow-up-#52: fix the order of cause in fmt.Errorf
* replace pkg/errors ([#52](containerd/zfs#52))
  * [`d5b0a2f`](containerd/zfs@d5b0a2f) replace pkg/errors
* Bump github.com/containerd/containerd from 1.5.2 to 1.5.4 ([#51](containerd/zfs#51))
  * [`fd6afa5`](containerd/zfs@fd6afa5) Bump github.com/containerd/containerd from 1.5.2 to 1.5.4
* Bump containerd to 1.5.2 ([#50](containerd/zfs#50))
  * [`aef875e`](containerd/zfs@aef875e) bump containerd to 1.5.2
* Rename branches from master to main ([#49](containerd/zfs#49))
  * [`35c6af7`](containerd/zfs@35c6af7) Rename branches from master to main
* sync up with containerd 1.5 GA  ([#47](containerd/zfs#47))
  * [`3d5efef`](containerd/zfs@3d5efef) vendor sync up with containerd 1.5 ga
* README.md: fix CI badge ([#46](containerd/zfs#46))
  * [`0977d81`](containerd/zfs@0977d81) README.md: fix CI badge
</p>
</details>

* **github.com/containerd/cgroups/v3**  v3.0.1 -> v3.0.2
* **github.com/containerd/zfs**         v1.0.0 -> v1.1.0
* **github.com/mistifyio/go-zfs/v3**    v3.0.1 **_new_**

Previous release can be found at [v1.7.2](https://github.com/containerd/containerd/releases/tag/v1.7.2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants