Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix rest of the comments #37

Open
wants to merge 404 commits into
base: psi
Choose a base branch
from
Open

fix rest of the comments #37

wants to merge 404 commits into from

Conversation

szuecs
Copy link

@szuecs szuecs commented Nov 4, 2022

Signed-off-by: Sandor Szücs sandor.szuecs@zalando.de

thaJeztah and others added 17 commits November 8, 2022 22:32
….1.0

go.mod: golang.org/x/*: use tagged versions
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.1.0 to 0.2.0.
- [Release notes](https://github.com/golang/sys/releases)
- [Commits](golang/sys@v0.1.0...v0.2.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
…/go_modules/golang.org/x/sys-0.2.0

build(deps): bump golang.org/x/sys from 0.1.0 to 0.2.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.1.0 to 0.2.0.
- [Release notes](https://github.com/golang/net/releases)
- [Commits](golang/net@v0.1.0...v0.2.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: yaozhenxiu <946666800@qq.com>
Signed-off-by: Jonas Eschenburg <jonas.eschenburg@kuka.com>
Signed-off-by: Jonas Eschenburg <jonas.eschenburg@kuka.com>
Init State Error message was using the err variable instead of uerr, which has been fixed now.
The error message should not show "nil" now.

Signed-off-by: Vipul Newaskar <vipulnewaskar7@gmail.com>
…/go_modules/golang.org/x/net-0.2.0

build(deps): bump golang.org/x/net from 0.1.0 to 0.2.0
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
…-error-variable-fix

Fixed Init State Error Variable
Vagrantfile.fedora: upgrade Fedora to 37
Fix a few copy-paste errors.

Fixes: 520702d
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Amend runc features to print seccomp flags. Two set of flags are added:
 * known flags are those that this version of runc is aware of;
 * supported flags are those that can be set; normally, this is the same
   set as known flags, but due to older version of kernel and/or
   libseccomp, some known flags might be unsupported.

This commit also consolidates three different switch statements dealing
with flags into one, in func setFlag. A note is added to this function
telling what else to look for when adding new flags.

Unfortunately, it also adds a list of known flags, that should be
kept in sync with the switch statement.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
If no seccomps flags are set in OCI runtime spec (not even the empty
set), set SPEC_ALLOW as the default (if it's supported).

Otherwise, use the flags as they are set (that includes no flags for
empty seccomp.Flags array).

This mimics the crun behavior, and makes runc seccomp performance on par
with crun.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test (initially added by commit 58ea21d and later amended in
commit 26dc55e) currently has two major deficiencies:

1. All possible flag combinations, and their respective numeric values,
   have to be explicitly listed. Currently we support 3 flags, so
   there is only 2^3 - 1 = 7 combinations, but adding more flags will
   become increasingly difficult (for example, 5 flags will result in
   31 combinations).

2. The test requires kernel 4.17 (for SECCOMP_FILTER_FLAG_SPEC_ALLOW),
   and not doing any tests when running on an older kernel. This, too,
   will make it more difficult to add extra flags in the future.

Both issues can be solved by using runc features which now prints all
known and supported runc flags. We still have to hardcode the numeric
values of all flags, but most of the other work is coded now.

In particular:

 * The test only uses supported flags, meaning it can be used with
   older kernels, removing the limitation (2) above.

 * The test calculates the powerset (all possible combinations) of
   flags and their numeric values. This makes it easier to add more
   flags, removing the limitation (1) above.

 * The test will fail (in flags_value) if any new flags will be added
   to runc but the test itself is not amended.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
dependabot bot and others added 10 commits December 8, 2022 04:04
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.2.0 to 0.4.0.
- [Release notes](https://github.com/golang/net/releases)
- [Commits](golang/net@v0.2.0...v0.4.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
…/go_modules/golang.org/x/net-0.4.0

build(deps): bump golang.org/x/net from 0.2.0 to 0.4.0
notify_socket.go: use sd_notify_barrier mechanism
Merge the logic of setPageServer, setManageCgroupsMode, and
setEmptyNsMask into criuOptions. This does three things:

1. Fixes ignoring --manage-cgroups-mode on restore;
2. Simplifies the code in checkpoint.go and restore.go;
3. Ensures issues like 1 won't happen again.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
 - add the new mode and document it;
 - slightly improve the --help output;
 - slightly simplify the parsing code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When manage-cgroups-mode: ignore is used, criu still needs to know the
cgroup path to work properly (see [1]).

Revert "libct/criuApplyCgroups: don't set cgroup paths for v2"

This reverts commit d5c57dc.

[1]: checkpoint-restore/criu#1793 (comment)

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I don't want to implement it now, because this might result in some
new issues, but this is definitely something that is worth implementing.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test checks that the container is restored into a different cgroup.

To do so, a user should
 - use --manage-cgroups-mode ignore on both checkpoint and restore;
 - change the cgroupsPath value in config.json before restoring.

The test does some checks to ensure that its logic is correct, and that
after the restore the old (original) cgroup does not exist, the new one
exists, and the container's init is in that new cgroup.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When doing a lazy checkpoint/restore, we should not restore into the
same cgroup, otherwise there is a race which result in occasional
killing of the restored container (GH opencontainers#2760, opencontainers#2924).

The fix is to use --manage-cgroup-mode=ignore, which allows to restore
into a different cgroup.

Note that since cgroupsPath is not set in config.json, the cgroup is
derived from the container name, so calling set_cgroups_path is not
needed.

For the previous (unsuccessful) attempt to fix this, as well as detailed
(and apparently correct) analysis, see commit 36fe3cc.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin and others added 29 commits June 7, 2023 14:39
tests/int: increase num retries for oom tests
It seems that set -x was temporarily added as a debug measure, but
slipped into the final commit.

Remove it, for the sake of test logs brevity.

Fixes: 9f656db
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is roughly the same as TestPIDHostInitProcessWait in libct/int,
except that here we use separate processes to create and to kill a
container, so the processes inside a container are not children of "runc kill", and
also we hit different codepaths (nonChildProcess.signal rather than
initProcess.signal).

One other thing is, rootless is also tested.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There are two very distinct usage scenarios for signalAllProcesses:

* when used from the runc binary ("runc kill" command), the processes
  that it kills are not the children of "runc kill", and so calling
  wait(2) on each process is totally useless, as it will return ECHLD;

* when used from a program that have created the container (such as
  libcontainer/integration test suite), that program can and should call
  wait(2), not the signalling code.

So, the child reaping code is totally useless in the first case, and
should be implemented by the program using libcontainer in the second
case. I was not able to track down how this code was added, my best
guess is it happened when this code was part of dockerd, which did not
have a proper child reaper implemented at that time.

Remove it, and add a proper documentation piece.

Change the integration test accordingly.

PS the first attempt to disable the child reaping code in
signalAllProcesses was made in commit bb912eb, which used a
questionable heuristic to figure out whether wait(2) should be called.
This heuristic worked for a particular use case, but is not correct in
general.

While at it:
 - simplify signalAllProcesses to use unix.Kill;
 - document (container).Signal.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When someone is using libcontainer to start and kill containers from a
long lived process (i.e. the same process creates and removes the
container), initProcess.wait method is used, which has a kludge to work
around killing containers that do not have their own PID namespace.

The code that checks for own PID namespace is not entirely correct.
To be exact, it does not set sharePidns flag when the host/caller PID
namespace is implicitly used. As a result, the above mentioned kludge
does not work.

Fix the issue, add a test case (which fails without the fix).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
By default, the container has its own PID namespace, and killing (with
SIGKILL) its init process from the parent PID namespace also kills all
the other processes.

Obviously, it does not work that way when the container is sharing its
PID namespace with the host or another container, since init is no
longer special (it's not PID 1). In this case, killing container's init
will result in a bunch of other processes left running (and thus the
inability to remove the cgroup).

The solution to the above problem is killing all the container
processes, not just init.

The problem with the current implementation is, the killing logic is
implemented in libcontainer's initProcess.wait, and thus only available
to libcontainer users, but not the runc kill command (which uses
nonChildProcess.kill and does not use wait at all). So, some workarounds
exist:
 - func destroy(c *Container) calls signalAllProcesses;
 - runc kill implements -a flag.

This code became very tangled over time. Let's simplify things by moving
the killing all processes from initProcess.wait to container.Signal,
and documents the new behavior.

In essence, this also makes `runc kill` to automatically kill all container
processes when the container does not have its own PID namespace.
Document that as well.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As of previous commit, this is implied in a particular scenario. In
fact, this is the one and only scenario that justifies the use of -a.

Drop the option from the documentation. For backward compatibility, do
recognize it, and retain the feature of ignoring the "container is
stopped" error when set.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Bumps [github.com/sirupsen/logrus](https://github.com/sirupsen/logrus) from 1.9.2 to 1.9.3.
- [Release notes](https://github.com/sirupsen/logrus/releases)
- [Changelog](https://github.com/sirupsen/logrus/blob/master/CHANGELOG.md)
- [Commits](sirupsen/logrus@v1.9.2...v1.9.3)

---
updated-dependencies:
- dependency-name: github.com/sirupsen/logrus
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
…/go_modules/github.com/sirupsen/logrus-1.9.3
Filter out rdma controller since systemd is unable to delegate it.
Similar to commits 0527271 and 601cf58.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Rename CGROUP_PATH to CGROUP_V2_PATH so it is more clear that it can
only be used for CGROUP_V2, and to resolve ambiguity with CGROUP_PATH
variable used in tests/rootless.sh.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We were not running localrootlessintegration test on CentOS Stream 9
because of some failures fixed by previous commits.

Enable rootless integration with both systemd and fs drivers.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Kir Kolyshkin (7):
  libct: implement support for cgroup.kill
  runc kill: drop -a option
  libct: move killing logic to container.Signal
  libct: fix shared pidns detection
  libct: signalAllProcesses: remove child reaping
  tests/int/kill: add kill -a with host pidns test
  tests/rootless.sh: drop set -x

LGTMs: AkihiroSuda cyphar
Closes opencontainers#3825
ci/cirrus: enable some rootless tests on cs9
We read output from the following files if they exists:
- cpu.pressure
- memory.pressure
- io.pressure

Each are in format:

```
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
```

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.