Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: handling of cgroup driver setting #864

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

z63d
Copy link
Contributor

@z63d z63d commented Feb 19, 2025

fix: #821

Fixes an issue where cgroupfs was always used as the cgroup driver.
Receives SystemdCgroup from containerd and allows you to configure the cgroup driver. (There is a deserialize issue, so we'll use a regular expression to work around it.)
Uses systemd as the default cgroup driver.

@Mossaka
Copy link
Member

Mossaka commented Feb 19, 2025

It looks like one of the test failed with the following error:

Error: failed to stop unit youki-test.scope: dbus error: dbus function call error: Unit youki-test.scope not loaded.

Caused by:
    failed to stop unit youki-test.scope: dbus error: dbus function call error: Unit youki-test.scope not loaded.

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
   2: containerd_shim_wasm::testing::WasiTest<WasiInstance>::wait
   3: containerd_shim_wasmedge::wasmedge_tests::test_seccomp::{{closure}}
   4: core::ops::function::FnOnce::call_once
   5: serial_test::serial_code_lock::local_serial_core_with_return
   6: containerd_shim_wasmedge::wasmedge_tests::test_seccomp
   7: containerd_shim_wasmedge::wasmedge_tests::test_seccomp::{{closure}}
   8: core::ops::function::FnOnce::call_once
   9: test::__rust_begin_short_backtrace
  10: test::run_test::{{closure}}
  11: std::sys::backtrace::__rust_begin_short_backtrace
  12: core::ops::function::FnOnce::call_once{{vtable.shim}}
  13: std::sys::pal::unix::thread::Thread::new::thread_start
test wasmedge_tests::test_seccomp ... FAILED
test wasmedge_tests::test_unreachable ... ok

failures:

failures:
    wasmedge_tests::test_seccomp

@z63d z63d force-pushed the fix/cgroup-driver branch from 79004b2 to b7ba5e9 Compare February 19, 2025 09:16
@z63d
Copy link
Contributor Author

z63d commented Feb 19, 2025

hmm...
This seems to have been a temporary issue. I ran it again and it passed without any issues.

But other problems are occurring...
This did not happen when I was testing with K3s.

  Warning  FailedCreatePodSandBox  6s (x21 over 4m47s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: failed to create container: intermediate process error cgroup error: systemd error: dbus error: dbus connection error: ENOENT: No such file or directory. error during cleanup: systemd error: dbus error: dbus connection error: ENOENT: No such file or directory: unknown

@@ -17,5 +17,5 @@ ADD dist/bin/* /usr/local/bin/
ARG RUNTIME
RUN cat <<EOF >> /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.wasm]
runtime_type = "io.containerd.${RUNTIME}.v1"
runtime_path = "io.containerd.${RUNTIME}.v1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully follow the discussion in Slack about this, but I would expect runtime_path to be a path, while this is not a path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the kind test fails.
This is what I wanted to do. Set runtime_type to path.
Is this difference okay?

runtime_type = "$PWD/dist/bin/containerd-shim-$1-v1"

runtime_type = "io.containerd.${RUNTIME}.v1"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is runtime_type a path in k3s but not in k8s? I am confused.

Should runtime_type always be in the form of io.containerd.${RUNTIME}.{version} for k8s and k3s?

@z63d z63d force-pushed the fix/cgroup-driver branch from 1a6b2a1 to 4665961 Compare February 19, 2025 12:42
@github-actions github-actions bot removed the T-CI label Feb 19, 2025
Comment on lines 103 to 145
// Deserialization in the following way doesn't work.
// https://github.com/containerd/rust-extensions/blob/shim-v0.8.0/crates/runc-shim/src/runc.rs#L85-L88
// Because req.options gives us the following string: "\u{1a}\u{15}SystemdCgroup = true\n"
// So use regex to work around this issue.
fn parse_systemd_cgroup(input: &str) -> Option<bool> {
let regex: Regex = Regex::new(r#"\u{1a}\u{15}SystemdCgroup = (true|false)\n"#).unwrap();
if let Some(captures) = regex.captures(input) {
if let Some(value_match) = captures.get(1) {
let value_str = value_match.as_str();
return Some(value_str.parse().unwrap_or(true));
}
}
None
}
let systemd_cgroup = match req.options.as_ref() {
Some(any) => match String::from_utf8(any.value.to_vec()) {
Ok(opts_str) => parse_systemd_cgroup(opts_str.as_str()).unwrap_or(true),
Err(_) => true,
},
None => true,
};

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind trying this?
It needs 2 new dependencies

prost = "0.13"
toml = "0.8"
        #[derive(Deserialize, Default, Clone, PartialEq)]
        struct Config {
            #[serde(alias = "SystemdCgroup")]
            systemd_cgroup: bool,
        }
        
        fn get_config(options: Option<&Any>) -> anyhow::Result<Config> {
            let Some(opts) = options else {
                return Ok(Default::default());
            };

            ensure!(
                opts.type_url == "runtimeoptions.v1.Options",
                "Invalid options type {}",
                opts.type_url
            );

            #[derive(Message, Clone, PartialEq)]
            struct Options {
                #[prost(string)]
                type_url: String,
                #[prost(string)]
                config_path: String,
                #[prost(string)]
                config_body: String,
            }

            let opts = Options::decode(opts.value.as_slice())?;

            let config = toml::from_str(opts.config_body.as_str()).map_err(|err| {
                Error::InvalidArgument(format!("invalid shim options: {err}"))
            })?;

            Ok(config)
        }

        let config = get_config(req.options.as_ref()).map_err(|err| {
            Error::InvalidArgument(format!("invalid shim options: {err}"))
        })?;
        let systemd_cgroup = config.systemd_cgroup;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Works perfectly!
Would it be better to separate theOptions and Config logic into different files?

@z63d z63d force-pushed the fix/cgroup-driver branch 3 times, most recently from 4665961 to 79bebfc Compare February 20, 2025 13:15
@github-actions github-actions bot removed the T-tests label Feb 20, 2025
@z63d z63d force-pushed the fix/cgroup-driver branch from 79bebfc to 4764514 Compare February 20, 2025 13:20
@z63d
Copy link
Contributor Author

z63d commented Feb 20, 2025

This error occurs randomly.
It doesn't seem to be a specific test.
Although the probability is low, there are cases where it does not occur.

Error: failed to stop unit youki-test.scope: dbus error: dbus function call error: Unit youki-test.scope not loaded.

Caused by:
    failed to stop unit youki-test.scope: dbus error: dbus function call error: Unit youki-test.scope not loaded.

Probably this happens too.
I followed the Makefile but it doesn't work locally so I haven't tried it yet.
https://github.com/containerd/runwasi/actions/runs/13409133082/job/37455738328

hmm... This seems to have been a temporary issue. I ran it again and it passed without any issues.

But other problems are occurring... This did not happen when I was testing with K3s.

  Warning  FailedCreatePodSandBox  6s (x21 over 4m47s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: failed to create container: intermediate process error cgroup error: systemd error: dbus error: dbus connection error: ENOENT: No such file or directory. error during cleanup: systemd error: dbus error: dbus connection error: ENOENT: No such file or directory: unknown

@jprendes
Copy link
Collaborator

jprendes commented Feb 21, 2025

I am confused.

From what you observe, it looks like the options have indeed type_url = runtimeoptions.v1.Options.
https://cloud-native.slack.com/archives/C04LTPB6Z0V/p1739716368576559?thread_ts=1738110849.528919&cid=C04LTPB6Z0V
https://github.com/containerd/containerd/blob/main/api/types/runtimeoptions/v1/api.proto

But it also looks like the runc shim is not expecting an option like that, but rather a runc.options.Options.
https://github.com/containerd/containerd/blob/main/cmd/containerd-shim-runc-v2/runc/container.go#L53-L62
https://github.com/containerd/containerd/blob/main/api/types/runc/options/oci.proto

I don't understand why we receive different type of options, or how to tell to containerd the type of protobuf message that we are expecting (i.e., our own runwasi.options.Options)

Also it looks like containerd sends us the TaskOptions, and if that's nil, it sends the RuntimeOptions instead.
Maybe that explains the difference?
https://github.com/containerd/containerd/blob/main/core/runtime/v2/shim.go#L561-L564

@cpuguy83 are you familiar with any of this?

impl Default for Config {
fn default() -> Self {
Self {
systemd_cgroup: true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC what runc is doing, I think this should default to false.
Also, I think defaulting this to false would fix a few of the CI failures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set it to false, most k8s clusters will have metrics bugs unles explicitly specify SystemdCgroup = true in containerd/config.toml.
Is this OK?

Also, I think defaulting this to false would fix a few of the CI failures.

I think the kind problem will be solved.
But when using systemd cgroups, kind pod doesn't seem to work.
As you can see below, kind supports systemd cgroups. It's strange that it doesn't work.
#356 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this OK?

IIUC, that would be the same behaviour as runc, so it´s fine with me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think it's better to err on the side of not using systemd.
If we wrongly DON'T use systemd, the metrincs don't work.
If we wrongly DO use systemd, the container doesn't even start.

@z63d z63d force-pushed the fix/cgroup-driver branch 4 times, most recently from 660c53b to 9191061 Compare February 21, 2025 13:07
@cpuguy83
Copy link
Member

I am confused.

From what you observe, it looks like the options have indeed type_url = runtimeoptions.v1.Options. https://cloud-native.slack.com/archives/C04LTPB6Z0V/p1739716368576559?thread_ts=1738110849.528919&cid=C04LTPB6Z0V https://github.com/containerd/containerd/blob/main/api/types/runtimeoptions/v1/api.proto

But it also looks like the runc shim is not expecting an option like that, but rather a runc.options.Options. https://github.com/containerd/containerd/blob/main/cmd/containerd-shim-runc-v2/runc/container.go#L53-L62 https://github.com/containerd/containerd/blob/main/api/types/runc/options/oci.proto

I don't understand why we receive different type of options, or how to tell to containerd the type of protobuf message that we are expecting (i.e., our own runwasi.options.Options)

Also it looks like containerd sends us the TaskOptions, and if that's nil, it sends the RuntimeOptions instead. Maybe that explains the difference? https://github.com/containerd/containerd/blob/main/core/runtime/v2/shim.go#L561-L564

@cpuguy83 are you familiar with any of this?

@jprendes This is due to this code: https://github.com/containerd/containerd/blob/6f652853f01ef9ba340a860c2f39edf1701102d1/internal/cri/config/config.go#L658-L667

The runwasi shim is not of type io.containerd.runc.v2 so it falls back to the raw options.

@z63d z63d force-pushed the fix/cgroup-driver branch from 9191061 to cb21dc5 Compare February 22, 2025 05:12
Signed-off-by: Kaita Nakamura <kaita.nakamura0830@gmail.com>
@z63d z63d force-pushed the fix/cgroup-driver branch from cb21dc5 to 8b2e146 Compare February 22, 2025 05:41
@z63d
Copy link
Contributor Author

z63d commented Feb 22, 2025

@Mossaka @jprendes
I think CI started failing after the stress test PR was merged.
Do you know why?
https://github.com/containerd/runwasi/actions/runs/13470105500/job/37642539745?pr=864

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pod level metrics are not being surfaced
4 participants