GetCpuLoad Failed on cgroup v2 environment #3137

Open
BSWANG opened this issue Jul 19, 2022 · 3 comments

BSWANG commented Jul 19, 2022

How to reproduce it?

Start cAdvisor with -enable_load_reader on a host running cgroup v2.

The following warning is logged during housekeeping:

W0719 17:10:36.691918 3600017 container.go:589] Failed to update stats for container "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podee447a3b_5409_4bef_afd2_64b7f7835849.slice": failed to get load stat for "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podee447a3b_5409_4bef_afd2_64b7f7835849.slice" - path "/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podee447a3b_5409_4bef_afd2_64b7f7835849.slice", error netlink request failed with error invalid argument

What I found after investigating:

cAdvisor gets CPU load information by sending a CGROUPSTATS_CMD_GET request in a netlink message:

func prepareCmdMessage(id uint16, cfd uintptr) (msg netlinkMessage) {
	buf := bytes.NewBuffer([]byte{})
	addAttribute(buf, unix.CGROUPSTATS_CMD_ATTR_FD, uint32(cfd), 4)
	return prepareMessage(id, unix.CGROUPSTATS_CMD_GET, buf.Bytes())
}

The kernel handles the GET request in cgroupstats_user_cmd:

https://github.com/torvalds/linux/blob/master/kernel/taskstats.c#L407

static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)

It builds the result in cgroupstats_build:
https://github.com/torvalds/linux/blob/5c1ee569660d4a205dced9cb4d0306b907fb7599/kernel/cgroup/cgroup-v1.c#L699

int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
{
……
	/* it should be kernfs_node belonging to cgroupfs and is a directory */
	if (dentry->d_sb->s_type != &cgroup_fs_type || !kn ||
	    kernfs_type(kn) != KERNFS_DIR)
		return -EINVAL;  /* this check is what returns EINVAL on cgroup v2 */

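cgroup_fs_type is the filesystem type of cgroup v1, not cgroup v2 (which is registered as cgroup2_fs_type). For any path on a cgroup v2 mount, the filesystem type check in cgroupstats_build therefore fails and the function returns EINVAL.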
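
From user space, a similar distinction can be seen through statfs(2), which reports a different magic number for each cgroup version. The sketch below is only an illustration of that check, not something cAdvisor is known to do in this code path; classifyCgroupFS and cgroupVersionOf are hypothetical helper names, and the code is Linux-only.

```go
package main

import "syscall"

// Filesystem magic numbers from the kernel's linux/magic.h.
const (
	cgroupV1Magic = 0x27e0eb   // CGROUP_SUPER_MAGIC
	cgroupV2Magic = 0x63677270 // CGROUP2_SUPER_MAGIC
)

// classifyCgroupFS maps a statfs f_type value to a cgroup version label.
func classifyCgroupFS(fsType int64) string {
	switch fsType {
	case cgroupV1Magic:
		return "cgroup v1"
	case cgroupV2Magic:
		return "cgroup v2"
	default:
		return "not a cgroup filesystem"
	}
}

// cgroupVersionOf runs statfs on path and classifies the result.
// Hypothetical helper; a caller could use it to skip the netlink
// request when the path is on a cgroup v2 mount.
func cgroupVersionOf(path string) (string, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return "", err
	}
	return classifyCgroupFS(int64(st.Type)), nil
}
```

On a cgroup v2 host, calling cgroupVersionOf("/sys/fs/cgroup") would be expected to report cgroup v2. On a cgroup v1 host, /sys/fs/cgroup itself is usually a tmpfs, so a controller directory such as /sys/fs/cgroup/cpu would need to be checked instead.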

The same problem has been reported on the kernel mailing list: https://lore.kernel.org/all/20200910055207.87702-1-zhouchengming@bytedance.com/T/#r50c826a171045e42d0b40a552e0d4d1b2a2bab4d

How to resolve it?

Maybe we could read the CPU stats from the cgroup's cpu.pressure file when PSI is enabled, instead of using the CGROUPSTATS_CMD_GET netlink API?
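
As a rough sketch of what reading cpu.pressure could involve, the parser below handles the PSI text format, which has lines such as "some avg10=0.00 avg60=0.00 avg300=0.00 total=0". psiLine and parsePressure are hypothetical names, not existing cAdvisor code. PSI reports the share of time tasks were stalled, which is a different quantity from the task counts that cgroupstats returns, so this would be a substitute signal rather than a drop-in replacement.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// psiLine holds one line ("some" or "full") of a PSI pressure file.
type psiLine struct {
	Avg10, Avg60, Avg300 float64 // percentage of time stalled per window
	Total                uint64  // cumulative stall time in microseconds
}

// parsePressure parses the content of a pressure file and returns the
// parsed lines keyed by their first word ("some" or "full").
func parsePressure(content string) (map[string]psiLine, error) {
	result := make(map[string]psiLine)
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		var p psiLine
		for _, kv := range fields[1:] {
			key, value, ok := strings.Cut(kv, "=")
			if !ok {
				return nil, fmt.Errorf("malformed field %q", kv)
			}
			var err error
			switch key {
			case "avg10":
				p.Avg10, err = strconv.ParseFloat(value, 64)
			case "avg60":
				p.Avg60, err = strconv.ParseFloat(value, 64)
			case "avg300":
				p.Avg300, err = strconv.ParseFloat(value, 64)
			case "total":
				p.Total, err = strconv.ParseUint(value, 10, 64)
			}
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", kv, err)
			}
		}
		result[fields[0]] = p
	}
	return result, nil
}
```

A caller would read the file with os.ReadFile on the cgroup's cpu.pressure path and pass the content as a string. The file is only available when PSI is enabled in the kernel, so a missing file should be treated as "no data" rather than as a fatal error.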

BSWANG commented Jul 26, 2022

@ShadowJonathan

So does this issue effectively mean that cpuLoad is not working on systems with cgroup v2?

@ShadowJonathan

PSI is discussed in #3052
