Skip to content
This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

Crash in scx_cgroup_can_attach() when testing scx_lavd #164

Closed
Byte-Lab opened this issue Mar 18, 2024 · 10 comments
Closed

Crash in scx_cgroup_can_attach() when testing scx_lavd #164

Byte-Lab opened this issue Mar 18, 2024 · 10 comments

Comments

@Byte-Lab
Copy link
Collaborator

Byte-Lab commented Mar 18, 2024

As described in sched-ext/scx#192 (comment), @SoulHarsh007 managed to trigger a warning in the kernel when testing scx_lavd. In his words:

👋 I am not sure if this is the right place to report this, but I find the following kernel oops when using scx_lavd: https://paste.soulharsh007.dev/p/43a4476.log

Kernel Version: 6.8.1-2-cachyos

Let's track fixing it here.

@Byte-Lab
Copy link
Collaborator Author

This is where the warning is being triggered: https://github.com/sched-ext/sched_ext/blob/sched_ext/kernel/sched/ext.c#L2714-L2726. So for some reason we seem to have already started to try to move that task between cgroups.

@htejun something I'm not quite understanding from reading the code. In scx_cgroup_can_attach() we clean up and drop the scx_cgroup_rwsem here: https://github.com/sched-ext/sched_ext/blob/sched_ext/kernel/sched/ext.c#L2730-L2741, but then we clean up and drop the lock again in scx_cgroup_cancel_attach(): https://github.com/sched-ext/sched_ext/blob/sched_ext/kernel/sched/ext.c#L2784-L2793. From looking at cgroup_migrate_execute(), it looks like we we'd hit the cancel path and double free the lock. What am I missing?

@Byte-Lab
Copy link
Collaborator Author

Ah, nevermind, we bail out when we get to the subsys where the failure happened before we call into the cancel callback.

@SoulHarsh007
Copy link

👋 After trying to find a way to repro this issue consistently, I found:

  • The oops occurs only once per boot.
  • Mostly happens when I start multiple processes at the same time, for example:
    • Starting a systemd-nspawn container with tasks like system updates.
    • Compiling packages like mesa or the Linux kernel.

@htejun
Copy link
Collaborator

htejun commented Mar 18, 2024

I tried to reproduce the issue locally but haven't been successful. I'm traveling this week so my testing is rather limited. @SoulHarsh007, any chance you can apply the following patch to the kernel and see whether either of the printks trigger? Please note that the dump is a WARN_ON_ONCE triggering and it could be that the resolution is just removing the warning as it can trigger spuriously. ie. it may be harmless.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6da5eb1e0fd36..e41cfb5a3ec11 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10574,8 +10574,12 @@ void sched_move_task(struct task_struct *tsk)
 	 * group changes.
 	 */
 	group = sched_get_task_group(tsk);
-	if (group == tsk->sched_task_group)
+	if (group == tsk->sched_task_group && tsk->scx.cgrp_moving_from) {
+		printk_deferred("SCXXX: sched_move_task: %s[%d] src==dst==%s but cgrp_moving_from is set\n",
+				tsk->comm, tsk->pid,
+				group->css.cgroup ? group->css.cgroup->kn->name : "<N/A>");
 		return;
+	}
 
 	update_rq_clock(rq);
 
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index d34271845f493..ee719f561a2e0 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2752,8 +2752,12 @@ void scx_move_task(struct task_struct *p)
 	 * cgroup_move(), because cgroup migrations never happen for PF_EXITING
 	 * tasks.
 	 */
-	if (p->flags & PF_EXITING || task_group_is_autogroup(task_group(p)))
+	if (p->flags & PF_EXITING || task_group_is_autogroup(task_group(p))) {
+		if (p->scx.cgrp_moving_from)
+			printk_deferred("SCXXX: scx_move_task: %s[%d] scx_move_task w/ from set but flags=0x%x autogroup=%d\n",
+					p->comm, p->pid, p->flags, task_group_is_autogroup(task_group(p)));
 		return;
+	}
 
 	if (!scx_enabled())
 		return;

@SoulHarsh007
Copy link

SoulHarsh007 commented Mar 18, 2024

👋 log after this patch: https://paste.soulharsh007.dev/p/6a2065a.log

Edit: after exiting the container (systemd-nspawn), these messages are printed: https://paste.soulharsh007.dev/p/b580443.log

@htejun
Copy link
Collaborator

htejun commented Mar 19, 2024

Ah, thanks, yeah, that makes sense. Let me study the code further to understand why that's happening but the warning is triggering spuriously and the resolution most likely is just removing it.

@htejun
Copy link
Collaborator

htejun commented Mar 19, 2024

Can you please test whether this patch resolves the issue?

Thanks.

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index cf4f68bf191fb..ca44bee6a1b37 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2713,6 +2713,17 @@ int scx_cgroup_can_attach(struct cgroup_taskset *tset)
 
 	cgroup_taskset_for_each(p, css, tset) {
 		struct cgroup *from = tg_cgrp(task_group(p));
+		struct cgroup *to = tg_cgrp(css_tg(css));
+
+		WARN_ON_ONCE(p->scx.cgrp_moving_from);
+
+		/*
+		 * sched_move_task() omits identity migrations. Let's match the
+		 * behavior so that ops.cgroup_prep_move() and ops.cgroup_move()
+		 * always match one-to-one.
+		 */
+		if (from == to)
+			continue;
 
 		if (SCX_HAS_OP(cgroup_prep_move)) {
 			ret = SCX_CALL_OP_RET(SCX_KF_SLEEPABLE, cgroup_prep_move,
@@ -2721,7 +2732,6 @@ int scx_cgroup_can_attach(struct cgroup_taskset *tset)
 				goto err;
 		}
 
-		WARN_ON_ONCE(p->scx.cgrp_moving_from);
 		p->scx.cgrp_moving_from = from;
 	}
 
@@ -2729,9 +2739,7 @@ int scx_cgroup_can_attach(struct cgroup_taskset *tset)
 
 err:
 	cgroup_taskset_for_each(p, css, tset) {
-		if (!p->scx.cgrp_moving_from)
-			break;
-		if (SCX_HAS_OP(cgroup_cancel_move))
+		if (SCX_HAS_OP(cgroup_cancel_move) && p->scx.cgrp_moving_from)
 			SCX_CALL_OP(SCX_KF_SLEEPABLE, cgroup_cancel_move, p,
 				    p->scx.cgrp_moving_from, css->cgroup);
 		p->scx.cgrp_moving_from = NULL;
@@ -2759,12 +2767,13 @@ void scx_move_task(struct task_struct *p)
 	if (!scx_enabled())
 		return;
 
-	if (SCX_HAS_OP(cgroup_move)) {
-		if (WARN_ON_ONCE(!p->scx.cgrp_moving_from))
-			return;
+	/*
+	 * @p must have ops.cgroup_prep_move() called on it and thus
+	 * cgrp_moving_from set.
+	 */
+	if (SCX_HAS_OP(cgroup_move) && !WARN_ON_ONCE(!p->scx.cgrp_moving_from))
 		SCX_CALL_OP_TASK(SCX_KF_UNLOCKED, cgroup_move, p,
 			p->scx.cgrp_moving_from, tg_cgrp(task_group(p)));
-	}
 	p->scx.cgrp_moving_from = NULL;
 }
 
@@ -2782,11 +2791,9 @@ void scx_cgroup_cancel_attach(struct cgroup_taskset *tset)
 		goto out_unlock;
 
 	cgroup_taskset_for_each(p, css, tset) {
-		if (SCX_HAS_OP(cgroup_cancel_move)) {
-			WARN_ON_ONCE(!p->scx.cgrp_moving_from);
+		if (SCX_HAS_OP(cgroup_cancel_move) && p->scx.cgrp_moving_from)
 			SCX_CALL_OP(SCX_KF_SLEEPABLE, cgroup_cancel_move, p,
 				    p->scx.cgrp_moving_from, css->cgroup);
-		}
 		p->scx.cgrp_moving_from = NULL;
 	}
 out_unlock:

@SoulHarsh007
Copy link

👋 I applied #165 (thanks to @sirlucjan for helping me with the patches) and the error does not seem to appear anymore. I have tested it multiple times and can confirm the error message is now gone!

sirlucjan added a commit to CachyOS/kernel-patches that referenced this issue Mar 19, 2024
Fix: sched-ext/sched_ext#164
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
@htejun
Copy link
Collaborator

htejun commented Mar 19, 2024

Thank you so much for confirming the fix. Also, thanks @sirlucjan for the help.

@htejun
Copy link
Collaborator

htejun commented Mar 19, 2024

Fix is now available in https://github.com/sched-ext/scx-kernel-releases/releases/tag/v6.8.0-scx2.

@htejun htejun closed this as completed Mar 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants