Skip to content

Commit

Permalink
profiling: remove profile=sleep support
Browse files Browse the repository at this point in the history
The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f8 ("sched: Add wrapper for get_wchan()
to keep task blocked"); either booting with

  profile=sleep

kernel command line option added or executing

  # echo -n sleep > /sys/kernel/profiling

after boot causes the system to lock up.

  ============================================
  WARNING: possible recursive locking detected
  5.15.0-rc4+ torvalds#104 Not tainted
  --------------------------------------------
  kthreadd/3 is trying to acquire lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70

  but task is already holding lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&p->pi_lock);
    lock(&p->pi_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  3 locks held by kthreadd/3:
   #0: ffffae5ac2833d68 (&x->wait){....}-{2:2}, at: complete+0x18/0x40
   #1: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370
   #2: ffff93ad29fe9458 (&rq->__lock){-...}-{2:2}, at: try_to_wake_up+0x19f/0x370

  stack backtrace:
  CPU: 0 PID: 3 Comm: kthreadd Not tainted 5.15.0-rc4+ torvalds#104
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
  Call Trace:
   dump_stack_lvl+0x57/0x72
   validate_chain.cold+0x122/0x127
   __lock_acquire+0x4d2/0x930
   lock_acquire+0xc8/0x2f0
   ? get_wchan+0x32/0x70
   ? rcu_read_lock_sched_held+0x3f/0x80
   _raw_spin_lock_irq+0x47/0x90
   ? get_wchan+0x32/0x70
   get_wchan+0x32/0x70
   __update_stats_enqueue_sleeper+0x151/0x430
   enqueue_entity+0x4b0/0x520
   enqueue_task_fair+0x92/0x6b0
   ? lock_is_held_type+0xa7/0x120
   ttwu_do_activate+0x73/0x140
   try_to_wake_up+0x213/0x370
   swake_up_locked+0x20/0x50
   complete+0x2f/0x40
   ? __cancel_work_timer+0x1b0/0x1b0
   kthread+0xfb/0x180
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x22/0x30

However, since nobody noticed this regression for more than 2 years, let's
remove profile=sleep support based on an assumption that nobody needs this
functionality.

Fixes: 42a20f8 ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
  • Loading branch information
Tetsuo Handa authored and intel-lab-lkp committed Aug 4, 2024
1 parent cea5a34 commit 4a54387
Show file tree
Hide file tree
Showing 4 changed files with 2 additions and 24 deletions.
4 changes: 1 addition & 3 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4798,11 +4798,9 @@

profile= [KNL] Enable kernel profiling via /proc/profile
Format: [<profiletype>,]<number>
Param: <profiletype>: "schedule", "sleep", or "kvm"
Param: <profiletype>: "schedule" or "kvm"
[defaults to kernel profiling]
Param: "schedule" - profile schedule points.
Param: "sleep" - profile D-state sleeping (millisecs).
Requires CONFIG_SCHEDSTATS
Param: "kvm" - profile VM exits.
Param: <number> - step/bucket size as a power of 2 for
statistical time based profiling.
Expand Down
1 change: 0 additions & 1 deletion include/linux/profile.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@

#define CPU_PROFILING 1
#define SCHED_PROFILING 2
#define SLEEP_PROFILING 3
#define KVM_PROFILING 4

struct proc_dir_entry;
Expand Down
11 changes: 1 addition & 10 deletions kernel/profile.c
Original file line number Diff line number Diff line change
Expand Up @@ -57,20 +57,11 @@ static DEFINE_MUTEX(profile_flip_mutex);
int profile_setup(char *str)
{
static const char schedstr[] = "schedule";
static const char sleepstr[] = "sleep";
static const char kvmstr[] = "kvm";
const char *select = NULL;
int par;

if (!strncmp(str, sleepstr, strlen(sleepstr))) {
#ifdef CONFIG_SCHEDSTATS
force_schedstat_enabled();
prof_on = SLEEP_PROFILING;
select = sleepstr;
#else
pr_warn("kernel sleep profiling requires CONFIG_SCHEDSTATS\n");
#endif /* CONFIG_SCHEDSTATS */
} else if (!strncmp(str, schedstr, strlen(schedstr))) {
if (!strncmp(str, schedstr, strlen(schedstr))) {
prof_on = SCHED_PROFILING;
select = schedstr;
} else if (!strncmp(str, kvmstr, strlen(kvmstr))) {
Expand Down
10 changes: 0 additions & 10 deletions kernel/sched/stats.c
Original file line number Diff line number Diff line change
Expand Up @@ -92,16 +92,6 @@ void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,

trace_sched_stat_blocked(p, delta);

/*
* Blocking time is in units of nanosecs, so shift by
* 20 to get a milliseconds-range estimation of the
* amount of time that the task spent sleeping:
*/
if (unlikely(prof_on == SLEEP_PROFILING)) {
profile_hits(SLEEP_PROFILING,
(void *)get_wchan(p),
delta >> 20);
}
account_scheduler_latency(p, delta >> 10, 0);
}
}
Expand Down

0 comments on commit 4a54387

Please sign in to comment.