Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug
On a KVM guest, when a CPU is taken offline and brought back online, we hit the following NULL pointer dereference: [ 45.400843] Unregister pv shared memory for cpu 1 [ 45.412331] smpboot: CPU 1 is now offline [ 45.529894] SMP alternatives: lockdep: fixing up alternatives [ 45.533472] smpboot: Booting Node 0 Processor 1 APIC 0x1 [ 45.411526] kvm-clock: cpu 1, msr 0:7d14601, secondary cpu clock [ 45.571370] KVM setup async PF for cpu 1 [ 45.572331] kvm-stealtime: cpu 1, msr 7d0e040 [ 45.575031] BUG: unable to handle kernel NULL pointer dereference at (null) [ 45.576017] IP: [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80 [ 45.576017] PGD 5dfb067 PUD 5da8067 PMD 0 [ 45.576017] Oops: 0000 [#1] SMP [ 45.576017] Modules linked in: [ 45.576017] CPU 0 [ 45.576017] Pid: 607, comm: stress_cpu_hotp Not tainted 3.6.0-padata-tp-debug #3 Bochs Bochs [ 45.576017] RIP: 0010:[<ffffffff81519f98>] [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80 [ 45.576017] RSP: 0018:ffff880005d93ce8 EFLAGS: 00010286 [ 45.576017] RAX: ffff880005d93fd8 RBX: 0000000000000000 RCX: 0000000000000006 [ 45.576017] RDX: 0000000000000006 RSI: 2222222222222222 RDI: 0000000000000000 [ 45.576017] RBP: ffff880005d93cf8 R08: 2222222222222222 R09: 2222222222222222 [ 45.576017] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 45.576017] R13: 0000000000000000 R14: ffffffff81c8cca0 R15: 0000000000000001 [ 45.576017] FS: 00007f91936ae700(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000 [ 45.576017] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 45.576017] CR2: 0000000000000000 CR3: 0000000005db3000 CR4: 00000000000006f0 [ 45.576017] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.576017] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 45.576017] Process stress_cpu_hotp (pid: 607, threadinfo ffff880005d92000, task ffff8800066bbf40) [ 45.576017] Stack: [ 45.576017] ffff880007a96400 0000000000000000 ffff880005d93d28 ffffffff813ac689 [ 45.576017] ffff880007a96400 ffff880007a96400 0000000000000002 ffffffff81cd8d01 [ 45.576017] ffff880005d93d58 ffffffff813aa498 0000000000000001 00000000ffffffdd [ 45.576017] Call Trace: [ 45.576017] [<ffffffff813ac689>] acpi_processor_hotplug+0x55/0x97 [ 45.576017] [<ffffffff813aa498>] acpi_cpu_soft_notify+0x93/0xce [ 45.576017] [<ffffffff816ae47d>] notifier_call_chain+0x5d/0x110 [ 45.576017] [<ffffffff8109730e>] __raw_notifier_call_chain+0xe/0x10 [ 45.576017] [<ffffffff81069050>] __cpu_notify+0x20/0x40 [ 45.576017] [<ffffffff81069085>] cpu_notify+0x15/0x20 [ 45.576017] [<ffffffff816978f1>] _cpu_up+0xee/0x137 [ 45.576017] [<ffffffff81697983>] cpu_up+0x49/0x59 [ 45.576017] [<ffffffff8168758d>] store_online+0x9d/0xe0 [ 45.576017] [<ffffffff8140a9f8>] dev_attr_store+0x18/0x30 [ 45.576017] [<ffffffff812322c0>] sysfs_write_file+0xe0/0x150 [ 45.576017] [<ffffffff811b389c>] vfs_write+0xac/0x180 [ 45.576017] [<ffffffff811b3be2>] sys_write+0x52/0xa0 [ 45.576017] [<ffffffff816b31e9>] system_call_fastpath+0x16/0x1b [ 45.576017] Code: 48 c7 c7 40 e5 ca 81 e8 07 d0 18 00 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 10 48 89 5d f0 4c 89 65 f8 48 89 fb <f6> 07 02 75 13 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 84 00 00 [ 45.576017] RIP [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80 [ 45.576017] RSP <ffff880005d93ce8> [ 45.576017] CR2: 0000000000000000 [ 45.656079] ---[ end trace 433d6c9ac0b02cef ]--- Analysis: Commit 3d339dc (cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure()) made the allocation of the dev structure (struct cpuidle) of a CPU dynamic, whereas previously it was statically allocated. And this dynamic allocation occurs in acpi_processor_power_init() if pr->flags.power evaluates to non-zero. On KVM guests, pr->flags.power evaluates to zero, hence dev is never allocated. This causes the NULL pointer (dev) dereference in cpuidle_disable_device() during a subsequent CPU online operation. Fix this by ensuring that dev is non-NULL before dereferencing. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Len Brown <len.brown@intel.com>
- Loading branch information