sync up with linux #137

Closed
wants to merge 2 commits into from
@dabrace dabrace commented Nov 18, 2014

No description provided.

@dabrace dabrace closed this Nov 18, 2014
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Oct 28, 2015
Currently it's assumed that firmware exports only the class of sensors
supported by the driver. However with newer firmware or SCPI protocol
revision, support for newer classes of sensors can be present.

The driver fails to probe with the following warning if an unsupported
class of sensor is encountered in the firmware.

sysfs: cannot create duplicate filename
	'/devices/platform/scpi/scpi:sensors/hwmon/hwmon0/'
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:31
Modules linked in:

CPU: 0 PID: 6 Comm: kworker/u12:0 Not tainted 4.3.0-rc7 #137
Hardware name: ARM Juno development board (r0) (DT)
Workqueue: deferwq deferred_probe_work_func
PC is at sysfs_warn_dup+0x54/0x78
LR is at sysfs_warn_dup+0x54/0x78

This patch fixes the above issue by skipping any unsupported
classes of SCPI sensors.

Fixes: 68acc77 ("hwmon: Support thermal zones registration for SCP temperature sensors")
Fixes: ea98b29 ("hwmon: Support sensors exported via ARM SCP interface")
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
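
The skip-on-unknown-class approach described in the commit message above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver's code: the `demo_` names, the enum values, and the array-based "registration" are all hypothetical stand-ins for whatever the real hwmon driver does.

```c
#include <stddef.h>

/* Hypothetical sensor classes; the real SCPI class list may differ. */
enum demo_sensor_class {
	DEMO_TEMPERATURE,
	DEMO_VOLTAGE,
	DEMO_CURRENT,
	DEMO_POWER,
	DEMO_NEWER_CLASS	/* stands in for a class the driver does not know */
};

struct demo_sensor {
	enum demo_sensor_class class;
	const char *name;
};

/* Returns 1 if this sketch's "driver" can expose the given class. */
static int demo_class_supported(enum demo_sensor_class c)
{
	switch (c) {
	case DEMO_TEMPERATURE:
	case DEMO_VOLTAGE:
	case DEMO_CURRENT:
	case DEMO_POWER:
		return 1;
	default:
		return 0;
	}
}

/*
 * Copy only the supported sensors from 'fw' into 'out' and return how many
 * were kept. Unsupported entries are skipped instead of failing the probe.
 */
static size_t demo_register_supported(const struct demo_sensor *fw, size_t n,
				      struct demo_sensor *out)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!demo_class_supported(fw[i].class))
			continue;	/* unknown class: skip, do not abort */
		out[count++] = fw[i];
	}
	return count;
}
```

With three firmware entries of which one has an unknown class, `demo_register_supported()` keeps the other two and preserves their order, so a newer firmware does not prevent the known sensors from being exposed.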
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Oct 28, 2015
Currently it's assumed that firmware exports only the class of sensors
supported by the driver. However with newer firmware or SCPI protocol
revision, support for newer classes of sensors can be present.

The driver fails to probe with the following warning if an unsupported
class of sensor is encountered in the firmware.

sysfs: cannot create duplicate filename
	'/devices/platform/scpi/scpi:sensors/hwmon/hwmon0/'
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:31
Modules linked in:

CPU: 0 PID: 6 Comm: kworker/u12:0 Not tainted 4.3.0-rc7 #137
Hardware name: ARM Juno development board (r0) (DT)
Workqueue: deferwq deferred_probe_work_func
PC is at sysfs_warn_dup+0x54/0x78
LR is at sysfs_warn_dup+0x54/0x78

This patch fixes the above issue by skipping any unsupported
classes of SCPI sensors.

Fixes: 68acc77 ("hwmon: Support thermal zones registration for SCP temperature sensors")
Fixes: ea98b29 ("hwmon: Support sensors exported via ARM SCP interface")
Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Nov 16, 2015
OMAP CPU hotplug uses CPU1's clocks and power domains to wake CPU1 up
from low-power states (or to turn CPU1 on). This code also runs as
part of system suspend (disable_nonboot_cpus()).
On the other hand, CPU1's clocks and power domains are also used by CPUIdle.
All of these code paths are mutually exclusive and, therefore, the lockless
clkdm/pwrdm API can be used in omap4_boot_secondary().

This fixes the backtrace below on -RT, which is triggered by
pwrdm_lock/unlock():

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
 in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
 9 locks held by sh/118:
  #0:  (sb_writers#4){.+.+.+}, at: [<c0144a6c>] vfs_write+0x13c/0x164
  #1:  (&of->mutex){+.+.+.}, at: [<c01b4c70>] kernfs_fop_write+0x48/0x19c
  #2:  (s_active#24){.+.+.+}, at: [<c01b4c78>] kernfs_fop_write+0x50/0x19c
  #3:  (device_hotplug_lock){+.+.+.}, at: [<c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
  #4:  (&dev->mutex){......}, at: [<c03cd284>] device_online+0x14/0x88
  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<c003af90>] cpu_up+0x50/0x1a0
  #6:  (cpu_hotplug.lock){++++++}, at: [<c003ae48>] cpu_hotplug_begin+0x0/0xc4
  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<c003aec0>] cpu_hotplug_begin+0x78/0xc4
  #8:  (boot_lock){+.+...}, at: [<c002b254>] omap4_boot_secondary+0x1c/0x178
 Preemption disabled at:[<  (null)>]   (null)

 CPU: 0 PID: 118 Comm: sh Not tainted 4.1.12-rt11-01998-gb4a62c3-dirty #137
 Hardware name: Generic DRA74X (Flattened Device Tree)
 [<c0017574>] (unwind_backtrace) from [<c0013be8>] (show_stack+0x10/0x14)
 [<c0013be8>] (show_stack) from [<c05a8670>] (dump_stack+0x80/0x94)
 [<c05a8670>] (dump_stack) from [<c05ad158>] (rt_spin_lock+0x24/0x54)
 [<c05ad158>] (rt_spin_lock) from [<c0030dac>] (clkdm_wakeup+0x10/0x2c)
 [<c0030dac>] (clkdm_wakeup) from [<c002b2c0>] (omap4_boot_secondary+0x88/0x178)
 [<c002b2c0>] (omap4_boot_secondary) from [<c0015d00>] (__cpu_up+0xc4/0x164)
 [<c0015d00>] (__cpu_up) from [<c003b09c>] (cpu_up+0x15c/0x1a0)
 [<c003b09c>] (cpu_up) from [<c03cd2d4>] (device_online+0x64/0x88)
 [<c03cd2d4>] (device_online) from [<c03cd360>] (online_store+0x68/0x74)
 [<c03cd360>] (online_store) from [<c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
 [<c01b4ce0>] (kernfs_fop_write) from [<c0144124>] (__vfs_write+0x20/0xd8)
 [<c0144124>] (__vfs_write) from [<c01449c0>] (vfs_write+0x90/0x164)
 [<c01449c0>] (vfs_write) from [<c01451e4>] (SyS_write+0x44/0x9c)
 [<c01451e4>] (SyS_write) from [<c0010240>] (ret_fast_syscall+0x0/0x54)
 CPU1: smp_ops.cpu_die() returned, trying to resuscitate

Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Nov 18, 2015
Currently it's assumed that firmware exports only the class of sensors
supported by the driver. However with newer firmware or SCPI protocol
revision, support for newer classes of sensors can be present.

The driver fails to probe with the following warning if an unsupported
class of sensor is encountered in the firmware.

sysfs: cannot create duplicate filename
	'/devices/platform/scpi/scpi:sensors/hwmon/hwmon0/'
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:31
Modules linked in:

CPU: 0 PID: 6 Comm: kworker/u12:0 Not tainted 4.3.0-rc7 #137
Hardware name: ARM Juno development board (r0) (DT)
Workqueue: deferwq deferred_probe_work_func
PC is at sysfs_warn_dup+0x54/0x78
LR is at sysfs_warn_dup+0x54/0x78

This patch fixes the above issue by skipping any unsupported
classes of SCPI sensors.

Fixes: 68acc77 ("hwmon: Support thermal zones registration for SCP temperature sensors")
Fixes: ea98b29 ("hwmon: Support sensors exported via ARM SCP interface")
Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Nov 26, 2015
OMAP CPU hotplug uses CPU1's clocks and power domains to wake CPU1 up
from low-power states (or to turn CPU1 on). This code also runs as
part of system suspend (disable_nonboot_cpus()).
On the other hand, CPU1's clocks and power domains are also used by CPUIdle.
All of these code paths are mutually exclusive and, therefore, the lockless
clkdm/pwrdm API can be used in omap4_boot_secondary().

This fixes the backtrace below on -RT, which is triggered by
pwrdm_lock/unlock():

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
 in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
 9 locks held by sh/118:
  #0:  (sb_writers#4){.+.+.+}, at: [<c0144a6c>] vfs_write+0x13c/0x164
  #1:  (&of->mutex){+.+.+.}, at: [<c01b4c70>] kernfs_fop_write+0x48/0x19c
  #2:  (s_active#24){.+.+.+}, at: [<c01b4c78>] kernfs_fop_write+0x50/0x19c
  #3:  (device_hotplug_lock){+.+.+.}, at: [<c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
  #4:  (&dev->mutex){......}, at: [<c03cd284>] device_online+0x14/0x88
  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<c003af90>] cpu_up+0x50/0x1a0
  #6:  (cpu_hotplug.lock){++++++}, at: [<c003ae48>] cpu_hotplug_begin+0x0/0xc4
  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<c003aec0>] cpu_hotplug_begin+0x78/0xc4
  #8:  (boot_lock){+.+...}, at: [<c002b254>] omap4_boot_secondary+0x1c/0x178
 Preemption disabled at:[<  (null)>]   (null)

 CPU: 0 PID: 118 Comm: sh Not tainted 4.1.12-rt11-01998-gb4a62c3-dirty #137
 Hardware name: Generic DRA74X (Flattened Device Tree)
 [<c0017574>] (unwind_backtrace) from [<c0013be8>] (show_stack+0x10/0x14)
 [<c0013be8>] (show_stack) from [<c05a8670>] (dump_stack+0x80/0x94)
 [<c05a8670>] (dump_stack) from [<c05ad158>] (rt_spin_lock+0x24/0x54)
 [<c05ad158>] (rt_spin_lock) from [<c0030dac>] (clkdm_wakeup+0x10/0x2c)
 [<c0030dac>] (clkdm_wakeup) from [<c002b2c0>] (omap4_boot_secondary+0x88/0x178)
 [<c002b2c0>] (omap4_boot_secondary) from [<c0015d00>] (__cpu_up+0xc4/0x164)
 [<c0015d00>] (__cpu_up) from [<c003b09c>] (cpu_up+0x15c/0x1a0)
 [<c003b09c>] (cpu_up) from [<c03cd2d4>] (device_online+0x64/0x88)
 [<c03cd2d4>] (device_online) from [<c03cd360>] (online_store+0x68/0x74)
 [<c03cd360>] (online_store) from [<c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
 [<c01b4ce0>] (kernfs_fop_write) from [<c0144124>] (__vfs_write+0x20/0xd8)
 [<c0144124>] (__vfs_write) from [<c01449c0>] (vfs_write+0x90/0x164)
 [<c01449c0>] (vfs_write) from [<c01451e4>] (SyS_write+0x44/0x9c)
 [<c01451e4>] (SyS_write) from [<c0010240>] (ret_fast_syscall+0x0/0x54)
 CPU1: smp_ops.cpu_die() returned, trying to resuscitate

Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Dec 16, 2015
Pavel Machek <pavel@ucw.cz> writes:
> Hi!
>
>> > or similar?
>> >
>> > The above is entirely untested. Maybe it doesn't compile. Or
>> > boot. Or work.
>>
>> Well, with two extra spaces at each line, it does not apply :-).
>>
>> I applied it by hand, and the output is:
>>
>> [    0.000000] MTRR variable ranges enabled:
>> ...
>> [    0.000000] BRK [0x0566c000, 0x0566cfff] PGTABLE
>>
>> I'll take a look if I can figure out what it means...
>
> Wait, there's more in the log.
>
> [    1.952146] Bluetooth: HCI UART protocol H4 registered
> [    1.954335] Bluetooth: HCI UART protocol BCSP registered
> [    1.956750] usbcore: registered new interface driver btusb
> [    1.958953] ------------[ cut here ]------------
> [    1.961149] WARNING: CPU: 1 PID: 1 at
> ./arch/x86/include/asm/pgtable.h:357
> vmap_page_range_noflush+0x1f0/0x280()
> [    1.963511] Modules linked in:
> [    1.965849] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W
> 4.4.0-rc5+ #137
> [    1.968230] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW
> (2.19 ) 03/31/2011
> [    1.970593]  00000001 00000000 f5cffe64 c42baaf8 00000000 f5cffe80
> c404066b 00000165
> [    1.973103]  c40fbe70 00000163 00000000 00000000 f5cffe90 c404070f
> 00000009 00000000
> [    1.975670]  f5cffee0 c40fbe70 c4f88348 00000000 ffe6dfff ffe6e000
> c4f8a018 ffe6dfff
> [    1.978304] Call Trace:
> [    1.980882]  [<c42baaf8>] dump_stack+0x41/0x59
> [    1.983464]  [<c404066b>] warn_slowpath_common+0x6b/0xa0
> [    1.986053]  [<c40fbe70>] ? vmap_page_range_noflush+0x1f0/0x280
> [    1.988625]  [<c404070f>] warn_slowpath_null+0xf/0x20
> [    1.991154]  [<c40fbe70>] vmap_page_range_noflush+0x1f0/0x280
> [    1.993676]  [<c40fbf2b>] map_vm_area+0x2b/0x40
> [    1.996153]  [<c4f2c795>] init+0xf8/0x1a4
> [    1.998591]  [<c4f2c69d>] ? edac_init+0x67/0x67
> [    2.001014]  [<c4000442>] do_one_initcall+0xc2/0x1c0
> [    2.003391]  [<c4f044e3>] ? initcall_blacklist+0x97/0x97
> [    2.005815]  [<c4f044e3>] ? initcall_blacklist+0x97/0x97
> [    2.008161]  [<c4051546>] ?
> __usermodehelper_set_disable_depth+0x36/0x40
> [    2.010518]  [<c407d4a6>] ? up_write+0x16/0x40
> [    2.012817]  [<c4f04ba3>] kernel_init_freeable+0xf0/0x16d
> [    2.015078]  [<c4f04ba3>] ? kernel_init_freeable+0xf0/0x16d
> [    2.017386]  [<c4a4d9c8>] kernel_init+0x8/0xc0
> [    2.019661]  [<c4a54149>] ret_from_kernel_thread+0x21/0x38
> [    2.021932]  [<c4a4d9c0>] ? rest_init+0xa0/0xa0
> [    2.024168] ---[ end trace e117245cd61feaf2 ]---
> [    2.026383] lguest: mapped switcher at ffe69000
> [    2.028958] sdhci: Secure Digital Host Controller Interface driver
>
> ...which I don't understand; did not we say warn on _once_?
> ... Um. But I think we have a winner: "lguest: mapped switcher at
> ffe69000".
>
> Rusty, does the switcher need to be W+X?
>
> And yes, I have lguest enabled, not sure why.

No.  The layout is "<text page> <per-cpu-stack-pages>..." and I lazily
did that as a single
        map_vm_area(switcher_vma, PAGE_KERNEL_EXEC, lg_switcher_pages);

This boots, does it solve the problem?

Thanks!
Rusty.

From: Rusty Russell <rusty@rustcorp.com.au>
Subject: lguest: map switcher text R/O.

Pavel noted that lguest maps the switcher code executable and
read-write.  This is a bad idea for any kernel text, but particularly
for text mapped at a fixed address.

Create two vmas, one for the text (PAGE_KERNEL_RX) and another for the
stacks (PAGE_KERNEL).  Use VM_NO_GUARD to map them adjacent (as
expected by the rest of the code).

Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
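
The layout change described above (one read-only executable text page, followed directly by read-write stack pages, in two adjacent areas) can be modelled with a small toy in C. This is a sketch under stated assumptions only: it does not call any kernel mapping API, and the `demo_` types, the protection bits, and the page indices are hypothetical stand-ins for the real vmas and `PAGE_KERNEL_*` values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy protection bits for this sketch (not the kernel's pgprot values). */
enum { DEMO_R = 1, DEMO_W = 2, DEMO_X = 4 };

struct demo_area {
	size_t first_page;	/* index of the first page in the area */
	size_t npages;		/* number of pages in the area */
	unsigned prot;		/* combination of DEMO_* bits */
};

/* Lay out one text page followed by 'nstack' stack pages as two areas. */
static void demo_layout(size_t nstack, struct demo_area out[2])
{
	out[0] = (struct demo_area){ .first_page = 0, .npages = 1,
				     .prot = DEMO_R | DEMO_X };
	out[1] = (struct demo_area){ .first_page = 1, .npages = nstack,
				     .prot = DEMO_R | DEMO_W };
}

/* True if any area is both writable and executable. */
static bool demo_has_wx(const struct demo_area *a, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if ((a[i].prot & DEMO_W) && (a[i].prot & DEMO_X))
			return true;
	return false;
}

/* True if every area starts exactly where the previous one ends. */
static bool demo_adjacent(const struct demo_area *a, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (a[i].first_page != a[i - 1].first_page + a[i - 1].npages)
			return false;
	return true;
}
```

In this model the single lazy mapping (all pages readable, writable and executable) fails the `demo_has_wx()` check, while the split layout passes it and still keeps the pages contiguous, which is the property the rest of the code is said to expect.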
rogerq pushed a commit to rogerq/linux that referenced this pull request Mar 8, 2016
commit 918af9f  upstream

OMAP CPU hotplug uses CPU1's clocks and power domains to wake CPU1 up
from low-power states (or to turn CPU1 on). This code also runs as
part of system suspend (disable_nonboot_cpus()).
On the other hand, CPU1's clocks and power domains are also used by CPUIdle.
All of these code paths are mutually exclusive and, therefore, the lockless
clkdm/pwrdm API can be used in omap4_boot_secondary().

This fixes the backtrace below on -RT, which is triggered by
pwrdm_lock/unlock():

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
 in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
 9 locks held by sh/118:
  #0:  (sb_writers#4){.+.+.+}, at: [<c0144a6c>] vfs_write+0x13c/0x164
  #1:  (&of->mutex){+.+.+.}, at: [<c01b4c70>] kernfs_fop_write+0x48/0x19c
  #2:  (s_active#24){.+.+.+}, at: [<c01b4c78>] kernfs_fop_write+0x50/0x19c
  #3:  (device_hotplug_lock){+.+.+.}, at: [<c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
  #4:  (&dev->mutex){......}, at: [<c03cd284>] device_online+0x14/0x88
  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<c003af90>] cpu_up+0x50/0x1a0
  #6:  (cpu_hotplug.lock){++++++}, at: [<c003ae48>] cpu_hotplug_begin+0x0/0xc4
  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<c003aec0>] cpu_hotplug_begin+0x78/0xc4
  #8:  (boot_lock){+.+...}, at: [<c002b254>] omap4_boot_secondary+0x1c/0x178
 Preemption disabled at:[<  (null)>]   (null)

 CPU: 0 PID: 118 Comm: sh Not tainted 4.1.12-rt11-01998-gb4a62c3-dirty #137
 Hardware name: Generic DRA74X (Flattened Device Tree)
 [<c0017574>] (unwind_backtrace) from [<c0013be8>] (show_stack+0x10/0x14)
 [<c0013be8>] (show_stack) from [<c05a8670>] (dump_stack+0x80/0x94)
 [<c05a8670>] (dump_stack) from [<c05ad158>] (rt_spin_lock+0x24/0x54)
 [<c05ad158>] (rt_spin_lock) from [<c0030dac>] (clkdm_wakeup+0x10/0x2c)
 [<c0030dac>] (clkdm_wakeup) from [<c002b2c0>] (omap4_boot_secondary+0x88/0x178)
 [<c002b2c0>] (omap4_boot_secondary) from [<c0015d00>] (__cpu_up+0xc4/0x164)
 [<c0015d00>] (__cpu_up) from [<c003b09c>] (cpu_up+0x15c/0x1a0)
 [<c003b09c>] (cpu_up) from [<c03cd2d4>] (device_online+0x64/0x88)
 [<c03cd2d4>] (device_online) from [<c03cd360>] (online_store+0x68/0x74)
 [<c03cd360>] (online_store) from [<c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
 [<c01b4ce0>] (kernfs_fop_write) from [<c0144124>] (__vfs_write+0x20/0xd8)
 [<c0144124>] (__vfs_write) from [<c01449c0>] (vfs_write+0x90/0x164)
 [<c01449c0>] (vfs_write) from [<c01451e4>] (SyS_write+0x44/0x9c)
 [<c01451e4>] (SyS_write) from [<c0010240>] (ret_fast_syscall+0x0/0x54)
 CPU1: smp_ops.cpu_die() returned, trying to resuscitate

Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request May 27, 2016
WARNING: line over 80 characters
#128: FILE: mm/cma.c:186:
+	alignment = PAGE_SIZE << max_t(unsigned long, MAX_ORDER - 1, pageblock_order);

WARNING: line over 80 characters
#137: FILE: mm/cma.c:270:
+		(phys_addr_t)PAGE_SIZE << max_t(unsigned long, MAX_ORDER - 1, pageblock_order));

total: 0 errors, 2 warnings, 16 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/mm-cma-silence-warnings-due-to-max-usage.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 4, 2017
In tipc_topsrv_kern_subscr(), when s->tipc_conn_new() fails
we call tipc_close_conn() to clean up, but in this case
con->usr_data is NULL, so tipc_subscrb_delete() should be skipped.

This fixes the following crash:

 kasan: GPF could be caused by NULL-ptr deref or user memory access
 general protection fault: 0000 [#1] SMP KASAN
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 task: 00000000c24413a5 task.stack: 000000005e8160b5
 RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
 RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
 RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
 FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
  __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
  _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
  spin_lock_bh include/linux/spinlock.h:320 [inline]
  tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
  tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
  tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
  tipc_close_conn+0x171/0x270 net/tipc/server.c:204
  tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
  tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
  tipc_sk_join net/tipc/socket.c:2747 [inline]
  tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
  SYSC_setsockopt net/socket.c:1851 [inline]
  SyS_setsockopt+0x189/0x360 net/socket.c:1830
  entry_SYSCALL_64_fastpath+0x1f/0x96

Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
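
The guard described in the commit message above (skip the release step when the per-connection user data was never set) can be sketched in user-space C as follows. This is a minimal illustration, not the TIPC code: the `demo_conn` structure, its fields, and both functions are hypothetical, and the `released` counter exists only so the behaviour can be observed.

```c
#include <stddef.h>

struct demo_conn {
	void *usr_data;	/* set only once the connection is fully set up */
	int refs;	/* reference count held on the connection */
	int released;	/* how many times the release callback ran (demo only) */
};

/* Stand-in for a release callback that would dereference usr_data. */
static void demo_release_cb(struct demo_conn *c)
{
	c->released++;
}

/*
 * Close a connection. If setup failed early, usr_data is still NULL and
 * the release callback must not run; only the reference is dropped.
 */
static void demo_close_conn(struct demo_conn *c)
{
	if (c->usr_data)
		demo_release_cb(c);
	c->refs--;
}
```

Closing a half-initialised connection then only drops the reference, while a fully initialised one also runs the release callback exactly once.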
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 7, 2017
In tipc_topsrv_kern_subscr(), when s->tipc_conn_new() fails
we call tipc_close_conn() to clean up, but in this case
calling conn_put() is enough.

This fixes the following crash:

 kasan: GPF could be caused by NULL-ptr deref or user memory access
 general protection fault: 0000 [#1] SMP KASAN
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 task: 00000000c24413a5 task.stack: 000000005e8160b5
 RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
 RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
 RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
 FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
  __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
  _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
  spin_lock_bh include/linux/spinlock.h:320 [inline]
  tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
  tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
  tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
  tipc_close_conn+0x171/0x270 net/tipc/server.c:204
  tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
  tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
  tipc_sk_join net/tipc/socket.c:2747 [inline]
  tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
  SYSC_setsockopt net/socket.c:1851 [inline]
  SyS_setsockopt+0x189/0x360 net/socket.c:1830
  entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 14c0449 ("tipc: add ability to order and receive topology events in driver")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
iaguis pushed a commit to kinvolk/linux that referenced this pull request Feb 6, 2018
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 1, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
warning. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
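
Why a counter that starts at 0 trips the "increment on 0" check can be shown with a small user-space model built on C11 atomics. This is a sketch under stated assumptions, not the kernel's `refcount_t` implementation: the `demo_` names are hypothetical, and the increment simply reports failure where the kernel would print the warning shown in the log above.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_refcount {
	atomic_uint refs;
};

static void demo_refcount_set(struct demo_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

/*
 * Take a reference. A count of 0 means the object is already considered
 * freed, so the increment is refused and false is returned.
 */
static bool demo_refcount_inc(struct demo_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
	return true;
}

/* Drop a reference; returns true when the last reference is gone. */
static bool demo_refcount_dec_and_test(struct demo_refcount *r)
{
	return atomic_fetch_sub(&r->refs, 1) == 1;
}
```

Initialising the counter to 1 for the creator's own reference lets later increments succeed, and the object is released only when the final decrement returns true.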
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 1, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
warning. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Issue: 1318989
Change-Id: Idcf40a70fb5012429b5cb6bfdd893cd5f4b0e9b5
Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 5, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
warning. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 6, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
warning. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 8, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
error. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Mar 9, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
error. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Sep 2, 2018
After the ucontext is released, __mmu_notifier_release is called
again in the exit_mmap path. However, by that time the driver ucontext
(mlx5_ib_ucontext) has already been freed, which causes a
use-after-free error due to improper use of the mmu_notifier API.

Convert UMEM ODP to use mmu_notify unregister flow with delayed memory
resource freeing.

==================================================================
[  335.696162] BUG: KASAN: use-after-free in __mmu_notifier_release+0x13f/0x450
[  335.696818] Read of size 8 at addr ffff8801218b9bd0 by task a.out/387
[  335.697358]
[  335.697461] CPU: 2 PID: 387 Comm: a.out Not tainted 4.19.0-rc1+ #137
[  335.697844] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  335.698939] Call Trace:
[  335.699183]  dump_stack+0xf0/0x19b
[  335.700798]  print_address_description+0x73/0x280
[  335.702129]  kasan_report+0x258/0x380
[  335.702572]  __mmu_notifier_release+0x13f/0x450
[  335.708154]  exit_mmap+0x241/0x280
[  335.710134]  mmput+0x133/0x330
[  335.714691]  do_exit+0xf5e/0x1350
[  335.728976]  do_group_exit+0xe0/0x1c0
[  335.729911]  get_signal+0x447/0xde0
[  335.732720]  do_signal+0x96/0xb50
[  335.738739]  exit_to_usermode_loop+0x163/0x1b0
[  335.741891]  do_syscall_64+0x35c/0x370
[  335.744658]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.745638] RIP: 0033:0x7fa8e124adf9
[  335.745909] Code: Bad RIP value.
[  335.746187] RSP: 002b:00007fa8e1949e98 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  335.746736] RAX: 0000000000000038 RBX: 0000000000000000 RCX: 00007fa8e124adf9
[  335.747122] RDX: 0000000000000038 RSI: 00000000200000c0 RDI: 0000000000000003
[  335.747377] RBP: 00007fa8e1949ec0 R08: 0000000000000000 R09: 0000000000000000
[  335.749405] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffda1662cde
[  335.749700] R13: 00007ffda1662cdf R14: 00007ffda1662d70 R15: 00007ffda1662d70
[  335.749974]
[  335.750077] Allocated by task 387:
[  335.750221]  kasan_kmalloc+0xa0/0xd0
[  335.750374]  kmem_cache_alloc_trace+0x134/0x2c0
[  335.752228]  mlx5_ib_alloc_ucontext+0x501/0x1530 [mlx5_ib]
[  335.752402]  ib_uverbs_get_context+0x240/0x840 [ib_uverbs]
[  335.752565]  ib_uverbs_write+0x57c/0x930 [ib_uverbs]
[  335.752723]  __vfs_write+0xc4/0x3c0
[  335.753128]  vfs_write+0xff/0x250
[  335.753394]  ksys_write+0xb6/0x140
[  335.753804]  do_syscall_64+0x105/0x370
[  335.753924]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.754637]
[  335.754724] Freed by task 387:
[  335.754892]  __kasan_slab_free+0x12e/0x180
[  335.755155]  kfree+0x121/0x2e0
[  335.755432]  mlx5_ib_dealloc_ucontext+0x94/0xa0 [mlx5_ib]
[  335.755879]  uverbs_destroy_ufile_hw+0x22b/0x410 [ib_uverbs]
[  335.756788]  ib_uverbs_close+0xd9/0x260 [ib_uverbs]
[  335.756953]  __fput+0x210/0x3d0
[  335.757075]  task_work_run+0x13d/0x1a0
[  335.757484]  exit_to_usermode_loop+0x198/0x1b0
[  335.757647]  do_syscall_64+0x35c/0x370
[  335.757770]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.758650]
[  335.758747] The buggy address belongs to the object at ffff8801218b9ae8
[  335.758747]  which belongs to the cache kmalloc-1024 of size 1024
[  335.759437] The buggy address is located 232 bytes inside of
[  335.759437]  1024-byte region [ffff8801218b9ae8, ffff8801218b9ee8)
[  335.760087] The buggy address belongs to the page:
[  335.760398] page:ffffea0004862e00 count:1 mapcount:0 mapping:ffff880122c0ef00 index:0x0 compound_mapcount: 0
[  335.761552] flags: 0x8000000000008100(slab|head)
[  335.761713] raw: 8000000000008100 ffffea000487dc08 ffffea0004895408 ffff880122c0ef00
[  335.762657] raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
[  335.762891] page dumped because: kasan: bad access detected
[  335.763057]
[  335.763140] Memory state around the buggy address:
[  335.763742]  ffff8801218b9a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
[  335.764272]  ffff8801218b9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.764930] >ffff8801218b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765177]                                                  ^
[  335.765513]  ffff8801218b9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765755]  ffff8801218b9c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.767364]
==================================================================

Cc: <stable@vger.kernel.org> # 3.19
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Fixes: 882214e ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Sep 3, 2018
After the ucontext is released, __mmu_notifier_release is called
again in the exit_mmap path. However, by that time the driver ucontext
(mlx5_ib_ucontext) has already been freed, which causes a
use-after-free error due to improper use of the mmu_notifier API.

Convert UMEM ODP to use mmu_notify unregister flow with delayed memory
resource freeing.

==================================================================
[  335.696162] BUG: KASAN: use-after-free in __mmu_notifier_release+0x13f/0x450
[  335.696818] Read of size 8 at addr ffff8801218b9bd0 by task a.out/387
[  335.697358]
[  335.697461] CPU: 2 PID: 387 Comm: a.out Not tainted 4.19.0-rc1+ #137
[  335.697844] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  335.698939] Call Trace:
[  335.699183]  dump_stack+0xf0/0x19b
[  335.700798]  print_address_description+0x73/0x280
[  335.702129]  kasan_report+0x258/0x380
[  335.702572]  __mmu_notifier_release+0x13f/0x450
[  335.708154]  exit_mmap+0x241/0x280
[  335.710134]  mmput+0x133/0x330
[  335.714691]  do_exit+0xf5e/0x1350
[  335.728976]  do_group_exit+0xe0/0x1c0
[  335.729911]  get_signal+0x447/0xde0
[  335.732720]  do_signal+0x96/0xb50
[  335.738739]  exit_to_usermode_loop+0x163/0x1b0
[  335.741891]  do_syscall_64+0x35c/0x370
[  335.744658]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.745638] RIP: 0033:0x7fa8e124adf9
[  335.745909] Code: Bad RIP value.
[  335.746187] RSP: 002b:00007fa8e1949e98 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  335.746736] RAX: 0000000000000038 RBX: 0000000000000000 RCX: 00007fa8e124adf9
[  335.747122] RDX: 0000000000000038 RSI: 00000000200000c0 RDI: 0000000000000003
[  335.747377] RBP: 00007fa8e1949ec0 R08: 0000000000000000 R09: 0000000000000000
[  335.749405] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffda1662cde
[  335.749700] R13: 00007ffda1662cdf R14: 00007ffda1662d70 R15: 00007ffda1662d70
[  335.749974]
[  335.750077] Allocated by task 387:
[  335.750221]  kasan_kmalloc+0xa0/0xd0
[  335.750374]  kmem_cache_alloc_trace+0x134/0x2c0
[  335.752228]  mlx5_ib_alloc_ucontext+0x501/0x1530 [mlx5_ib]
[  335.752402]  ib_uverbs_get_context+0x240/0x840 [ib_uverbs]
[  335.752565]  ib_uverbs_write+0x57c/0x930 [ib_uverbs]
[  335.752723]  __vfs_write+0xc4/0x3c0
[  335.753128]  vfs_write+0xff/0x250
[  335.753394]  ksys_write+0xb6/0x140
[  335.753804]  do_syscall_64+0x105/0x370
[  335.753924]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.754637]
[  335.754724] Freed by task 387:
[  335.754892]  __kasan_slab_free+0x12e/0x180
[  335.755155]  kfree+0x121/0x2e0
[  335.755432]  mlx5_ib_dealloc_ucontext+0x94/0xa0 [mlx5_ib]
[  335.755879]  uverbs_destroy_ufile_hw+0x22b/0x410 [ib_uverbs]
[  335.756788]  ib_uverbs_close+0xd9/0x260 [ib_uverbs]
[  335.756953]  __fput+0x210/0x3d0
[  335.757075]  task_work_run+0x13d/0x1a0
[  335.757484]  exit_to_usermode_loop+0x198/0x1b0
[  335.757647]  do_syscall_64+0x35c/0x370
[  335.757770]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.758650]
[  335.758747] The buggy address belongs to the object at ffff8801218b9ae8
[  335.758747]  which belongs to the cache kmalloc-1024 of size 1024
[  335.759437] The buggy address is located 232 bytes inside of
[  335.759437]  1024-byte region [ffff8801218b9ae8, ffff8801218b9ee8)
[  335.760087] The buggy address belongs to the page:
[  335.760398] page:ffffea0004862e00 count:1 mapcount:0 mapping:ffff880122c0ef00 index:0x0 compound_mapcount: 0
[  335.761552] flags: 0x8000000000008100(slab|head)
[  335.761713] raw: 8000000000008100 ffffea000487dc08 ffffea0004895408 ffff880122c0ef00
[  335.762657] raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
[  335.762891] page dumped because: kasan: bad access detected
[  335.763057]
[  335.763140] Memory state around the buggy address:
[  335.763742]  ffff8801218b9a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
[  335.764272]  ffff8801218b9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.764930] >ffff8801218b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765177]                                                  ^
[  335.765513]  ffff8801218b9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765755]  ffff8801218b9c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.767364]
==================================================================

Cc: <stable@vger.kernel.org> # 3.19
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Fixes: 882214e ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Sep 6, 2018
After the ucontext is released, __mmu_notifier_release is called
again in the exit_mmap path. However, by that time the driver ucontext
(mlx5_ib_ucontext) has already been freed, which causes a
use-after-free error due to improper use of the mmu_notifier API.

Convert UMEM ODP to use mmu_notify unregister flow with delayed memory
resource freeing.

==================================================================
[  335.696162] BUG: KASAN: use-after-free in __mmu_notifier_release+0x13f/0x450
[  335.696818] Read of size 8 at addr ffff8801218b9bd0 by task a.out/387
[  335.697358]
[  335.697461] CPU: 2 PID: 387 Comm: a.out Not tainted 4.19.0-rc1+ #137
[  335.697844] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  335.698939] Call Trace:
[  335.699183]  dump_stack+0xf0/0x19b
[  335.700798]  print_address_description+0x73/0x280
[  335.702129]  kasan_report+0x258/0x380
[  335.702572]  __mmu_notifier_release+0x13f/0x450
[  335.708154]  exit_mmap+0x241/0x280
[  335.710134]  mmput+0x133/0x330
[  335.714691]  do_exit+0xf5e/0x1350
[  335.728976]  do_group_exit+0xe0/0x1c0
[  335.729911]  get_signal+0x447/0xde0
[  335.732720]  do_signal+0x96/0xb50
[  335.738739]  exit_to_usermode_loop+0x163/0x1b0
[  335.741891]  do_syscall_64+0x35c/0x370
[  335.744658]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.745638] RIP: 0033:0x7fa8e124adf9
[  335.745909] Code: Bad RIP value.
[  335.746187] RSP: 002b:00007fa8e1949e98 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  335.746736] RAX: 0000000000000038 RBX: 0000000000000000 RCX: 00007fa8e124adf9
[  335.747122] RDX: 0000000000000038 RSI: 00000000200000c0 RDI: 0000000000000003
[  335.747377] RBP: 00007fa8e1949ec0 R08: 0000000000000000 R09: 0000000000000000
[  335.749405] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffda1662cde
[  335.749700] R13: 00007ffda1662cdf R14: 00007ffda1662d70 R15: 00007ffda1662d70
[  335.749974]
[  335.750077] Allocated by task 387:
[  335.750221]  kasan_kmalloc+0xa0/0xd0
[  335.750374]  kmem_cache_alloc_trace+0x134/0x2c0
[  335.752228]  mlx5_ib_alloc_ucontext+0x501/0x1530 [mlx5_ib]
[  335.752402]  ib_uverbs_get_context+0x240/0x840 [ib_uverbs]
[  335.752565]  ib_uverbs_write+0x57c/0x930 [ib_uverbs]
[  335.752723]  __vfs_write+0xc4/0x3c0
[  335.753128]  vfs_write+0xff/0x250
[  335.753394]  ksys_write+0xb6/0x140
[  335.753804]  do_syscall_64+0x105/0x370
[  335.753924]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.754637]
[  335.754724] Freed by task 387:
[  335.754892]  __kasan_slab_free+0x12e/0x180
[  335.755155]  kfree+0x121/0x2e0
[  335.755432]  mlx5_ib_dealloc_ucontext+0x94/0xa0 [mlx5_ib]
[  335.755879]  uverbs_destroy_ufile_hw+0x22b/0x410 [ib_uverbs]
[  335.756788]  ib_uverbs_close+0xd9/0x260 [ib_uverbs]
[  335.756953]  __fput+0x210/0x3d0
[  335.757075]  task_work_run+0x13d/0x1a0
[  335.757484]  exit_to_usermode_loop+0x198/0x1b0
[  335.757647]  do_syscall_64+0x35c/0x370
[  335.757770]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.758650]
[  335.758747] The buggy address belongs to the object at ffff8801218b9ae8
[  335.758747]  which belongs to the cache kmalloc-1024 of size 1024
[  335.759437] The buggy address is located 232 bytes inside of
[  335.759437]  1024-byte region [ffff8801218b9ae8, ffff8801218b9ee8)
[  335.760087] The buggy address belongs to the page:
[  335.760398] page:ffffea0004862e00 count:1 mapcount:0 mapping:ffff880122c0ef00 index:0x0 compound_mapcount: 0
[  335.761552] flags: 0x8000000000008100(slab|head)
[  335.761713] raw: 8000000000008100 ffffea000487dc08 ffffea0004895408 ffff880122c0ef00
[  335.762657] raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
[  335.762891] page dumped because: kasan: bad access detected
[  335.763057]
[  335.763140] Memory state around the buggy address:
[  335.763742]  ffff8801218b9a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
[  335.764272]  ffff8801218b9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.764930] >ffff8801218b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765177]                                                  ^
[  335.765513]  ffff8801218b9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765755]  ffff8801218b9c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.767364]
==================================================================

Cc: <stable@vger.kernel.org> # 3.19
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Fixes: 882214e ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Sep 7, 2018
After the ucontext is released, __mmu_notifier_release is called
again in the exit_mmap path. However, by that time the driver ucontext
(mlx5_ib_ucontext) has already been freed, which causes a
use-after-free error due to improper use of the mmu_notifier API.

Convert UMEM ODP to use mmu_notify unregister flow with delayed memory
resource freeing.

==================================================================
[  335.696162] BUG: KASAN: use-after-free in __mmu_notifier_release+0x13f/0x450
[  335.696818] Read of size 8 at addr ffff8801218b9bd0 by task a.out/387
[  335.697358]
[  335.697461] CPU: 2 PID: 387 Comm: a.out Not tainted 4.19.0-rc1+ #137
[  335.697844] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  335.698939] Call Trace:
[  335.699183]  dump_stack+0xf0/0x19b
[  335.700798]  print_address_description+0x73/0x280
[  335.702129]  kasan_report+0x258/0x380
[  335.702572]  __mmu_notifier_release+0x13f/0x450
[  335.708154]  exit_mmap+0x241/0x280
[  335.710134]  mmput+0x133/0x330
[  335.714691]  do_exit+0xf5e/0x1350
[  335.728976]  do_group_exit+0xe0/0x1c0
[  335.729911]  get_signal+0x447/0xde0
[  335.732720]  do_signal+0x96/0xb50
[  335.738739]  exit_to_usermode_loop+0x163/0x1b0
[  335.741891]  do_syscall_64+0x35c/0x370
[  335.744658]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.745638] RIP: 0033:0x7fa8e124adf9
[  335.745909] Code: Bad RIP value.
[  335.746187] RSP: 002b:00007fa8e1949e98 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  335.746736] RAX: 0000000000000038 RBX: 0000000000000000 RCX: 00007fa8e124adf9
[  335.747122] RDX: 0000000000000038 RSI: 00000000200000c0 RDI: 0000000000000003
[  335.747377] RBP: 00007fa8e1949ec0 R08: 0000000000000000 R09: 0000000000000000
[  335.749405] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffda1662cde
[  335.749700] R13: 00007ffda1662cdf R14: 00007ffda1662d70 R15: 00007ffda1662d70
[  335.749974]
[  335.750077] Allocated by task 387:
[  335.750221]  kasan_kmalloc+0xa0/0xd0
[  335.750374]  kmem_cache_alloc_trace+0x134/0x2c0
[  335.752228]  mlx5_ib_alloc_ucontext+0x501/0x1530 [mlx5_ib]
[  335.752402]  ib_uverbs_get_context+0x240/0x840 [ib_uverbs]
[  335.752565]  ib_uverbs_write+0x57c/0x930 [ib_uverbs]
[  335.752723]  __vfs_write+0xc4/0x3c0
[  335.753128]  vfs_write+0xff/0x250
[  335.753394]  ksys_write+0xb6/0x140
[  335.753804]  do_syscall_64+0x105/0x370
[  335.753924]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.754637]
[  335.754724] Freed by task 387:
[  335.754892]  __kasan_slab_free+0x12e/0x180
[  335.755155]  kfree+0x121/0x2e0
[  335.755432]  mlx5_ib_dealloc_ucontext+0x94/0xa0 [mlx5_ib]
[  335.755879]  uverbs_destroy_ufile_hw+0x22b/0x410 [ib_uverbs]
[  335.756788]  ib_uverbs_close+0xd9/0x260 [ib_uverbs]
[  335.756953]  __fput+0x210/0x3d0
[  335.757075]  task_work_run+0x13d/0x1a0
[  335.757484]  exit_to_usermode_loop+0x198/0x1b0
[  335.757647]  do_syscall_64+0x35c/0x370
[  335.757770]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.758650]
[  335.758747] The buggy address belongs to the object at ffff8801218b9ae8
[  335.758747]  which belongs to the cache kmalloc-1024 of size 1024
[  335.759437] The buggy address is located 232 bytes inside of
[  335.759437]  1024-byte region [ffff8801218b9ae8, ffff8801218b9ee8)
[  335.760087] The buggy address belongs to the page:
[  335.760398] page:ffffea0004862e00 count:1 mapcount:0 mapping:ffff880122c0ef00 index:0x0 compound_mapcount: 0
[  335.761552] flags: 0x8000000000008100(slab|head)
[  335.761713] raw: 8000000000008100 ffffea000487dc08 ffffea0004895408 ffff880122c0ef00
[  335.762657] raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
[  335.762891] page dumped because: kasan: bad access detected
[  335.763057]
[  335.763140] Memory state around the buggy address:
[  335.763742]  ffff8801218b9a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
[  335.764272]  ffff8801218b9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.764930] >ffff8801218b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765177]                                                  ^
[  335.765513]  ffff8801218b9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765755]  ffff8801218b9c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.767364]
==================================================================

Cc: <stable@vger.kernel.org> # 3.19
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Fixes: 882214e ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
alaahl pushed a commit to alaahl/linux that referenced this pull request Sep 12, 2018
After the ucontext is released, __mmu_notifier_release is called
again in the exit_mmap path. However, by that time the driver ucontext
(mlx5_ib_ucontext) has already been freed, which causes a
use-after-free error due to improper use of the mmu_notifier API.

Convert UMEM ODP to use mmu_notify unregister flow with delayed memory
resource freeing.

==================================================================
[  335.696162] BUG: KASAN: use-after-free in __mmu_notifier_release+0x13f/0x450
[  335.696818] Read of size 8 at addr ffff8801218b9bd0 by task a.out/387
[  335.697358]
[  335.697461] CPU: 2 PID: 387 Comm: a.out Not tainted 4.19.0-rc1+ #137
[  335.697844] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  335.698939] Call Trace:
[  335.699183]  dump_stack+0xf0/0x19b
[  335.700798]  print_address_description+0x73/0x280
[  335.702129]  kasan_report+0x258/0x380
[  335.702572]  __mmu_notifier_release+0x13f/0x450
[  335.708154]  exit_mmap+0x241/0x280
[  335.710134]  mmput+0x133/0x330
[  335.714691]  do_exit+0xf5e/0x1350
[  335.728976]  do_group_exit+0xe0/0x1c0
[  335.729911]  get_signal+0x447/0xde0
[  335.732720]  do_signal+0x96/0xb50
[  335.738739]  exit_to_usermode_loop+0x163/0x1b0
[  335.741891]  do_syscall_64+0x35c/0x370
[  335.744658]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.745638] RIP: 0033:0x7fa8e124adf9
[  335.745909] Code: Bad RIP value.
[  335.746187] RSP: 002b:00007fa8e1949e98 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  335.746736] RAX: 0000000000000038 RBX: 0000000000000000 RCX: 00007fa8e124adf9
[  335.747122] RDX: 0000000000000038 RSI: 00000000200000c0 RDI: 0000000000000003
[  335.747377] RBP: 00007fa8e1949ec0 R08: 0000000000000000 R09: 0000000000000000
[  335.749405] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffda1662cde
[  335.749700] R13: 00007ffda1662cdf R14: 00007ffda1662d70 R15: 00007ffda1662d70
[  335.749974]
[  335.750077] Allocated by task 387:
[  335.750221]  kasan_kmalloc+0xa0/0xd0
[  335.750374]  kmem_cache_alloc_trace+0x134/0x2c0
[  335.752228]  mlx5_ib_alloc_ucontext+0x501/0x1530 [mlx5_ib]
[  335.752402]  ib_uverbs_get_context+0x240/0x840 [ib_uverbs]
[  335.752565]  ib_uverbs_write+0x57c/0x930 [ib_uverbs]
[  335.752723]  __vfs_write+0xc4/0x3c0
[  335.753128]  vfs_write+0xff/0x250
[  335.753394]  ksys_write+0xb6/0x140
[  335.753804]  do_syscall_64+0x105/0x370
[  335.753924]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.754637]
[  335.754724] Freed by task 387:
[  335.754892]  __kasan_slab_free+0x12e/0x180
[  335.755155]  kfree+0x121/0x2e0
[  335.755432]  mlx5_ib_dealloc_ucontext+0x94/0xa0 [mlx5_ib]
[  335.755879]  uverbs_destroy_ufile_hw+0x22b/0x410 [ib_uverbs]
[  335.756788]  ib_uverbs_close+0xd9/0x260 [ib_uverbs]
[  335.756953]  __fput+0x210/0x3d0
[  335.757075]  task_work_run+0x13d/0x1a0
[  335.757484]  exit_to_usermode_loop+0x198/0x1b0
[  335.757647]  do_syscall_64+0x35c/0x370
[  335.757770]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  335.758650]
[  335.758747] The buggy address belongs to the object at ffff8801218b9ae8
[  335.758747]  which belongs to the cache kmalloc-1024 of size 1024
[  335.759437] The buggy address is located 232 bytes inside of
[  335.759437]  1024-byte region [ffff8801218b9ae8, ffff8801218b9ee8)
[  335.760087] The buggy address belongs to the page:
[  335.760398] page:ffffea0004862e00 count:1 mapcount:0 mapping:ffff880122c0ef00 index:0x0 compound_mapcount: 0
[  335.761552] flags: 0x8000000000008100(slab|head)
[  335.761713] raw: 8000000000008100 ffffea000487dc08 ffffea0004895408 ffff880122c0ef00
[  335.762657] raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
[  335.762891] page dumped because: kasan: bad access detected
[  335.763057]
[  335.763140] Memory state around the buggy address:
[  335.763742]  ffff8801218b9a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
[  335.764272]  ffff8801218b9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.764930] >ffff8801218b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765177]                                                  ^
[  335.765513]  ffff8801218b9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.765755]  ffff8801218b9c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  335.767364]
==================================================================

Cc: <stable@vger.kernel.org> # 3.19
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Fixes: 882214e ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
roidayan pushed a commit to roidayan/linux that referenced this pull request Nov 19, 2018
A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
error. The reason is that the initial value of a refcount_t is supposed
to be greater than 0, so change it.

[    3.106634] ------------[ cut here ]------------
[    3.107756] refcount_t: increment on 0; use-after-free.
[    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
[    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
[    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    3.110085] RIP: 0010:refcount_inc+0x27/0x30
[    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
[    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
[    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
[    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
[    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
[    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
[    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
[    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
[    3.110085] Call Trace:
[    3.110085]  mlx5_core_create_cq+0xde/0x250
[    3.110085]  ? __kmalloc+0x1ce/0x1e0
[    3.110085]  mlx5e_create_cq+0x15c/0x1e0
[    3.110085]  mlx5e_open_drop_rq+0xea/0x190
[    3.110085]  mlx5e_attach_netdev+0x53/0x140
[    3.110085]  mlx5e_attach+0x3d/0x60
[    3.110085]  mlx5e_add+0x11d/0x2f0
[    3.110085]  mlx5_add_device+0x77/0x170
[    3.110085]  mlx5_register_interface+0x74/0xc0
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  init+0x67/0x72
[    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
[    3.110085]  do_one_initcall+0x98/0x147
[    3.110085]  ? set_debug_rodata+0x11/0x11
[    3.110085]  kernel_init_freeable+0x164/0x1e0
[    3.110085]  ? rest_init+0xb0/0xb0
[    3.110085]  kernel_init+0xa/0x100
[    3.110085]  ret_from_fork+0x35/0x40
[    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
[    3.110085] ---[ end trace a0068e1c68438a74 ]---
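
As a rough illustration of why a zero initial value triggers the warning,
here is a minimal userspace sketch. The model_refcount type and its helpers
are hypothetical stand-ins, not the kernel's refcount_t implementation; they
only mimic the "increment on 0" check named in the trace.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a reference counter. */
struct model_refcount {
	unsigned int refs;
};

static void model_refcount_set(struct model_refcount *r, unsigned int n)
{
	r->refs = n;
}

/*
 * Take a reference. A count of 0 is treated as "object may already be
 * freed", so the increment is refused and a warning is printed.
 */
static bool model_refcount_inc(struct model_refcount *r)
{
	if (r->refs == 0) {
		fprintf(stderr, "model refcount: increment on 0\n");
		return false;
	}
	r->refs++;
	return true;
}

/* Drop a reference; returns true when the last reference is gone. */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	return --r->refs == 0;
}
```

In this model, a counter set to 0 refuses the first increment, while a
counter set to 1 (one reference held by the creator) accepts it.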

Fixes: f105b45 ("net/mlx5: CQ hold/put API")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
plbossart added a commit to plbossart/sound that referenced this pull request Sep 4, 2019
We initialize work queues that are never released, and we sometimes see
the dmesg trace below. Fix this by cancelling the work queues.

[  270.446564] BUG: unable to handle page fault for address: ffffffffc075e0b0
[  270.446573] #PF: supervisor instruction fetch in kernel mode
[  270.446575] #PF: error_code(0x0010) - not-present page
[  270.446577] PGD 1e440c067 P4D 1e440c067 PUD 1e440e067 PMD 27c51d067 PTE 0
[  270.446582] Oops: 0010 [#2] SMP PTI
[  270.446586] CPU: 7 PID: 118 Comm: kworker/7:1 Tainted: G      D           5.3.0-rc7-test+ torvalds#137
[  270.446588] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3023.A01.1901100101 01/10/2019
[  270.446602] Workqueue: events_power_efficient 0xffffffffc075e0b0
[  270.446605] RIP: 0010:0xffffffffc075e0b0
[  270.446614] Code: Bad RIP value.
[  270.446616] RSP: 0018:ffffa452c033fe98 EFLAGS: 00010286
[  270.446618] RAX: ffffffffc075e0b0 RBX: ffffa13f65278870 RCX: ffffa13f67fe8260
[  270.446619] RDX: 0000000000000001 RSI: ffffa13f66c150b0 RDI: ffffa13f65278870
[  270.446620] RBP: ffffa13f67fe8240 R08: 0000000000000010 R09: 0000746e65696369
[  270.446622] R10: 8080808080808080 R11: 0000000000000018 R12: ffffa13f67fec500
[  270.446623] R13: 0000000000000000 R14: ffffa13f66161000 R15: 0ffffa13f67fec50
[  270.446625] FS:  0000000000000000(0000) GS:ffffa13f67fc0000(0000) knlGS:0000000000000000
[  270.446626] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  270.446628] CR2: ffffffffc075e086 CR3: 00000002a1c52001 CR4: 0000000000760ee0
[  270.446629] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  270.446630] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  270.446632] PKRU: 55555554
[  270.446634] Call Trace:
[  270.446645]  ? process_one_work+0x1f5/0x3b0
[  270.446648]  ? worker_thread+0x28/0x3c0
[  270.446650]  ? process_one_work+0x3b0/0x3b0
[  270.446652]  ? kthread+0x10d/0x130
[  270.446655]  ? kthread_create_on_node+0x60/0x60
[  270.446660]  ? ret_from_fork+0x35/0x40
[  270.446661] Modules linked in: soundwire_intel soundwire_cadence snd_soc_dmic soundwire_intel_init snd_soc_acpi regmap_sdw soundwire_bus snd_soc_max98357a snd_soc_wm8804_i2c snd_soc_wm8804 snd_soc_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore x86_pkg_temp_thermal intel_powerclamp mei_me mei iwlmvm asix usbnet iwlwifi i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm efivarfs sdhci_pci intel_lpss_pci cqhci xhci_pci intel_lpss mfd_core sdhci xhci_hcd [last unloaded: snd_soc_rt700]
[  270.446677] CR2: ffffffffc075e0b0
[  270.446685] ---[ end trace 6ce5dffcdd83023f ]---
[  270.446687] RIP: 0010:0xffffffffc062f0b0
[  270.446693] Code: ff ff 00 00 00 00 00 00 00 00 01 00 00 00 01 00 0c 00 f0 ff 2e c0 52 a4 ff ff 09 00 00 00 00 00 00 00 12 00 00 00 01 00 0c 00 <f9> ff 2e c0 52 a4 ff ff 0c 00 00 00 00 00 00 00 2a 00 00 00 01 00
[  270.446695] RSP: 0000:ffffa452c0303e98 EFLAGS: 00010286
[  270.446697] RAX: ffffffffc062f0b0 RBX: ffffa13f3c548070 RCX: ffffa13f67ea8260
[  270.446698] RDX: 0000000000000001 RSI: ffffa13f66c150b0 RDI: ffffa13f3c548070
[  270.446699] RBP: ffffa13f67ea8240 R08: 0000000000000010 R09: 0000746e65696369
[  270.446700] R10: 8080808080808080 R11: 0000000000000018 R12: ffffa13f67eac500
[  270.446702] R13: 0000000000000000 R14: ffffa13f667ac000 R15: 0ffffa13f67eac50
[  270.446703] FS:  0000000000000000(0000) GS:ffffa13f67fc0000(0000) knlGS:0000000000000000
[  270.446704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  270.446705] CR2: ffffffffc075e086 CR3: 00000002a1c52001 CR4: 0000000000760ee0
[  270.446707] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  270.446708] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  270.446709] PKRU: 55555554
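
The idea of cancelling pending work before its handler goes away can be
sketched in userspace as follows. All names here (model_work, model_cancel,
and so on) are hypothetical and only model the pending/cancel bookkeeping;
the kernel's own cancellation helpers also have to deal with a handler that
is already running, which this sketch does not attempt.

```c
#include <stdbool.h>

/* Hypothetical work item: a handler plus a "queued" marker. */
struct model_work {
	void (*fn)(struct model_work *w);
	bool pending;
};

static int model_calls;	/* counts handler invocations */

static void model_handler(struct model_work *w)
{
	(void)w;
	model_calls++;
}

static void model_schedule(struct model_work *w)
{
	w->pending = true;
}

/* Cancel queued work; returns true if something was actually pending. */
static bool model_cancel(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Worker step: runs the handler only if the work is still pending. */
static bool model_run(struct model_work *w)
{
	if (!w->pending)
		return false;
	w->pending = false;
	w->fn(w);
	return true;
}
```

If the teardown path calls model_cancel() first, a later model_run() never
jumps through the handler pointer, which is the situation the trace above
shows going wrong after the module was unloaded.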

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
derkling pushed a commit to derkling/linux that referenced this pull request Sep 14, 2019
Currently it's assumed that firmware exports only the class of sensors
supported by the driver. However with newer firmware or SCPI protocol
revision, support for newer classes of sensors can be present.

The driver fails to probe with the following warning if an unsupported
class of sensor is encountered in the firmware.

sysfs: cannot create duplicate filename
	'/devices/platform/scpi/scpi:sensors/hwmon/hwmon0/'
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:31
Modules linked in:

CPU: 0 PID: 6 Comm: kworker/u12:0 Not tainted 4.3.0-rc7 torvalds#137
Hardware name: ARM Juno development board (r0) (DT)
Workqueue: deferwq deferred_probe_work_func
PC is at sysfs_warn_dup+0x54/0x78
LR is at sysfs_warn_dup+0x54/0x78

This patch fixes the above issue by skipping sensors that belong to an
unsupported SCPI sensor class.
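
A minimal userspace sketch of the "skip what you do not understand" approach
is shown below. The enum values, struct layout and function name are
hypothetical and are not taken from the driver; the sketch only illustrates
counting and registering supported sensors while ignoring the rest.

```c
#include <stddef.h>

/* Hypothetical sensor classes; the last one stands for a newer class. */
enum model_sensor_class {
	MODEL_TEMPERATURE,
	MODEL_VOLTAGE,
	MODEL_CURRENT,
	MODEL_UNKNOWN_NEW_CLASS,
};

struct model_sensor {
	enum model_sensor_class cls;
	const char *name;
};

/*
 * Copy the supported sensors from 'in' to 'out' and return how many were
 * kept. Sensors of an unrecognized class are skipped instead of being
 * registered with a half-initialized entry.
 */
static size_t model_collect_supported(const struct model_sensor *in, size_t n,
				      struct model_sensor *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		switch (in[i].cls) {
		case MODEL_TEMPERATURE:
		case MODEL_VOLTAGE:
		case MODEL_CURRENT:
			out[kept++] = in[i];
			break;
		default:
			/* Unsupported class: skip it and keep going. */
			continue;
		}
	}
	return kept;
}
```

The returned count, rather than the firmware's total, is what would be used
to size and index the registered entries in this model.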

Fixes: 68acc77 ("hwmon: Support thermal zones registration for SCP temperature sensors")
Fixes: ea98b29 ("hwmon: Support sensors exported via ARM SCP interface")
Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
mjg59 pushed a commit to mjg59/linux that referenced this pull request Dec 29, 2019
In tipc_topsrv_kern_subscr(), when s->tipc_conn_new() fails
we call tipc_close_conn() to clean up, but in this case
calling conn_put() is enough.

This fixes the following crash:

 kasan: GPF could be caused by NULL-ptr deref or user memory access
 general protection fault: 0000 [#1] SMP KASAN
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ torvalds#137
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 task: 00000000c24413a5 task.stack: 000000005e8160b5
 RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
 RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
 RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
 FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
  __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
  _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
  spin_lock_bh include/linux/spinlock.h:320 [inline]
  tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
  tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
  tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
  tipc_close_conn+0x171/0x270 net/tipc/server.c:204
  tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
  tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
  tipc_sk_join net/tipc/socket.c:2747 [inline]
  tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
  SYSC_setsockopt net/socket.c:1851 [inline]
  SyS_setsockopt+0x189/0x360 net/socket.c:1830
  entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 14c0449 ("tipc: add ability to order and receive topology events in driver")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
intersectRaven pushed a commit to intersectRaven/linux that referenced this pull request Mar 7, 2021
[ Upstream commit e2f8b74 ]

A "Kernel panic - not syncing: hung_task: blocked tasks" occurs when a
test simulates a firmware crash and runs ifconfig down or rmmod at the
same time.

Test steps:

1. Test commands; any of them can reproduce the hang, for PCIe, SDIO and SNOC.
echo soft > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;sleep 0.05;ifconfig wlan0 down
echo soft > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;rmmod ath10k_sdio
echo hw-restart > /sys/kernel/debug/ieee80211/phy0/ath10k/simulate_fw_crash;rmmod ath10k_pci

2. dmesg:
[ 5622.548630] ath10k_sdio mmc1:0001:1: simulating soft firmware crash
[ 5622.655995] ieee80211 phy0: Hardware restart was requested
[ 5776.355164] INFO: task shill:1572 blocked for more than 122 seconds.
[ 5776.355687] INFO: task kworker/1:2:24437 blocked for more than 122 seconds.
[ 5776.359812] Kernel panic - not syncing: hung_task: blocked tasks
[ 5776.359836] CPU: 1 PID: 55 Comm: khungtaskd Tainted: G        W         4.19.86 torvalds#137
[ 5776.359846] Hardware name: MediaTek krane sku176 board (DT)
[ 5776.359855] Call trace:
[ 5776.359868]  dump_backtrace+0x0/0x170
[ 5776.359881]  show_stack+0x20/0x2c
[ 5776.359896]  dump_stack+0xd4/0x10c
[ 5776.359916]  panic+0x12c/0x29c
[ 5776.359937]  hung_task_panic+0x0/0x50
[ 5776.359953]  kthread+0x120/0x130
[ 5776.359965]  ret_from_fork+0x10/0x18
[ 5776.359986] SMP: stopping secondary CPUs
[ 5776.360012] Kernel Offset: 0x141ea00000 from 0xffffff8008000000
[ 5776.360026] CPU features: 0x0,2188200c
[ 5776.360035] Memory Limit: none

The command "ifconfig wlan0 down" or "rmmod ath10k_sdio" is then blocked.
Call stack of ifconfig:
[<0>] __switch_to+0x120/0x13c
[<0>] msleep+0x28/0x38
[<0>] ath10k_sdio_hif_stop+0x24c/0x294 [ath10k_sdio]
[<0>] ath10k_core_stop+0x50/0x78 [ath10k_core]
[<0>] ath10k_halt+0x120/0x178 [ath10k_core]
[<0>] ath10k_stop+0x4c/0x8c [ath10k_core]
[<0>] drv_stop+0xe0/0x1e4 [mac80211]
[<0>] ieee80211_stop_device+0x48/0x54 [mac80211]
[<0>] ieee80211_do_stop+0x678/0x6f8 [mac80211]
[<0>] ieee80211_stop+0x20/0x30 [mac80211]
[<0>] __dev_close_many+0xb8/0x11c
[<0>] __dev_change_flags+0xe0/0x1d0
[<0>] dev_change_flags+0x30/0x6c
[<0>] devinet_ioctl+0x370/0x564
[<0>] inet_ioctl+0xdc/0x304
[<0>] sock_do_ioctl+0x50/0x288
[<0>] compat_sock_ioctl+0x1b4/0x1aac
[<0>] __se_compat_sys_ioctl+0x100/0x26fc
[<0>] __arm64_compat_sys_ioctl+0x20/0x2c
[<0>] el0_svc_common+0xa4/0x154
[<0>] el0_svc_compat_handler+0x2c/0x38
[<0>] el0_svc_compat+0x8/0x18
[<0>] 0xffffffffffffffff

Call stack of rmmod:
[<0>] __switch_to+0x120/0x13c
[<0>] msleep+0x28/0x38
[<0>] ath10k_sdio_hif_stop+0x294/0x31c [ath10k_sdio]
[<0>] ath10k_core_stop+0x50/0x78 [ath10k_core]
[<0>] ath10k_halt+0x120/0x178 [ath10k_core]
[<0>] ath10k_stop+0x4c/0x8c [ath10k_core]
[<0>] drv_stop+0xe0/0x1e4 [mac80211]
[<0>] ieee80211_stop_device+0x48/0x54 [mac80211]
[<0>] ieee80211_do_stop+0x678/0x6f8 [mac80211]
[<0>] ieee80211_stop+0x20/0x30 [mac80211]
[<0>] __dev_close_many+0xb8/0x11c
[<0>] dev_close_many+0x70/0x100
[<0>] dev_close+0x4c/0x80
[<0>] cfg80211_shutdown_all_interfaces+0x50/0xcc [cfg80211]
[<0>] ieee80211_remove_interfaces+0x58/0x1a0 [mac80211]
[<0>] ieee80211_unregister_hw+0x40/0x100 [mac80211]
[<0>] ath10k_mac_unregister+0x1c/0x44 [ath10k_core]
[<0>] ath10k_core_unregister+0x38/0x7c [ath10k_core]
[<0>] ath10k_sdio_remove+0x8c/0xd0 [ath10k_sdio]
[<0>] sdio_bus_remove+0x48/0x108
[<0>] device_release_driver_internal+0x138/0x1ec
[<0>] driver_detach+0x6c/0xa8
[<0>] bus_remove_driver+0x78/0xa8
[<0>] driver_unregister+0x30/0x50
[<0>] sdio_unregister_driver+0x28/0x34
[<0>] cleanup_module+0x14/0x6bc [ath10k_sdio]
[<0>] __arm64_sys_delete_module+0x1e0/0x22c
[<0>] el0_svc_common+0xa4/0x154
[<0>] el0_svc_compat_handler+0x2c/0x38
[<0>] el0_svc_compat+0x8/0x18
[<0>] 0xffffffffffffffff

SNOC:
[  647.156863] Call trace:
[  647.162166] [<ffffff80080855a4>] __switch_to+0x120/0x13c
[  647.164512] [<ffffff800899d8b8>] __schedule+0x5ec/0x798
[  647.170062] [<ffffff800899dad8>] schedule+0x74/0x94
[  647.175050] [<ffffff80089a0848>] schedule_timeout+0x314/0x42c
[  647.179874] [<ffffff80089a0a14>] schedule_timeout_uninterruptible+0x34/0x40
[  647.185780] [<ffffff80082a494>] msleep+0x28/0x38
[  647.192546] [<ffffff800117ec4c>] ath10k_snoc_hif_stop+0x4c/0x1e0 [ath10k_snoc]
[  647.197439] [<ffffff80010dfbd8>] ath10k_core_stop+0x50/0x7c [ath10k_core]
[  647.204652] [<ffffff80010c8f48>] ath10k_halt+0x114/0x16c [ath10k_core]
[  647.211420] [<ffffff80010cad68>] ath10k_stop+0x4c/0x88 [ath10k_core]
[  647.217865] [<ffffff8000fdbf54>] drv_stop+0x110/0x244 [mac80211]
[  647.224367] [<ffffff80010147ac>] ieee80211_stop_device+0x48/0x54 [mac80211]
[  647.230359] [<ffffff8000ff3eec>] ieee80211_do_stop+0x6a4/0x73c [mac80211]
[  647.237033] [<ffffff8000ff4500>] ieee80211_stop+0x20/0x30 [mac80211]
[  647.243942] [<ffffff80087e39b8>] __dev_close_many+0xa0/0xfc
[  647.250435] [<ffffff80087e3888>] dev_close_many+0x70/0x100
[  647.255651] [<ffffff80087e3a60>] dev_close+0x4c/0x80
[  647.261244] [<ffffff8000f1ba54>] cfg80211_shutdown_all_interfaces+0x44/0xcc [cfg80211]
[  647.266383] [<ffffff8000ff3fdc>] ieee80211_remove_interfaces+0x58/0x1b4 [mac80211]
[  647.274128] [<ffffff8000fda540>] ieee80211_unregister_hw+0x50/0x120 [mac80211]
[  647.281659] [<ffffff80010ca314>] ath10k_mac_unregister+0x1c/0x44 [ath10k_core]
[  647.288839] [<ffffff80010dfc94>] ath10k_core_unregister+0x48/0x90 [ath10k_core]
[  647.296027] [<ffffff800117e598>] ath10k_snoc_remove+0x5c/0x150 [ath10k_snoc]
[  647.303229] [<ffffff80085625fc>] platform_drv_remove+0x28/0x50
[  647.310517] [<ffffff80085601a4>] device_release_driver_internal+0x114/0x1b8
[  647.316257] [<ffffff80085602e4>] driver_detach+0x6c/0xa8
[  647.323021] [<ffffff800855e5b8>] bus_remove_driver+0x78/0xa8
[  647.328571] [<ffffff800856107c>] driver_unregister+0x30/0x50
[  647.334213] [<ffffff8008562674>] platform_driver_unregister+0x1c/0x28
[  647.339876] [<ffffff800117fefc>] cleanup_module+0x1c/0x120 [ath10k_snoc]
[  647.346196] [<ffffff8008143ab8>] SyS_delete_module+0x1dc/0x22c

PCIe:
[  615.392770] rmmod           D    0  3523   3458 0x00000080
[  615.392777] Call Trace:
[  615.392784]  __schedule+0x617/0x7d3
[  615.392791]  ? __mod_timer+0x263/0x35c
[  615.392797]  schedule+0x62/0x72
[  615.392803]  schedule_timeout+0x8d/0xf3
[  615.392809]  ? run_local_timers+0x6b/0x6b
[  615.392814]  msleep+0x1b/0x22
[  615.392824]  ath10k_pci_hif_stop+0x68/0xd6 [ath10k_pci]
[  615.392844]  ath10k_core_stop+0x44/0x67 [ath10k_core]
[  615.392859]  ath10k_halt+0x102/0x153 [ath10k_core]
[  615.392873]  ath10k_stop+0x38/0x75 [ath10k_core]
[  615.392893]  drv_stop+0x9a/0x13c [mac80211]
[  615.392915]  ieee80211_do_stop+0x772/0x7cd [mac80211]
[  615.392937]  ieee80211_stop+0x1a/0x1e [mac80211]
[  615.392945]  __dev_close_many+0x9e/0xf0
[  615.392952]  dev_close_many+0x62/0xe8
[  615.392958]  dev_close+0x54/0x7d
[  615.392975]  cfg80211_shutdown_all_interfaces+0x6e/0xa5 [cfg80211]
[  615.393021]  ieee80211_remove_interfaces+0x52/0x1aa [mac80211]
[  615.393049]  ieee80211_unregister_hw+0x54/0x136 [mac80211]
[  615.393068]  ath10k_mac_unregister+0x19/0x4a [ath10k_core]
[  615.393091]  ath10k_core_unregister+0x39/0x7e [ath10k_core]
[  615.393104]  ath10k_pci_remove+0x3d/0x7f [ath10k_pci]
[  615.393117]  pci_device_remove+0x41/0xa6
[  615.393129]  device_release_driver_internal+0x123/0x1ec
[  615.393140]  driver_detach+0x60/0x90
[  615.393152]  bus_remove_driver+0x72/0x9f
[  615.393164]  pci_unregister_driver+0x1e/0x87
[  615.393177]  SyS_delete_module+0x1d7/0x277
[  615.393188]  do_syscall_64+0x6b/0xf7
[  615.393199]  entry_SYSCALL_64_after_hwframe+0x41/0xa6

The test command first runs simulate_fw_crash, which calls into
ath10k_sdio_hif_stop from ath10k_core_restart; napi_disable is then
called and the NAPI_STATE_SCHED bit is set. After that,
ath10k_sdio_hif_stop is called again from ath10k_stop by the command
"ifconfig wlan0 down" or "rmmod ath10k_sdio", and the command blocks.

It blocks in napi_synchronize: napi_disable leaves the NAPI_STATE_SCHED
bit set, so napi_synchronize loops forever waiting for a bit that is
never cleared.

The napi_synchronize function:
static inline void napi_synchronize(const struct napi_struct *n)
{
	if (IS_ENABLED(CONFIG_SMP))
		while (test_bit(NAPI_STATE_SCHED, &n->state))
			msleep(1);
	else
		barrier();
}

The napi_disable function:
void napi_disable(struct napi_struct *n)
{
	might_sleep();
	set_bit(NAPI_STATE_DISABLE, &n->state);

	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
		msleep(1);
	while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
		msleep(1);

	hrtimer_cancel(&n->timer);

	clear_bit(NAPI_STATE_DISABLE, &n->state);
}

Add a flag to avoid the hang and crash.
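
The interaction described above, and how a guard flag avoids it, can be
modelled in userspace roughly as follows. The commit message only says that
a flag is added; the names below (model_dev, napi_enabled, model_hif_stop)
are hypothetical, and the synchronize loop is bounded so the sketch
terminates where the real loop would keep sleeping.

```c
#include <stdbool.h>

#define MODEL_STATE_SCHED (1u << 0)

struct model_napi {
	unsigned int state;
};

struct model_dev {
	struct model_napi napi;
	bool napi_enabled;	/* hypothetical guard flag */
};

/* Like napi_disable() above: SCHED is still set when it returns. */
static void model_napi_disable(struct model_napi *n)
{
	n->state |= MODEL_STATE_SCHED;
}

/*
 * Like napi_synchronize() above, but bounded: returns false if SCHED is
 * still set after max_polls checks, i.e. where the real loop would hang.
 */
static bool model_napi_synchronize(const struct model_napi *n, int max_polls)
{
	while (max_polls-- > 0) {
		if (!(n->state & MODEL_STATE_SCHED))
			return true;
	}
	return false;
}

/*
 * Stop path guarded by the flag: synchronize and disable run at most once.
 * Returns true if the stop completed without "hanging".
 */
static bool model_hif_stop(struct model_dev *d)
{
	if (!d->napi_enabled)
		return true;	/* already stopped; nothing to wait for */
	if (!model_napi_synchronize(&d->napi, 100))
		return false;
	model_napi_disable(&d->napi);
	d->napi_enabled = false;
	return true;
}
```

In this model the second call to model_hif_stop() returns immediately,
whereas calling model_napi_synchronize() again without the guard would never
see the bit clear.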

Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00049
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00110-QCARMSWP-1
Tested-on: WCN3990 hw1.0 SNOC hw1.0 WLAN.HL.3.1-01307.1-QCAHLSWMTPL-2

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1598617348-2325-1-git-send-email-wgong@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 12, 2021
This commit fixes the following checkpatch.pl warnings:

    WARNING: do not add new typedefs
    torvalds#84: FILE: include/rtw_mlme.h:84:
    +typedef enum _RT_SCAN_TYPE {

    WARNING: do not add new typedefs
    torvalds#137: FILE: include/rtw_mlme.h:137:
    +typedef struct _RT_LINK_DETECT_T {
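
For readers unfamiliar with this warning, the usual way to address it is to
declare a plain tagged enum or struct and spell out the tag at each use. The
sketch below is illustrative only: the type names, members and helper are
hypothetical and do not reproduce the contents of rtw_mlme.h.

```c
/*
 * Flagged pattern (kept as a comment for comparison):
 *     typedef enum _EXAMPLE_TYPE { ... } EXAMPLE_TYPE;
 */

/* Plain tagged enum; the member names are hypothetical. */
enum model_scan_type {
	MODEL_SCAN_PASSIVE,
	MODEL_SCAN_ACTIVE,
};

/* Plain tagged struct; the fields are hypothetical. */
struct model_link_detect {
	unsigned int tx_packets;
	unsigned int rx_packets;
};

/* Users name the tag explicitly instead of relying on a typedef alias. */
static int model_is_busy(const struct model_link_detect *ld,
			 unsigned int threshold)
{
	return ld->tx_packets + ld->rx_packets > threshold;
}
```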

Signed-off-by: Marco Cesati <marco.cesati@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 13, 2021
This commit fixes the following checkpatch.pl warnings:

    WARNING: do not add new typedefs
    torvalds#84: FILE: include/rtw_mlme.h:84:
    +typedef enum _RT_SCAN_TYPE {

    WARNING: do not add new typedefs
    torvalds#137: FILE: include/rtw_mlme.h:137:
    +typedef struct _RT_LINK_DETECT_T {

Signed-off-by: Marco Cesati <marco.cesati@gmail.com>
Link: https://lore.kernel.org/r/20210312082638.25512-3-marco.cesati@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 15, 2021
This commit fixes the following checkpatch.pl errors:

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#132: FILE: ./hal/HalBtc8723b2Ant.h:132:
    +void EXhalbtc8723b2ant_PowerOnSetting(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#133: FILE: ./hal/HalBtc8723b2Ant.h:133:
    +void EXhalbtc8723b2ant_InitHwConfig(struct BTC_COEXIST * pBtCoexist, bool bWifiOnly);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#134: FILE: ./hal/HalBtc8723b2Ant.h:134:
    +void EXhalbtc8723b2ant_InitCoexDm(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#135: FILE: ./hal/HalBtc8723b2Ant.h:135:
    +void EXhalbtc8723b2ant_IpsNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#136: FILE: ./hal/HalBtc8723b2Ant.h:136:
    +void EXhalbtc8723b2ant_LpsNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#137: FILE: ./hal/HalBtc8723b2Ant.h:137:
    +void EXhalbtc8723b2ant_ScanNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#138: FILE: ./hal/HalBtc8723b2Ant.h:138:
    +void EXhalbtc8723b2ant_ConnectNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#139: FILE: ./hal/HalBtc8723b2Ant.h:139:
    +void EXhalbtc8723b2ant_MediaStatusNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#140: FILE: ./hal/HalBtc8723b2Ant.h:140:
    +void EXhalbtc8723b2ant_SpecialPacketNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#142: FILE: ./hal/HalBtc8723b2Ant.h:142:
    +	struct BTC_COEXIST * pBtCoexist, u8 *tmpBuf, u8 length

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#144: FILE: ./hal/HalBtc8723b2Ant.h:144:
    +void EXhalbtc8723b2ant_HaltNotify(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#145: FILE: ./hal/HalBtc8723b2Ant.h:145:
    +void EXhalbtc8723b2ant_PnpNotify(struct BTC_COEXIST * pBtCoexist, u8 pnpState);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#146: FILE: ./hal/HalBtc8723b2Ant.h:146:
    +void EXhalbtc8723b2ant_Periodical(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#147: FILE: ./hal/HalBtc8723b2Ant.h:147:
    +void EXhalbtc8723b2ant_DisplayCoexInfo(struct BTC_COEXIST * pBtCoexist);

Signed-off-by: Marco Cesati <marcocesati@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 16, 2021
This commit fixes the following checkpatch.pl errors:

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#132: FILE: ./hal/HalBtc8723b2Ant.h:132:
    +void EXhalbtc8723b2ant_PowerOnSetting(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#133: FILE: ./hal/HalBtc8723b2Ant.h:133:
    +void EXhalbtc8723b2ant_InitHwConfig(struct BTC_COEXIST * pBtCoexist, bool bWifiOnly);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#134: FILE: ./hal/HalBtc8723b2Ant.h:134:
    +void EXhalbtc8723b2ant_InitCoexDm(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#135: FILE: ./hal/HalBtc8723b2Ant.h:135:
    +void EXhalbtc8723b2ant_IpsNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#136: FILE: ./hal/HalBtc8723b2Ant.h:136:
    +void EXhalbtc8723b2ant_LpsNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#137: FILE: ./hal/HalBtc8723b2Ant.h:137:
    +void EXhalbtc8723b2ant_ScanNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#138: FILE: ./hal/HalBtc8723b2Ant.h:138:
    +void EXhalbtc8723b2ant_ConnectNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#139: FILE: ./hal/HalBtc8723b2Ant.h:139:
    +void EXhalbtc8723b2ant_MediaStatusNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#140: FILE: ./hal/HalBtc8723b2Ant.h:140:
    +void EXhalbtc8723b2ant_SpecialPacketNotify(struct BTC_COEXIST * pBtCoexist, u8 type);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#142: FILE: ./hal/HalBtc8723b2Ant.h:142:
    +	struct BTC_COEXIST * pBtCoexist, u8 *tmpBuf, u8 length

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#144: FILE: ./hal/HalBtc8723b2Ant.h:144:
    +void EXhalbtc8723b2ant_HaltNotify(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#145: FILE: ./hal/HalBtc8723b2Ant.h:145:
    +void EXhalbtc8723b2ant_PnpNotify(struct BTC_COEXIST * pBtCoexist, u8 pnpState);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#146: FILE: ./hal/HalBtc8723b2Ant.h:146:
    +void EXhalbtc8723b2ant_Periodical(struct BTC_COEXIST * pBtCoexist);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    torvalds#147: FILE: ./hal/HalBtc8723b2Ant.h:147:
    +void EXhalbtc8723b2ant_DisplayCoexInfo(struct BTC_COEXIST * pBtCoexist);

Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marco Cesati <marcocesati@gmail.com>
Link: https://lore.kernel.org/r/20210315170618.2566-6-marcocesati@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ojeda added a commit to ojeda/linux that referenced this pull request Mar 21, 2021
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jun 20, 2021
When 'mtip_block_initialize' fails at 'mtip_hw_get_identify', a series
of cleanup operations is performed. But when execution reaches
'put_disk', a refcount underflow occurs. The reason is that after
'dd->queue' is cleaned up, 'dd->disk->queue' is not set to NULL at the
same time, so the queue is cleaned up a second time.

Fix this by setting 'dd->disk->queue' to NULL after the cleanup.
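
The double cleanup can be modelled in userspace as below. The struct and
function names are hypothetical and greatly simplified; the sketch assumes,
as the trace suggests, that releasing the disk drops a queue reference
whenever the disk still points at a queue.

```c
#include <stddef.h>

struct model_queue {
	int refs;
};

struct model_disk {
	struct model_queue *queue;
};

struct model_dd {
	struct model_queue *queue;
	struct model_disk *disk;
};

static void model_put_queue(struct model_queue *q)
{
	q->refs--;
}

/* Releasing the disk also drops the queue reference, if it still has one. */
static void model_put_disk(struct model_disk *disk)
{
	if (disk->queue)
		model_put_queue(disk->queue);
}

/*
 * Error path: clean up the queue, then the disk. When clear_disk_queue is
 * zero, the disk still points at the queue and it is put a second time.
 */
static void model_error_path(struct model_dd *dd, int clear_disk_queue)
{
	model_put_queue(dd->queue);
	dd->queue = NULL;
	if (clear_disk_queue)
		dd->disk->queue = NULL;
	model_put_disk(dd->disk);
}
```

With clear_disk_queue set to 0 the model's count drops below zero, which
corresponds to the underflow in the log; with it set to 1 the count stops
at zero.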

The following log shows the problem:

[   59.590163] refcount_t: underflow; use-after-free.
[   59.591650] Modules linked in:
[   59.591867] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty torvalds#137
[   59.592407] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   59.593178] RIP: 0010:refcount_warn_saturate+0x140/0x150
[   59.593551] Code: 05 d1 3b dd 04 01 e8 af d4 5f ff 0f 0b e9 13 ff ff ff e8 b3 75 73 ff 48 c7 c7 30 31 df 85 c6 05 b4 3b dd 04 01 e8 90 d4 5f ff <0f> 0b e9 f4 fe ff ff 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55
[   59.594942] RSP: 0000:ffffc90000017918 EFLAGS: 00010286
[   59.595357] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
[   59.595858] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff
[   59.596346] RBP: ffffc90000017928 R08: 0000000000000001 R09: 0000000000000001
[   59.596926] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888105494270
[   59.597429] R13: ffff888105494270 R14: ffffffff82498b30 R15: 0000000000000000
[   59.597931] FS:  0000000000000000(0000) GS:ffff88817bd40000(0000) knlGS:0000000000000000
[   59.598500] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.598899] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0
[   59.599401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.599900] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.600400] Call Trace:
[   59.600579]  kobject_put+0x1b0/0x2e0
[   59.600839]  ? dev_attr_show+0x70/0x70
[   59.601112]  blk_put_queue+0x19/0x20
[   59.601372]  disk_release+0xb7/0xf0
[   59.601628]  ? show_partition_start+0x80/0x80
[   59.601943]  device_release+0x40/0xd0
[   59.602207]  kobject_put+0x10b/0x2e0
[   59.602468]  put_device+0x1f/0x30
[   59.602708]  put_disk+0x2a/0x40
[   59.602938]  mtip_block_initialize+0x35f/0x1570
[   59.603264]  ? __pci_enable_msi_range+0x32c/0x470
[   59.603606]  mtip_pci_probe+0x92a/0xc80
[   59.603899]  local_pci_probe+0x4a/0xb0
[   59.604173]  pci_device_probe+0x126/0x1d0
[   59.604478]  ? pci_device_remove+0x100/0x100
[   59.604790]  really_probe+0x27e/0x650
[   59.605059]  driver_probe_device+0x84/0x1d0
[   59.605359]  ? mutex_lock_nested+0x16/0x20
[   59.605660]  device_driver_attach+0x63/0x70
[   59.605963]  __driver_attach+0x117/0x1a0
[   59.606247]  ? device_driver_attach+0x70/0x70
[   59.606607]  bus_for_each_dev+0xb6/0x110
[   59.606919]  ? rdinit_setup+0x40/0x40
[   59.607177]  driver_attach+0x22/0x30
[   59.607431]  bus_add_driver+0x1e6/0x2a0
[   59.607703]  driver_register+0xa4/0x180
[   59.607974]  __pci_register_driver+0x77/0x80
[   59.608273]  ? drbd_debugfs_init+0x78/0x78
[   59.608560]  mtip_init+0x15c/0x18f
[   59.608820]  do_one_initcall+0x7a/0x3d0
[   59.609140]  ? rdinit_setup+0x40/0x40
[   59.609464]  ? rcu_read_lock_sched_held+0x4a/0x70
[   59.609879]  kernel_init_freeable+0x2a7/0x2f9
[   59.610268]  ? rest_init+0x2c0/0x2c0
[   59.610561]  kernel_init+0x13/0x180
[   59.610807]  ? rest_init+0x2c0/0x2c0
[   59.611058]  ? rest_init+0x2c0/0x2c0
[   59.611312]  ret_from_fork+0x1f/0x30
[   59.611574] Kernel panic - not syncing: panic_on_warn set ...
[   59.611973] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty torvalds#137
[   59.612514] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   59.613303] Call Trace:
[   59.613476]  dump_stack+0xba/0xf5
[   59.613718]  ? refcount_warn_saturate+0x140/0x150
[   59.614055]  panic+0x155/0x3ed
[   59.614281]  ? __warn+0xed/0x150
[   59.614468]  ? refcount_warn_saturate+0x140/0x150
[   59.614468]  __warn+0x103/0x150
[   59.614468]  ? refcount_warn_saturate+0x140/0x150
[   59.614468]  report_bug+0x119/0x1c0
[   59.614468]  handle_bug+0x3b/0x80
[   59.614468]  exc_invalid_op+0x18/0x70
[   59.614468]  asm_exc_invalid_op+0x12/0x20
[   59.614468] RIP: 0010:refcount_warn_saturate+0x140/0x150
[   59.614468] Code: 05 d1 3b dd 04 01 e8 af d4 5f ff 0f 0b e9 13 ff ff ff e8 b3 75 73 ff 48 c7 c7 30 31 df 85 c6 05 b4 3b dd 04 01 e8 90 d4 5f ff <0f> 0b e9 f4 fe ff ff 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55
[   59.614468] RSP: 0000:ffffc90000017918 EFLAGS: 00010286
[   59.614468] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
[   59.614468] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff
[   59.614468] RBP: ffffc90000017928 R08: 0000000000000001 R09: 0000000000000001
[   59.614468] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888105494270
[   59.614468] R13: ffff888105494270 R14: ffffffff82498b30 R15: 0000000000000000
[   59.614468]  ? dev_attr_show+0x70/0x70
[   59.614468]  ? vprintk_func+0x71/0x110
[   59.614468]  ? refcount_warn_saturate+0x140/0x150
[   59.614468]  kobject_put+0x1b0/0x2e0
[   59.614468]  ? dev_attr_show+0x70/0x70
[   59.614468]  blk_put_queue+0x19/0x20
[   59.614468]  disk_release+0xb7/0xf0
[   59.614468]  ? show_partition_start+0x80/0x80
[   59.614468]  device_release+0x40/0xd0
[   59.614468]  kobject_put+0x10b/0x2e0
[   59.614468]  put_device+0x1f/0x30
[   59.614468]  put_disk+0x2a/0x40
[   59.614468]  mtip_block_initialize+0x35f/0x1570
[   59.614468]  ? __pci_enable_msi_range+0x32c/0x470
[   59.614468]  mtip_pci_probe+0x92a/0xc80
[   59.614468]  local_pci_probe+0x4a/0xb0
[   59.614468]  pci_device_probe+0x126/0x1d0
[   59.614468]  ? pci_device_remove+0x100/0x100
[   59.614468]  really_probe+0x27e/0x650
[   59.614468]  driver_probe_device+0x84/0x1d0
[   59.614468]  ? mutex_lock_nested+0x16/0x20
[   59.614468]  device_driver_attach+0x63/0x70
[   59.614468]  __driver_attach+0x117/0x1a0
[   59.614468]  ? device_driver_attach+0x70/0x70
[   59.614468]  bus_for_each_dev+0xb6/0x110
[   59.614468]  ? rdinit_setup+0x40/0x40
[   59.614468]  driver_attach+0x22/0x30
[   59.614468]  bus_add_driver+0x1e6/0x2a0
[   59.614468]  driver_register+0xa4/0x180
[   59.614468]  __pci_register_driver+0x77/0x80
[   59.614468]  ? drbd_debugfs_init+0x78/0x78
[   59.614468]  mtip_init+0x15c/0x18f
[   59.614468]  do_one_initcall+0x7a/0x3d0
[   59.614468]  ? rdinit_setup+0x40/0x40
[   59.614468]  ? rcu_read_lock_sched_held+0x4a/0x70
[   59.614468]  kernel_init_freeable+0x2a7/0x2f9
[   59.614468]  ? rest_init+0x2c0/0x2c0
[   59.614468]  kernel_init+0x13/0x180
[   59.614468]  ? rest_init+0x2c0/0x2c0
[   59.614468]  ? rest_init+0x2c0/0x2c0
[   59.614468]  ret_from_fork+0x1f/0x30
[   59.614468] Dumping ftrace buffer:
[   59.614468]    (ftrace buffer empty)
[   59.614468] Kernel Offset: disabled
[   59.614468] Rebooting in 1 seconds..

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 6, 2021
We queue an irq work for deferred processing of the MCE event
in the realmode MCE handler, where translation is disabled.
Queuing the work may access memory outside the RMO region. Such
an access needs translation to be enabled for an LPAR running
with the hash MMU; otherwise the kernel crashes.

So enable translation before queuing the work.

Without this change, the following trace is seen on injecting a
machine check error in an LPAR running with the hash MMU.

Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: G        OE     5.14.0-mce+ #137
NIP:  c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
REGS: c00000001ebff9a0 TRAP: 0300   Tainted: G       OE      (5.14.0-mce+)
MSR:  8000000000001003 <SF,ME,RI,LE>  CR: 28008228  XER: 00000001
CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
GPR00: c0000000003186d0 c00000001ebffc40 c000000001b0df00 c0000000016337e8
GPR04: c0000000016337e8 c00000027fa8fe08 0000000000000023 c0000000016337f0
GPR08: 0000000000000023 c0000000012ffe08 0000000000000000 c008000001460240
GPR12: 0000000000000000 c00000001ec9a900 c00000002ac4bd00 0000000000000000
GPR16: 00000000000005a0 c0080000006b0000 c0080000006b05a0 c000000000ff3068
GPR20: c00000002ac4bbc0 0000000000000001 c00000002ac4bbc0 c008000001490298
GPR24: c008000001490108 c000000001636198 c008000001470090 c008000001470058
GPR28: 0000000000000510 c008000001000000 c008000008000019 0000000000000019
NIP [c000000000735d60] llist_add_batch+0x0/0x40
LR [c000000000318640] __irq_work_queue_local+0x70/0xc0
Call Trace:
[c00000001ebffc40] [c00000001ebffc0c] 0xc00000001ebffc0c (unreliable)
[c00000001ebffc60] [c0000000003186d0] irq_work_queue+0x40/0x70
[c00000001ebffc80] [c00000000004425c] machine_check_queue_event+0xbc/0xd0
[c00000001ebffcf0] [c00000000000838c] machine_check_early_common+0x16c/0x1f4
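
As a rough illustration of why the handler is running untranslated, the MSR value printed in the trace above (8000000000001003, shown as <SF,ME,RI,LE>) can be decoded in user space. This is only a sketch and not the kernel patch: the bit positions follow the Power ISA MSR layout as I understand it from the kernel's powerpc register definitions and should be checked against the tree, and the helper names translation_enabled() and with_translation() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR bit positions (assumed from the Power ISA / powerpc reg.h layout). */
#define MSR_LE (1ULL << 0)   /* little-endian mode */
#define MSR_RI (1ULL << 1)   /* recoverable interrupt */
#define MSR_DR (1ULL << 4)   /* data relocation (data translation) */
#define MSR_IR (1ULL << 5)   /* instruction relocation (instruction translation) */
#define MSR_ME (1ULL << 12)  /* machine check enable */
#define MSR_SF (1ULL << 63)  /* 64-bit mode */

/* Hypothetical helper: true only if both instruction and data translation are on. */
bool translation_enabled(uint64_t msr)
{
    /* Callers are expected to pass a real MSR image, never zero. */
    assert(msr != 0);
    return (msr & (MSR_IR | MSR_DR)) == (MSR_IR | MSR_DR);
}

/* Hypothetical helper: the MSR image with translation switched on,
 * modelling "enable the translation before queuing the work". */
uint64_t with_translation(uint64_t msr)
{
    return msr | MSR_IR | MSR_DR;
}
```

With the value from the trace, translation_enabled(0x8000000000001003ULL) is false because neither IR nor DR is set, which matches the commit message's statement that the realmode handler runs with translation disabled; with_translation() on the same value sets both bits.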

Fixes: 74c3354 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 9, 2021
We queue an irq work for deferred processing of the MCE event
in the realmode MCE handler, where translation is disabled.
Queuing the work may access memory outside the RMO region. Such
an access needs translation to be enabled for an LPAR running
with the hash MMU; otherwise the kernel crashes.

After enabling translation in mce_handle_error() we used to
leave it enabled to avoid crashing here, but since commit
74c3354 ("powerpc/pseries/mce: restore msr before returning
from handler") we restore the MSR, which disables translation
again.

To fix this, enable translation before queuing the work.

Without this change, the following trace is seen on injecting an
SLB multihit in an LPAR running with the hash MMU.

Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: G        OE     5.14.0-mce+ #137
NIP:  c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
REGS: c00000001ebff9a0 TRAP: 0300   Tainted: G       OE      (5.14.0-mce+)
MSR:  8000000000001003 <SF,ME,RI,LE>  CR: 28008228  XER: 00000001
CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
GPR00: c0000000003186d0 c00000001ebffc40 c000000001b0df00 c0000000016337e8
GPR04: c0000000016337e8 c00000027fa8fe08 0000000000000023 c0000000016337f0
GPR08: 0000000000000023 c0000000012ffe08 0000000000000000 c008000001460240
GPR12: 0000000000000000 c00000001ec9a900 c00000002ac4bd00 0000000000000000
GPR16: 00000000000005a0 c0080000006b0000 c0080000006b05a0 c000000000ff3068
GPR20: c00000002ac4bbc0 0000000000000001 c00000002ac4bbc0 c008000001490298
GPR24: c008000001490108 c000000001636198 c008000001470090 c008000001470058
GPR28: 0000000000000510 c008000001000000 c008000008000019 0000000000000019
NIP [c000000000735d60] llist_add_batch+0x0/0x40
LR [c000000000318640] __irq_work_queue_local+0x70/0xc0
Call Trace:
[c00000001ebffc40] [c00000001ebffc0c] 0xc00000001ebffc0c (unreliable)
[c00000001ebffc60] [c0000000003186d0] irq_work_queue+0x40/0x70
[c00000001ebffc80] [c00000000004425c] machine_check_queue_event+0xbc/0xd0
[c00000001ebffcf0] [c00000000000838c] machine_check_early_common+0x16c/0x1f4

Fixes: 74c3354 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
ruscur pushed a commit to ruscur/linux that referenced this pull request Sep 16, 2021
We queue an irq work for deferred processing of the MCE event in the
realmode MCE handler, where translation is disabled. Queuing the work
may access memory outside the RMO region. Such an access needs
translation to be enabled for an LPAR running with the hash MMU;
otherwise the kernel crashes.

After enabling translation in mce_handle_error() we used to leave it
enabled to avoid crashing here, but since commit
74c3354 ("powerpc/pseries/mce: restore msr before returning from
handler") we restore the MSR, which disables translation again.

To fix this, enable translation before queuing the work.

Without this change, the following trace is seen on injecting an SLB
multihit in an LPAR running with the hash MMU.

  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  CPU: 5 PID: 1883 Comm: insmod Tainted: G        OE     5.14.0-mce+ #137
  NIP:  c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
  REGS: c00000001ebff9a0 TRAP: 0300   Tainted: G       OE      (5.14.0-mce+)
  MSR:  8000000000001003 <SF,ME,RI,LE>  CR: 28008228  XER: 00000001
  CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
  ...
  NIP llist_add_batch+0x0/0x40
  LR  __irq_work_queue_local+0x70/0xc0
  Call Trace:
    0xc00000001ebffc0c (unreliable)
    irq_work_queue+0x40/0x70
    machine_check_queue_event+0xbc/0xd0
    machine_check_early_common+0x16c/0x1f4

Fixes: 74c3354 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
[mpe: Fix comment formatting, trim oops in change log for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210909064330.312432-1-ganeshgr@linux.ibm.com
intersectRaven pushed a commit to intersectRaven/linux that referenced this pull request Sep 22, 2021
commit 3a1e92d upstream.

We queue an irq work for deferred processing of the MCE event in the
realmode MCE handler, where translation is disabled. Queuing the work
may access memory outside the RMO region. Such an access needs
translation to be enabled for an LPAR running with the hash MMU;
otherwise the kernel crashes.

After enabling translation in mce_handle_error() we used to leave it
enabled to avoid crashing here, but since commit
74c3354 ("powerpc/pseries/mce: restore msr before returning from
handler") we restore the MSR, which disables translation again.

To fix this, enable translation before queuing the work.

Without this change, the following trace is seen on injecting an SLB
multihit in an LPAR running with the hash MMU.

  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  CPU: 5 PID: 1883 Comm: insmod Tainted: G        OE     5.14.0-mce+ #137
  NIP:  c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
  REGS: c00000001ebff9a0 TRAP: 0300   Tainted: G       OE      (5.14.0-mce+)
  MSR:  8000000000001003 <SF,ME,RI,LE>  CR: 28008228  XER: 00000001
  CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
  ...
  NIP llist_add_batch+0x0/0x40
  LR  __irq_work_queue_local+0x70/0xc0
  Call Trace:
    0xc00000001ebffc0c (unreliable)
    irq_work_queue+0x40/0x70
    machine_check_queue_event+0xbc/0xd0
    machine_check_early_common+0x16c/0x1f4

Fixes: 74c3354 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
[mpe: Fix comment formatting, trim oops in change log for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210909064330.312432-1-ganeshgr@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Oct 9, 2021
On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.
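
The mapping described above can be sketched outside the kernel as follows. This is a simplified model and not the actual sdhci_set_power_noreg() body: the MMC_VDD_* and SDHCI_POWER_* constants are reproduced from memory of the kernel headers and should be verified against the tree, and vdd_to_power() is a hypothetical helper that returns -1 where the driver would emit the "Invalid vdd" warning.

```c
#include <assert.h>
#include <stdint.h>

/* OCR voltage-window bits (assumed to match include/linux/mmc/host.h). */
#define MMC_VDD_165_195 0x00000080u
#define MMC_VDD_20_21   0x00000100u
#define MMC_VDD_29_30   0x00020000u
#define MMC_VDD_30_31   0x00040000u
#define MMC_VDD_32_33   0x00100000u
#define MMC_VDD_33_34   0x00200000u
#define MMC_VDD_34_35   0x00400000u
#define MMC_VDD_35_36   0x00800000u  /* bit 23, i.e. vdd = 0x17 in the warning */

/* Power-control register values (assumed to match drivers/mmc/host/sdhci.h). */
#define SDHCI_POWER_180 0x0A
#define SDHCI_POWER_300 0x0C
#define SDHCI_POWER_330 0x0E

/* Hypothetical helper: map an OCR bit number to a power-control value,
 * or -1 for a bit the controller logic does not recognise. */
int vdd_to_power(unsigned int vdd)
{
    assert(vdd < 32);  /* vdd is a bit index into a 32-bit OCR mask */

    switch ((uint32_t)1 << vdd) {
    case MMC_VDD_165_195:
    case MMC_VDD_20_21:
        return SDHCI_POWER_180;
    case MMC_VDD_29_30:
    case MMC_VDD_30_31:
        return SDHCI_POWER_300;
    case MMC_VDD_32_33:
    case MMC_VDD_33_34:
    case MMC_VDD_34_35:  /* newly accepted by the fix */
    case MMC_VDD_35_36:  /* newly accepted by the fix */
        return SDHCI_POWER_330;
    default:
        return -1;       /* the driver warns "Invalid vdd" here */
    }
}
```

In this model vdd_to_power(23) yields SDHCI_POWER_330; dropping the two newly added case labels would make it fall through to the default branch, which corresponds to the warning in the log above.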

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Oct 11, 2021
On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
roxell pushed a commit to roxell/linux that referenced this pull request Oct 13, 2021
On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
asheplyakov pushed a commit to altlinux/linux-arm that referenced this pull request Nov 3, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Nov 3, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
krzk pushed a commit to krzk/linux that referenced this pull request Nov 15, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Nov 20, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Nov 20, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC does not work at all on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
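
The mapping described in the commit message above can be sketched as a small standalone C function. This is an illustration under stated assumptions, not the driver code itself: the helper name `vdd_to_sdhci_power` is hypothetical, the constant values are assumed to mirror the kernel's `MMC_VDD_*` and `SDHCI_POWER_*` definitions, and only a subset of the voltage windows is shown. The `vdd` argument is treated as an OCR bit number, so the reported `0x17` (23) selects the 3.5 ~ 3.6 V window.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed OCR bit masks (one bit per voltage window), modelled on the
 * kernel's MMC_VDD_* definitions. Only a subset is listed here. */
#define MMC_VDD_165_195 0x00000080u /* bit 7:  1.65 ~ 1.95 V */
#define MMC_VDD_29_30   0x00020000u /* bit 17: 2.9 ~ 3.0 V */
#define MMC_VDD_30_31   0x00040000u /* bit 18: 3.0 ~ 3.1 V */
#define MMC_VDD_32_33   0x00100000u /* bit 20: 3.2 ~ 3.3 V */
#define MMC_VDD_33_34   0x00200000u /* bit 21: 3.3 ~ 3.4 V */
#define MMC_VDD_34_35   0x00400000u /* bit 22: 3.4 ~ 3.5 V */
#define MMC_VDD_35_36   0x00800000u /* bit 23: 3.5 ~ 3.6 V */

/* Assumed power-control selector values, modelled on SDHCI_POWER_*. */
#define SDHCI_POWER_180 0x0A
#define SDHCI_POWER_300 0x0C
#define SDHCI_POWER_330 0x0E

/* Hypothetical helper: map an OCR bit number to a power selector.
 * Returns -1 for a window this sketch does not handle, which is the
 * case that produced "Invalid vdd 0x17" before the fix. */
static int vdd_to_sdhci_power(unsigned int vdd)
{
    if (vdd >= 32)
        return -1;

    switch (UINT32_C(1) << vdd) {
    case MMC_VDD_165_195:
        return SDHCI_POWER_180;
    case MMC_VDD_29_30:
    case MMC_VDD_30_31:
        return SDHCI_POWER_300;
    case MMC_VDD_32_33:
    case MMC_VDD_33_34:
    /* The fix: 3.4 ~ 3.6 V is also served by the 3.3 V setting. */
    case MMC_VDD_34_35:
    case MMC_VDD_35_36:
        return SDHCI_POWER_330;
    default:
        return -1;
    }
}

/* Sanity check for the value from the log: vdd 0x17 must select 3.3 V. */
static void check_reported_vdd(void)
{
    assert(vdd_to_sdhci_power(0x17) == SDHCI_POWER_330);
}
```

Without the two added `case` labels, bit 23 would fall through to the `default` branch, which corresponds to the warning path seen in the trace.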
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Nov 20, 2021
commit 4217d07 upstream.

On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which sdhci_set_power_noreg() treats as an invalid value. As a result,
eMMC is completely broken on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 3, 2022
Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
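
The address arithmetic in the KASAN report above can be checked with a few lines of C. This is a sketch for reading the report, not kernel code: the struct and function names are hypothetical, and it assumes generic KASAN, where one shadow byte describes 8 bytes of memory. Under that assumption each 16-byte row of the "Memory state" dump covers 128 bytes, which matches both the 0x80 step between row addresses and the 128-byte kmalloc object.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: generic KASAN, one shadow byte per 8 bytes of memory. */
#define SHADOW_GRANULE_BYTES 8u
/* The report prints 16 shadow bytes per row. */
#define SHADOW_BYTES_PER_ROW 16u

/* Hypothetical type: a half-open address range [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

/* Size of the region in bytes. */
static uint64_t region_size(struct region r)
{
    return r.end - r.start;
}

/* Offset of addr from the start of the region, in bytes. */
static uint64_t offset_in_region(struct region r, uint64_t addr)
{
    return addr - r.start;
}

/* Bytes of real memory described by one row of the shadow dump. */
static uint64_t bytes_per_dump_row(void)
{
    return (uint64_t)SHADOW_BYTES_PER_ROW * SHADOW_GRANULE_BYTES;
}

/* Values copied from the report: the faulting read is at the very
 * start of a 128-byte object, and one dump row spans that object. */
static void check_report_values(void)
{
    struct region obj = {
        UINT64_C(0xffff888023169400),
        UINT64_C(0xffff888023169480),
    };

    assert(region_size(obj) == 128);
    assert(offset_in_region(obj, UINT64_C(0xffff888023169400)) == 0);
    assert(bytes_per_dump_row() == region_size(obj));
}
```

Read this way, the row marked with `>` describes exactly the freed object, and the rows before and after it describe the neighbouring 128-byte ranges.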
akiernan pushed a commit to zuma-array/linux that referenced this pull request Nov 3, 2022
PD#150078: driver defect clean up:
#14
#89
#111
#124
#133
#136
#137
#146
#148
#150
#153

Change-Id: I734a66a8b92a0dc57a232879463a3fc074534fa0
Signed-off-by: Zongdong Jiao <zongdong.jiao@amlogic.com>
akiernan pushed a commit to zuma-array/linux that referenced this pull request Nov 4, 2022
PD#150078: driver defect clean up:
#14
#89
#111
#124
#133
#136
#137
#146
#148
#150
#153

Change-Id: I734a66a8b92a0dc57a232879463a3fc074534fa0
Signed-off-by: Zongdong Jiao <zongdong.jiao@amlogic.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 1, 2022
Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Dec 13, 2022
Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Dec 14, 2022
Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
dakkshesh07 pushed a commit to Neutron-Projects/kernel-dejavu that referenced this pull request Dec 29, 2022
Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dakkshesh <dakkshesh5@gmail.com>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jul 30, 2023
mj22226 pushed a commit to mj22226/linux that referenced this pull request May 23, 2024
[ Upstream commit 1eb52a6 ]

Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request May 25, 2024
[ Upstream commit 1eb52a6 ]

Fix a use-after-free (UAF) in xfs_trans_ail_delete during xlog force shutdown.
Commit cd6f79d ("xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks") changed the order of running callbacks
and waiting for iclog completion so that the unmount path would not
destroy the AIL too early. That turns out not to be enough to ensure
this; adding an mdelay in `xfs_buf_item_unpin` demonstrates the problem.

To destroy the AIL safely, we should wait for all xlog ioend workers to
finish and sync the AIL. The KASAN report from the reproduction follows.

==================================================================
BUG: KASAN: use-after-free in xfs_trans_ail_delete+0x240/0x2a0
Read of size 8 at addr ffff888023169400 by task kworker/1:1H/43

CPU: 1 PID: 43 Comm: kworker/1:1H Tainted: G        W         6.1.0-rc1-00002-gc28266863c4a #137
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: xfs-log/sda xlog_ioend_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report+0x171/0x4a6
 kasan_report+0xb3/0x130
 xfs_trans_ail_delete+0x240/0x2a0
 xfs_buf_item_done+0x7b/0xa0
 xfs_buf_ioend+0x1e9/0x11f0
 xfs_buf_item_unpin+0x4c8/0x860
 xfs_trans_committed_bulk+0x4c2/0x7c0
 xlog_cil_committed+0xab6/0xfb0
 xlog_cil_process_committed+0x117/0x1e0
 xlog_state_shutdown_callbacks+0x208/0x440
 xlog_force_shutdown+0x1b3/0x3a0
 xlog_ioend_work+0xef/0x1d0
 process_one_work+0x6f9/0xf70
 worker_thread+0x578/0xf30
 kthread+0x28c/0x330
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 9606:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7a/0x90
 __kmalloc+0x59/0x140
 kmem_alloc+0xb2/0x2f0
 xfs_trans_ail_init+0x20/0x320
 xfs_log_mount+0x37e/0x690
 xfs_mountfs+0xe36/0x1b40
 xfs_fs_fill_super+0xc5c/0x1a70
 get_tree_bdev+0x3c5/0x6c0
 vfs_get_tree+0x85/0x250
 path_mount+0xec3/0x1830
 do_mount+0xef/0x110
 __x64_sys_mount+0x150/0x1f0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 9662:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x40
 __kasan_slab_free+0x105/0x1a0
 __kmem_cache_free+0x99/0x2d0
 kvfree+0x3a/0x40
 xfs_log_unmount+0x60/0xf0
 xfs_unmountfs+0xf3/0x1d0
 xfs_fs_put_super+0x78/0x300
 generic_shutdown_super+0x151/0x400
 kill_block_super+0x9a/0xe0
 deactivate_locked_super+0x82/0xe0
 deactivate_super+0x91/0xb0
 cleanup_mnt+0x32a/0x4a0
 task_work_run+0x15f/0x240
 exit_to_user_mode_prepare+0x188/0x190
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x42/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888023169400
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
 128-byte region [ffff888023169400, ffff888023169480)

The buggy address belongs to the physical page:
page:ffffea00008c5a00 refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff888023168f80 pfn:0x23168
head:ffffea00008c5a00 order:1 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea00006b3988 ffffea0000577a88 ffff88800f842ac0
raw: ffff888023168f80 0000000000150007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888023169300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023169400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff888023169480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023169500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
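
The ordering requirement stated above (wait for every ioend worker to
finish, and only then tear down the AIL) can be illustrated with a small
userspace analogy. This is a sketch under stated assumptions, not the
kernel fix: `struct fake_ail`, `fake_ioend_worker` and
`run_then_teardown` are hypothetical names, and POSIX threads stand in
for the xlog ioend workqueue.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the AIL: shared state that workers still touch. */
struct fake_ail {
    pthread_mutex_t lock;
    long nr_deleted;
};

/* Hypothetical stand-in for an ioend worker that removes items from the AIL. */
static void *fake_ioend_worker(void *arg)
{
    struct fake_ail *ail = arg;

    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&ail->lock);
        ail->nr_deleted++;
        pthread_mutex_unlock(&ail->lock);
    }
    return NULL;
}

/*
 * Start up to nr_workers workers, wait for all that were started, and only
 * then free the shared structure. Returns the number of deletions performed
 * by the started workers, or -1 on invalid input or allocation failure.
 */
long run_then_teardown(int nr_workers)
{
    struct fake_ail *ail;
    pthread_t *tids;
    int started = 0;
    long result;

    if (nr_workers <= 0)
        return -1;

    ail = malloc(sizeof(*ail));
    tids = malloc(sizeof(*tids) * (size_t)nr_workers);
    if (!ail || !tids) {
        free(ail);
        free(tids);
        return -1;
    }
    pthread_mutex_init(&ail->lock, NULL);
    ail->nr_deleted = 0;

    for (int i = 0; i < nr_workers; i++) {
        if (pthread_create(&tids[i], NULL, fake_ioend_worker, ail) != 0)
            break;
        started++;
    }

    /* Wait for every worker before tearing anything down. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    /* Safe to read and free: no worker can still be running. */
    result = ail->nr_deleted;
    pthread_mutex_destroy(&ail->lock);
    free(ail);
    free(tids);
    return result;
}
```

Freeing `ail` before the join loop would recreate the pattern in the
report above: a worker dereferencing memory that the teardown path has
already released.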

Fixes: cd6f79d ("xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks")
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>