Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add remote control modules to default config file #101

Closed
Jamstah opened this issue Sep 2, 2012 · 1 comment
Closed

Add remote control modules to default config file #101

Jamstah opened this issue Sep 2, 2012 · 1 comment

Comments

@Jamstah
Copy link

Jamstah commented Sep 2, 2012

As so many people are using the pi for a media center, could we add the remove controll modules (eg, mceusb) to the default config file for the kernel build?

Would involve turning all these to m:

# CONFIG_RC_ATI_REMOTE is not set
# CONFIG_IR_IMON is not set
# CONFIG_IR_MCEUSB is not set
# CONFIG_IR_REDRAT3 is not set
# CONFIG_IR_STREAMZAP is not set
# CONFIG_RC_LOOPBACK is not set
@Jamstah
Copy link
Author

Jamstah commented Sep 4, 2012

Just spotted this was done

Thanks, will save me rolling my own in the future.

78f82bc

@Jamstah Jamstah closed this as completed Sep 4, 2012
popcornmix pushed a commit that referenced this issue Oct 13, 2012
We need to make sure that the USB serial driver we find
matches the USB driver whose probe we are currently
executing. Otherwise we will end up with USB serial
devices bound to the correct serial driver but wrong
USB driver.

An example of such cross-probing, where the usbserial_generic
USB driver has found the sierra serial driver:

May 29 18:26:15 nemi kernel: [ 4442.559246] usbserial_generic 4-4:1.0: Sierra USB modem converter detected
May 29 18:26:20 nemi kernel: [ 4447.556747] usbserial_generic 4-4:1.2: Sierra USB modem converter detected
May 29 18:26:25 nemi kernel: [ 4452.557288] usbserial_generic 4-4:1.3: Sierra USB modem converter detected

sysfs view of the same problem:

bjorn@nemi:~$ ls -l /sys/bus/usb/drivers/sierra/
total 0
--w------- 1 root root 4096 May 29 18:23 bind
lrwxrwxrwx 1 root root    0 May 29 18:23 module -> ../../../../module/usbserial
--w------- 1 root root 4096 May 29 18:23 uevent
--w------- 1 root root 4096 May 29 18:23 unbind
bjorn@nemi:~$ ls -l /sys/bus/usb-serial/drivers/sierra/
total 0
--w------- 1 root root 4096 May 29 18:23 bind
lrwxrwxrwx 1 root root    0 May 29 18:23 module -> ../../../../module/sierra
-rw-r--r-- 1 root root 4096 May 29 18:23 new_id
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB0 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.0/ttyUSB0
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB1 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.2/ttyUSB1
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB2 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.3/ttyUSB2
--w------- 1 root root 4096 May 29 18:23 uevent
--w------- 1 root root 4096 May 29 18:23 unbind

bjorn@nemi:~$ ls -l /sys/bus/usb/drivers/usbserial_generic/
total 0
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.0 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.0
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.2 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.2
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.3 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.3
--w------- 1 root root 4096 May 29 18:33 bind
lrwxrwxrwx 1 root root    0 May 29 18:33 module -> ../../../../module/usbserial
--w------- 1 root root 4096 May 29 18:22 uevent
--w------- 1 root root 4096 May 29 18:33 unbind
bjorn@nemi:~$ ls -l /sys/bus/usb-serial/drivers/generic/
total 0
--w------- 1 root root 4096 May 29 18:33 bind
lrwxrwxrwx 1 root root    0 May 29 18:33 module -> ../../../../module/usbserial
-rw-r--r-- 1 root root 4096 May 29 18:33 new_id
--w------- 1 root root 4096 May 29 18:22 uevent
--w------- 1 root root 4096 May 29 18:33 unbind

So we end up with a mismatch between the USB driver and the
USB serial driver.  The reason for the above is simple: The
USB driver probe will succeed if *any* registered serial
driver matches, and will use that serial driver for all
serial driver functions.

This makes ref counting go wrong. We count the USB driver
as used, but not the USB serial driver.  This may result
in Oops'es as demonstrated by Johan Hovold <jhovold@gmail.com>:

[11811.646396] drivers/usb/serial/usb-serial.c: get_free_serial 1
[11811.646443] drivers/usb/serial/usb-serial.c: get_free_serial - minor base = 0
[11811.646460] drivers/usb/serial/usb-serial.c: usb_serial_probe - registering ttyUSB0
[11811.646766] usb 6-1: pl2303 converter now attached to ttyUSB0
[11812.264197] USB Serial deregistering driver FTDI USB Serial Device
[11812.264865] usbcore: deregistering interface driver ftdi_sio
[11812.282180] USB Serial deregistering driver pl2303
[11812.283141] pl2303 ttyUSB0: pl2303 converter now disconnected from ttyUSB0
[11812.283272] usbcore: deregistering interface driver pl2303
[11812.301056] USB Serial deregistering driver generic
[11812.301186] usbcore: deregistering interface driver usbserial_generic
[11812.301259] drivers/usb/serial/usb-serial.c: usb_serial_disconnect
[11812.301823] BUG: unable to handle kernel paging request at f8e7438c
[11812.301845] IP: [<f8e38445>] usb_serial_disconnect+0xb5/0x100 [usbserial]
[11812.301871] *pde = 357ef067 *pte = 00000000
[11812.301957] Oops: 0000 [#1] PREEMPT SMP
[11812.301983] Modules linked in: usbserial(-) [last unloaded: pl2303]
[11812.302008]
[11812.302019] Pid: 1323, comm: modprobe Tainted: G        W    3.4.0-rc7+ #101 Dell Inc. Vostro 1520/0T816J
[11812.302115] EIP: 0060:[<f8e38445>] EFLAGS: 00010246 CPU: 1
[11812.302130] EIP is at usb_serial_disconnect+0xb5/0x100 [usbserial]
[11812.302141] EAX: f508a180 EBX: f508a180 ECX: 00000000 EDX: f8e74300
[11812.302151] ESI: f5050800 EDI: 00000001 EBP: f5141e78 ESP: f5141e58
[11812.302160]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[11812.302170] CR0: 8005003b CR2: f8e7438c CR3: 34848000 CR4: 000007d0
[11812.302180] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[11812.302189] DR6: ffff0ff0 DR7: 00000400
[11812.302199] Process modprobe (pid: 1323, ti=f5140000 task=f61e2bc0 task.ti=f5140000)
[11812.302209] Stack:
[11812.302216]  f8e3be0f f8e3b29c f8e3ae00 00000000 f513641c f5136400 f513641c f507a540
[11812.302325]  f5141e98 c133d2c1 00000000 00000000 f509c400 f513641c f507a590 f5136450
[11812.302372]  f5141ea8 c12f0344 f513641c f507a590 f5141ebc c12f0c67 00000000 f507a590
[11812.302419] Call Trace:
[11812.302439]  [<c133d2c1>] usb_unbind_interface+0x51/0x190
[11812.302456]  [<c12f0344>] __device_release_driver+0x64/0xb0
[11812.302469]  [<c12f0c67>] driver_detach+0x97/0xa0
[11812.302483]  [<c12f001c>] bus_remove_driver+0x6c/0xe0
[11812.302500]  [<c145938d>] ? __mutex_unlock_slowpath+0xcd/0x140
[11812.302514]  [<c12f0ff9>] driver_unregister+0x49/0x80
[11812.302528]  [<c1457df6>] ? printk+0x1d/0x1f
[11812.302540]  [<c133c50d>] usb_deregister+0x5d/0xb0
[11812.302557]  [<f8e37c55>] ? usb_serial_deregister+0x45/0x50 [usbserial]
[11812.302575]  [<f8e37c8d>] usb_serial_deregister_drivers+0x2d/0x40 [usbserial]
[11812.302593]  [<f8e3a6e2>] usb_serial_generic_deregister+0x12/0x20 [usbserial]
[11812.302611]  [<f8e3acf0>] usb_serial_exit+0x8/0x32 [usbserial]
[11812.302716]  [<c1080b48>] sys_delete_module+0x158/0x260
[11812.302730]  [<c110594e>] ? mntput+0x1e/0x30
[11812.302746]  [<c145c3c3>] ? sysenter_exit+0xf/0x18
[11812.302746]  [<c107777c>] ? trace_hardirqs_on_caller+0xec/0x170
[11812.302746]  [<c145c390>] sysenter_do_call+0x12/0x36
[11812.302746] Code: 24 02 00 00 e8 dd f3 20 c8 f6 86 74 02 00 00 02 74 b4 8d 86 4c 02 00 00 47 e8 78 55 4b c8 0f b6 43 0e 39 f8 7f a9 8b 53 04 89 d8 <ff> 92 8c 00 00 00 89 d8 e8 0e ff ff ff 8b 45 f0 c7 44 24 04 2f
[11812.302746] EIP: [<f8e38445>] usb_serial_disconnect+0xb5/0x100 [usbserial] SS:ESP 0068:f5141e58
[11812.302746] CR2: 00000000f8e7438c

Fix by only evaluating serial drivers pointing back to the
USB driver we are currently probing.  This still allows two
or more drivers to match the same device, running their
serial driver probes to sort out which one to use.

Cc: stable@vger.kernel.org # 3.0, 3.2, 3.3, 3.4
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ar0n pushed a commit to ar0n/linux that referenced this issue Nov 5, 2012
commit 954c3f8 upstream.

We need to make sure that the USB serial driver we find
matches the USB driver whose probe we are currently
executing. Otherwise we will end up with USB serial
devices bound to the correct serial driver but wrong
USB driver.

An example of such cross-probing, where the usbserial_generic
USB driver has found the sierra serial driver:

May 29 18:26:15 nemi kernel: [ 4442.559246] usbserial_generic 4-4:1.0: Sierra USB modem converter detected
May 29 18:26:20 nemi kernel: [ 4447.556747] usbserial_generic 4-4:1.2: Sierra USB modem converter detected
May 29 18:26:25 nemi kernel: [ 4452.557288] usbserial_generic 4-4:1.3: Sierra USB modem converter detected

sysfs view of the same problem:

bjorn@nemi:~$ ls -l /sys/bus/usb/drivers/sierra/
total 0
--w------- 1 root root 4096 May 29 18:23 bind
lrwxrwxrwx 1 root root    0 May 29 18:23 module -> ../../../../module/usbserial
--w------- 1 root root 4096 May 29 18:23 uevent
--w------- 1 root root 4096 May 29 18:23 unbind
bjorn@nemi:~$ ls -l /sys/bus/usb-serial/drivers/sierra/
total 0
--w------- 1 root root 4096 May 29 18:23 bind
lrwxrwxrwx 1 root root    0 May 29 18:23 module -> ../../../../module/sierra
-rw-r--r-- 1 root root 4096 May 29 18:23 new_id
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB0 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.0/ttyUSB0
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB1 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.2/ttyUSB1
lrwxrwxrwx 1 root root    0 May 29 18:32 ttyUSB2 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.3/ttyUSB2
--w------- 1 root root 4096 May 29 18:23 uevent
--w------- 1 root root 4096 May 29 18:23 unbind

bjorn@nemi:~$ ls -l /sys/bus/usb/drivers/usbserial_generic/
total 0
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.0 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.0
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.2 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.2
lrwxrwxrwx 1 root root    0 May 29 18:33 4-4:1.3 -> ../../../../devices/pci0000:00/0000:00:1d.7/usb4/4-4/4-4:1.3
--w------- 1 root root 4096 May 29 18:33 bind
lrwxrwxrwx 1 root root    0 May 29 18:33 module -> ../../../../module/usbserial
--w------- 1 root root 4096 May 29 18:22 uevent
--w------- 1 root root 4096 May 29 18:33 unbind
bjorn@nemi:~$ ls -l /sys/bus/usb-serial/drivers/generic/
total 0
--w------- 1 root root 4096 May 29 18:33 bind
lrwxrwxrwx 1 root root    0 May 29 18:33 module -> ../../../../module/usbserial
-rw-r--r-- 1 root root 4096 May 29 18:33 new_id
--w------- 1 root root 4096 May 29 18:22 uevent
--w------- 1 root root 4096 May 29 18:33 unbind

So we end up with a mismatch between the USB driver and the
USB serial driver.  The reason for the above is simple: The
USB driver probe will succeed if *any* registered serial
driver matches, and will use that serial driver for all
serial driver functions.

This makes ref counting go wrong. We count the USB driver
as used, but not the USB serial driver.  This may result
in Oops'es as demonstrated by Johan Hovold <jhovold@gmail.com>:

[11811.646396] drivers/usb/serial/usb-serial.c: get_free_serial 1
[11811.646443] drivers/usb/serial/usb-serial.c: get_free_serial - minor base = 0
[11811.646460] drivers/usb/serial/usb-serial.c: usb_serial_probe - registering ttyUSB0
[11811.646766] usb 6-1: pl2303 converter now attached to ttyUSB0
[11812.264197] USB Serial deregistering driver FTDI USB Serial Device
[11812.264865] usbcore: deregistering interface driver ftdi_sio
[11812.282180] USB Serial deregistering driver pl2303
[11812.283141] pl2303 ttyUSB0: pl2303 converter now disconnected from ttyUSB0
[11812.283272] usbcore: deregistering interface driver pl2303
[11812.301056] USB Serial deregistering driver generic
[11812.301186] usbcore: deregistering interface driver usbserial_generic
[11812.301259] drivers/usb/serial/usb-serial.c: usb_serial_disconnect
[11812.301823] BUG: unable to handle kernel paging request at f8e7438c
[11812.301845] IP: [<f8e38445>] usb_serial_disconnect+0xb5/0x100 [usbserial]
[11812.301871] *pde = 357ef067 *pte = 00000000
[11812.301957] Oops: 0000 [#1] PREEMPT SMP
[11812.301983] Modules linked in: usbserial(-) [last unloaded: pl2303]
[11812.302008]
[11812.302019] Pid: 1323, comm: modprobe Tainted: G        W    3.4.0-rc7+ raspberrypi#101 Dell Inc. Vostro 1520/0T816J
[11812.302115] EIP: 0060:[<f8e38445>] EFLAGS: 00010246 CPU: 1
[11812.302130] EIP is at usb_serial_disconnect+0xb5/0x100 [usbserial]
[11812.302141] EAX: f508a180 EBX: f508a180 ECX: 00000000 EDX: f8e74300
[11812.302151] ESI: f5050800 EDI: 00000001 EBP: f5141e78 ESP: f5141e58
[11812.302160]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[11812.302170] CR0: 8005003b CR2: f8e7438c CR3: 34848000 CR4: 000007d0
[11812.302180] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[11812.302189] DR6: ffff0ff0 DR7: 00000400
[11812.302199] Process modprobe (pid: 1323, ti=f5140000 task=f61e2bc0 task.ti=f5140000)
[11812.302209] Stack:
[11812.302216]  f8e3be0f f8e3b29c f8e3ae00 00000000 f513641c f5136400 f513641c f507a540
[11812.302325]  f5141e98 c133d2c1 00000000 00000000 f509c400 f513641c f507a590 f5136450
[11812.302372]  f5141ea8 c12f0344 f513641c f507a590 f5141ebc c12f0c67 00000000 f507a590
[11812.302419] Call Trace:
[11812.302439]  [<c133d2c1>] usb_unbind_interface+0x51/0x190
[11812.302456]  [<c12f0344>] __device_release_driver+0x64/0xb0
[11812.302469]  [<c12f0c67>] driver_detach+0x97/0xa0
[11812.302483]  [<c12f001c>] bus_remove_driver+0x6c/0xe0
[11812.302500]  [<c145938d>] ? __mutex_unlock_slowpath+0xcd/0x140
[11812.302514]  [<c12f0ff9>] driver_unregister+0x49/0x80
[11812.302528]  [<c1457df6>] ? printk+0x1d/0x1f
[11812.302540]  [<c133c50d>] usb_deregister+0x5d/0xb0
[11812.302557]  [<f8e37c55>] ? usb_serial_deregister+0x45/0x50 [usbserial]
[11812.302575]  [<f8e37c8d>] usb_serial_deregister_drivers+0x2d/0x40 [usbserial]
[11812.302593]  [<f8e3a6e2>] usb_serial_generic_deregister+0x12/0x20 [usbserial]
[11812.302611]  [<f8e3acf0>] usb_serial_exit+0x8/0x32 [usbserial]
[11812.302716]  [<c1080b48>] sys_delete_module+0x158/0x260
[11812.302730]  [<c110594e>] ? mntput+0x1e/0x30
[11812.302746]  [<c145c3c3>] ? sysenter_exit+0xf/0x18
[11812.302746]  [<c107777c>] ? trace_hardirqs_on_caller+0xec/0x170
[11812.302746]  [<c145c390>] sysenter_do_call+0x12/0x36
[11812.302746] Code: 24 02 00 00 e8 dd f3 20 c8 f6 86 74 02 00 00 02 74 b4 8d 86 4c 02 00 00 47 e8 78 55 4b c8 0f b6 43 0e 39 f8 7f a9 8b 53 04 89 d8 <ff> 92 8c 00 00 00 89 d8 e8 0e ff ff ff 8b 45 f0 c7 44 24 04 2f
[11812.302746] EIP: [<f8e38445>] usb_serial_disconnect+0xb5/0x100 [usbserial] SS:ESP 0068:f5141e58
[11812.302746] CR2: 00000000f8e7438c

Fix by only evaluating serial drivers pointing back to the
USB driver we are currently probing.  This still allows two
or more drivers to match the same device, running their
serial driver probes to sort out which one to use.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
popcornmix pushed a commit that referenced this issue Jun 8, 2014
nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from
net_gre->keymap_list and frees the object. But it doesn't clean
a reference on this object from ct_pptp_info->keymap[dir].
Then nf_ct_gre_keymap_destroy() may release the same object again.

So nf_ct_gre_keymap_flush() can be called only when we are sure that
when nf_ct_gre_keymap_destroy will not be called.

nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way
to destroy it is to call nf_ct_gre_keymap_destroy().

This patch marks nf_ct_gre_keymap_flush() as static, so this patch can
break compilation of third party modules, which use
nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate
this function.

[  226.540793] general protection fault: 0000 [#1] SMP
[  226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre
nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre
ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr
serio_raw virtio_console virtio_balloon floppy parport_pc parport
pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci
virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel]
[  226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ #101
[  226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  226.541776] Workqueue: netns cleanup_net
[  226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000
[  226.541776] RIP: 0010:[<ffffffff81389ba9>]  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776] RSP: 0018:ffff88003730dbd0  EFLAGS: 00010a83
[  226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200
[  226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40
[  226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000
[  226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002
[  226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018
[  226.541776] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  226.541776] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0
[  226.541776] Stack:
[  226.541776]  ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28
[  226.541776]  ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000
[  226.541776]  ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40
[  226.541776] Call Trace:
[  226.541776]  [<ffffffff81389c5d>] list_del+0xd/0x30
[  226.541776]  [<ffffffffa0162a43>] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa01627c5>] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0162ab7>] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0117de3>] destroy_conntrack+0x83/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d87>] ? destroy_conntrack+0x27/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d60>] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack]
[  226.541776]  [<ffffffff81630142>] nf_conntrack_destroy+0x72/0x180
[  226.541776]  [<ffffffff816300d5>] ? nf_conntrack_destroy+0x5/0x180
[  226.541776]  [<ffffffffa011ef80>] ? kill_l3proto+0x20/0x20 [nf_conntrack]
[  226.541776]  [<ffffffffa011847e>] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack]
[  226.541776]  [<ffffffffa011f74b>] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack]
[  226.541776]  [<ffffffffa0162409>] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffff815edf89>] ops_exit_list.isra.1+0x39/0x60
[  226.541776]  [<ffffffff815eecc0>] cleanup_net+0x100/0x1d0
[  226.541776]  [<ffffffff810a608a>] process_one_work+0x1ea/0x4f0
[  226.541776]  [<ffffffff810a6028>] ? process_one_work+0x188/0x4f0
[  226.541776]  [<ffffffff810a64ab>] worker_thread+0x11b/0x3a0
[  226.541776]  [<ffffffff810a6390>] ? process_one_work+0x4f0/0x4f0
[  226.541776]  [<ffffffff810af42d>] kthread+0xed/0x110
[  226.541776]  [<ffffffff8173d4dc>] ? _raw_spin_unlock_irq+0x2c/0x40
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776]  [<ffffffff8174747c>] ret_from_fork+0x7c/0xb0
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
42 08
[  226.541776] RIP  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776]  RSP <ffff88003730dbd0>
[  226.612193] ---[ end trace 985ae23ddfcc357c ]---

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
popcornmix pushed a commit that referenced this issue May 9, 2016
In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Cc: <stable@vger.kernel.org>
Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Gregor Boirie <gregor.boirie@parrot.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
popcornmix pushed a commit that referenced this issue May 13, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Gregor Boirie <gregor.boirie@parrot.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue May 13, 2016
commit 07d2390 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Gregor Boirie <gregor.boirie@parrot.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Feb 20, 2017
We cannot do printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
The reason why printk() is unsafe there is that console_sem may
invoke scheduler (up()->wake_up_process()->activate_task()), which,
in turn, can return back to timekeeping code, for instance, via
get_time()->ktime_get(), deadlocking the system on tk_core seq lock.

[   48.950592] ======================================================
[   48.950622] [ INFO: possible circular locking dependency detected ]
[   48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
[   48.950622] -------------------------------------------------------
[   48.950622] kworker/0:0/3 is trying to acquire lock:
[   48.950653]  (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
[   48.950683]
               but task is already holding lock:
[   48.950683]  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.950714]
               which lock already depends on the new lock.

[   48.950714]
               the existing dependency chain (in reverse order) is:
[   48.950714]
               -> #5 (hrtimer_bases.lock){-.-...}:
[   48.950744]        _raw_spin_lock_irqsave+0x50/0x64
[   48.950775]        lock_hrtimer_base+0x28/0x58
[   48.950775]        hrtimer_start_range_ns+0x20/0x5c8
[   48.950775]        __enqueue_rt_entity+0x320/0x360
[   48.950805]        enqueue_rt_entity+0x2c/0x44
[   48.950805]        enqueue_task_rt+0x24/0x94
[   48.950836]        ttwu_do_activate+0x54/0xc0
[   48.950836]        try_to_wake_up+0x248/0x5c8
[   48.950836]        __setup_irq+0x420/0x5f0
[   48.950836]        request_threaded_irq+0xdc/0x184
[   48.950866]        devm_request_threaded_irq+0x58/0xa4
[   48.950866]        omap_i2c_probe+0x530/0x6a0
[   48.950897]        platform_drv_probe+0x50/0xb0
[   48.950897]        driver_probe_device+0x1f8/0x2cc
[   48.950897]        __driver_attach+0xc0/0xc4
[   48.950927]        bus_for_each_dev+0x6c/0xa0
[   48.950927]        bus_add_driver+0x100/0x210
[   48.950927]        driver_register+0x78/0xf4
[   48.950958]        do_one_initcall+0x3c/0x16c
[   48.950958]        kernel_init_freeable+0x20c/0x2d8
[   48.950958]        kernel_init+0x8/0x110
[   48.950988]        ret_from_fork+0x14/0x24
[   48.950988]
               -> #4 (&rt_b->rt_runtime_lock){-.-...}:
[   48.951019]        _raw_spin_lock+0x40/0x50
[   48.951019]        rq_offline_rt+0x9c/0x2bc
[   48.951019]        set_rq_offline.part.2+0x2c/0x58
[   48.951049]        rq_attach_root+0x134/0x144
[   48.951049]        cpu_attach_domain+0x18c/0x6f4
[   48.951049]        build_sched_domains+0xba4/0xd80
[   48.951080]        sched_init_smp+0x68/0x10c
[   48.951080]        kernel_init_freeable+0x160/0x2d8
[   48.951080]        kernel_init+0x8/0x110
[   48.951080]        ret_from_fork+0x14/0x24
[   48.951110]
               -> #3 (&rq->lock){-.-.-.}:
[   48.951110]        _raw_spin_lock+0x40/0x50
[   48.951141]        task_fork_fair+0x30/0x124
[   48.951141]        sched_fork+0x194/0x2e0
[   48.951141]        copy_process.part.5+0x448/0x1a20
[   48.951171]        _do_fork+0x98/0x7e8
[   48.951171]        kernel_thread+0x2c/0x34
[   48.951171]        rest_init+0x1c/0x18c
[   48.951202]        start_kernel+0x35c/0x3d4
[   48.951202]        0x8000807c
[   48.951202]
               -> #2 (&p->pi_lock){-.-.-.}:
[   48.951232]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951232]        try_to_wake_up+0x30/0x5c8
[   48.951232]        up+0x4c/0x60
[   48.951263]        __up_console_sem+0x2c/0x58
[   48.951263]        console_unlock+0x3b4/0x650
[   48.951263]        vprintk_emit+0x270/0x474
[   48.951293]        vprintk_default+0x20/0x28
[   48.951293]        printk+0x20/0x30
[   48.951324]        kauditd_hold_skb+0x94/0xb8
[   48.951324]        kauditd_thread+0x1a4/0x56c
[   48.951324]        kthread+0x104/0x148
[   48.951354]        ret_from_fork+0x14/0x24
[   48.951354]
               -> #1 ((console_sem).lock){-.....}:
[   48.951385]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951385]        down_trylock+0xc/0x2c
[   48.951385]        __down_trylock_console_sem+0x24/0x80
[   48.951385]        console_trylock+0x10/0x8c
[   48.951416]        vprintk_emit+0x264/0x474
[   48.951416]        vprintk_default+0x20/0x28
[   48.951416]        printk+0x20/0x30
[   48.951446]        tk_debug_account_sleep_time+0x5c/0x70
[   48.951446]        __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
[   48.951446]        timekeeping_resume+0x218/0x23c
[   48.951477]        syscore_resume+0x94/0x42c
[   48.951477]        suspend_enter+0x554/0x9b4
[   48.951477]        suspend_devices_and_enter+0xd8/0x4b4
[   48.951507]        enter_state+0x934/0xbd4
[   48.951507]        pm_suspend+0x14/0x70
[   48.951507]        state_store+0x68/0xc8
[   48.951538]        kernfs_fop_write+0xf4/0x1f8
[   48.951538]        __vfs_write+0x1c/0x114
[   48.951538]        vfs_write+0xa0/0x168
[   48.951568]        SyS_write+0x3c/0x90
[   48.951568]        __sys_trace_return+0x0/0x10
[   48.951568]
               -> #0 (tk_core){----..}:
[   48.951599]        lock_acquire+0xe0/0x294
[   48.951599]        ktime_get_update_offsets_now+0x5c/0x1d4
[   48.951629]        retrigger_next_event+0x4c/0x90
[   48.951629]        on_each_cpu+0x40/0x7c
[   48.951629]        clock_was_set_work+0x14/0x20
[   48.951660]        process_one_work+0x2b4/0x808
[   48.951660]        worker_thread+0x3c/0x550
[   48.951660]        kthread+0x104/0x148
[   48.951690]        ret_from_fork+0x14/0x24
[   48.951690]
               other info that might help us debug this:

[   48.951690] Chain exists of:
                 tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock

[   48.951721]  Possible unsafe locking scenario:

[   48.951721]        CPU0                    CPU1
[   48.951721]        ----                    ----
[   48.951721]   lock(hrtimer_bases.lock);
[   48.951751]                                lock(&rt_b->rt_runtime_lock);
[   48.951751]                                lock(hrtimer_bases.lock);
[   48.951751]   lock(tk_core);
[   48.951782]
                *** DEADLOCK ***

[   48.951782] 3 locks held by kworker/0:0/3:
[   48.951782]  #0:  ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951812]  #1:  (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951843]  #2:  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.951843]   stack backtrace:
[   48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
[   48.951904] Workqueue: events clock_was_set_work
[   48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[   48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
[   48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
[   48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
[   48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
[   48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
[   48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
[   48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
[   48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
[   48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
[   48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
[   48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
[   48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
[   48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)

Replace printk() with printk_deferred(), which does not call into
the scheduler.

Fixes: 0bf43f1 ("timekeeping: Prints the amounts of time spent during suspend")
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "[4.9+]" <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
popcornmix pushed a commit that referenced this issue Feb 24, 2017
commit f222449 upstream.

We cannot do printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
The reason why printk() is unsafe there is that console_sem may
invoke scheduler (up()->wake_up_process()->activate_task()), which,
in turn, can return back to timekeeping code, for instance, via
get_time()->ktime_get(), deadlocking the system on tk_core seq lock.

[   48.950592] ======================================================
[   48.950622] [ INFO: possible circular locking dependency detected ]
[   48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
[   48.950622] -------------------------------------------------------
[   48.950622] kworker/0:0/3 is trying to acquire lock:
[   48.950653]  (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
[   48.950683]
               but task is already holding lock:
[   48.950683]  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.950714]
               which lock already depends on the new lock.

[   48.950714]
               the existing dependency chain (in reverse order) is:
[   48.950714]
               -> #5 (hrtimer_bases.lock){-.-...}:
[   48.950744]        _raw_spin_lock_irqsave+0x50/0x64
[   48.950775]        lock_hrtimer_base+0x28/0x58
[   48.950775]        hrtimer_start_range_ns+0x20/0x5c8
[   48.950775]        __enqueue_rt_entity+0x320/0x360
[   48.950805]        enqueue_rt_entity+0x2c/0x44
[   48.950805]        enqueue_task_rt+0x24/0x94
[   48.950836]        ttwu_do_activate+0x54/0xc0
[   48.950836]        try_to_wake_up+0x248/0x5c8
[   48.950836]        __setup_irq+0x420/0x5f0
[   48.950836]        request_threaded_irq+0xdc/0x184
[   48.950866]        devm_request_threaded_irq+0x58/0xa4
[   48.950866]        omap_i2c_probe+0x530/0x6a0
[   48.950897]        platform_drv_probe+0x50/0xb0
[   48.950897]        driver_probe_device+0x1f8/0x2cc
[   48.950897]        __driver_attach+0xc0/0xc4
[   48.950927]        bus_for_each_dev+0x6c/0xa0
[   48.950927]        bus_add_driver+0x100/0x210
[   48.950927]        driver_register+0x78/0xf4
[   48.950958]        do_one_initcall+0x3c/0x16c
[   48.950958]        kernel_init_freeable+0x20c/0x2d8
[   48.950958]        kernel_init+0x8/0x110
[   48.950988]        ret_from_fork+0x14/0x24
[   48.950988]
               -> #4 (&rt_b->rt_runtime_lock){-.-...}:
[   48.951019]        _raw_spin_lock+0x40/0x50
[   48.951019]        rq_offline_rt+0x9c/0x2bc
[   48.951019]        set_rq_offline.part.2+0x2c/0x58
[   48.951049]        rq_attach_root+0x134/0x144
[   48.951049]        cpu_attach_domain+0x18c/0x6f4
[   48.951049]        build_sched_domains+0xba4/0xd80
[   48.951080]        sched_init_smp+0x68/0x10c
[   48.951080]        kernel_init_freeable+0x160/0x2d8
[   48.951080]        kernel_init+0x8/0x110
[   48.951080]        ret_from_fork+0x14/0x24
[   48.951110]
               -> #3 (&rq->lock){-.-.-.}:
[   48.951110]        _raw_spin_lock+0x40/0x50
[   48.951141]        task_fork_fair+0x30/0x124
[   48.951141]        sched_fork+0x194/0x2e0
[   48.951141]        copy_process.part.5+0x448/0x1a20
[   48.951171]        _do_fork+0x98/0x7e8
[   48.951171]        kernel_thread+0x2c/0x34
[   48.951171]        rest_init+0x1c/0x18c
[   48.951202]        start_kernel+0x35c/0x3d4
[   48.951202]        0x8000807c
[   48.951202]
               -> #2 (&p->pi_lock){-.-.-.}:
[   48.951232]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951232]        try_to_wake_up+0x30/0x5c8
[   48.951232]        up+0x4c/0x60
[   48.951263]        __up_console_sem+0x2c/0x58
[   48.951263]        console_unlock+0x3b4/0x650
[   48.951263]        vprintk_emit+0x270/0x474
[   48.951293]        vprintk_default+0x20/0x28
[   48.951293]        printk+0x20/0x30
[   48.951324]        kauditd_hold_skb+0x94/0xb8
[   48.951324]        kauditd_thread+0x1a4/0x56c
[   48.951324]        kthread+0x104/0x148
[   48.951354]        ret_from_fork+0x14/0x24
[   48.951354]
               -> #1 ((console_sem).lock){-.....}:
[   48.951385]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951385]        down_trylock+0xc/0x2c
[   48.951385]        __down_trylock_console_sem+0x24/0x80
[   48.951385]        console_trylock+0x10/0x8c
[   48.951416]        vprintk_emit+0x264/0x474
[   48.951416]        vprintk_default+0x20/0x28
[   48.951416]        printk+0x20/0x30
[   48.951446]        tk_debug_account_sleep_time+0x5c/0x70
[   48.951446]        __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
[   48.951446]        timekeeping_resume+0x218/0x23c
[   48.951477]        syscore_resume+0x94/0x42c
[   48.951477]        suspend_enter+0x554/0x9b4
[   48.951477]        suspend_devices_and_enter+0xd8/0x4b4
[   48.951507]        enter_state+0x934/0xbd4
[   48.951507]        pm_suspend+0x14/0x70
[   48.951507]        state_store+0x68/0xc8
[   48.951538]        kernfs_fop_write+0xf4/0x1f8
[   48.951538]        __vfs_write+0x1c/0x114
[   48.951538]        vfs_write+0xa0/0x168
[   48.951568]        SyS_write+0x3c/0x90
[   48.951568]        __sys_trace_return+0x0/0x10
[   48.951568]
               -> #0 (tk_core){----..}:
[   48.951599]        lock_acquire+0xe0/0x294
[   48.951599]        ktime_get_update_offsets_now+0x5c/0x1d4
[   48.951629]        retrigger_next_event+0x4c/0x90
[   48.951629]        on_each_cpu+0x40/0x7c
[   48.951629]        clock_was_set_work+0x14/0x20
[   48.951660]        process_one_work+0x2b4/0x808
[   48.951660]        worker_thread+0x3c/0x550
[   48.951660]        kthread+0x104/0x148
[   48.951690]        ret_from_fork+0x14/0x24
[   48.951690]
               other info that might help us debug this:

[   48.951690] Chain exists of:
                 tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock

[   48.951721]  Possible unsafe locking scenario:

[   48.951721]        CPU0                    CPU1
[   48.951721]        ----                    ----
[   48.951721]   lock(hrtimer_bases.lock);
[   48.951751]                                lock(&rt_b->rt_runtime_lock);
[   48.951751]                                lock(hrtimer_bases.lock);
[   48.951751]   lock(tk_core);
[   48.951782]
                *** DEADLOCK ***

[   48.951782] 3 locks held by kworker/0:0/3:
[   48.951782]  #0:  ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951812]  #1:  (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951843]  #2:  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.951843]   stack backtrace:
[   48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
[   48.951904] Workqueue: events clock_was_set_work
[   48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[   48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
[   48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
[   48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
[   48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
[   48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
[   48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
[   48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
[   48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
[   48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
[   48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
[   48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
[   48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
[   48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)

Replace printk() with printk_deferred(), which does not call into
the scheduler.

Fixes: 0bf43f1 ("timekeeping: Prints the amounts of time spent during suspend")
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
anholt referenced this issue in anholt/linux Apr 8, 2017
Unbind the vma may happen at any time, outside of the normal GT wakeref.
As such it relies on having a wakeref of its own. However, we can forgo
clearing the register whilst the device is asleep and just mark it as
unused - so that when we do wake up the device, we will clear the unused
fence register (see i915_gem_restore_fences).

[22423.944631] WARNING: CPU: 3 PID: 26178 at drivers/gpu/drm/i915/intel_drv.h:1739 i915_vma_put_fence+0xf3/0x100 [i915]
[22423.946053] RPM wakelock ref not held during HW access
[22423.946056] Modules linked in: vgem(E) i915(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) x86_pkg_temp_thermal(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) intel_gtt(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) evdev(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) sysimgblt(E) fb_sys_fops(E) prime_numbers(E) drm(E) efivars(E) mei_me(E) lpc_ich(E) mei(E) mfd_core(E) battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) autofs4(E) i2c_i801(E) thermal(E) fan(E) i2c_designware_platform(E) i2c_designware_core(E)
[22423.946438] CPU: 2 PID: 26178 Comm: gem_concurrent_ Tainted: G            E   4.10.0+ #101
[22423.946513] Hardware name: ��������������������������������� ���������������������������������/���������������������������������, BIOS RYBDWi35.86A.0246.2
[22423.946600] Call Trace:
[22423.946641]  dump_stack+0x68/0x9f
[22423.946703]  __warn+0x107/0x130
[22423.946763]  warn_slowpath_fmt+0xa8/0xe0
[22423.946825]  ? __warn+0x130/0x130
[22423.946868]  ? free_hot_cold_page_list+0x53/0x70
[22423.946942]  ? mark_lock+0xcc/0x7f0
[22423.946997]  ? __lock_is_held+0x84/0x100
[22423.947115]  ? i915_vma_put_fence+0x64/0x100 [i915]
[22423.947224]  i915_vma_put_fence+0xf3/0x100 [i915]
[22423.947335]  i915_vma_unbind+0x4da/0x560 [i915]
[22423.947387]  ? rb_erase+0x812/0x8a0
[22423.947439]  ? kfree+0xa2/0xd0
[22423.947562]  i915_vma_close+0x159/0x180 [i915]
[22423.947674]  intel_ring_free+0x31/0x50 [i915]
[22423.947776]  i915_gem_context_free+0x1ff/0x3d0 [i915]
[22423.947887]  context_close+0x106/0x110 [i915]
[22423.947989]  context_idr_cleanup+0xc/0x10 [i915]
[22423.948041]  idr_for_each+0x14d/0x1d0
[22423.948158]  ? context_close+0x110/0x110 [i915]
[22423.948206]  ? get_from_free_list+0x70/0x70
[22423.948261]  ? __lock_is_held+0x84/0x100
[22423.948325]  ? __mutex_unlock_slowpath+0xd4/0x400
[22423.948448]  i915_gem_context_close+0x4b/0x90 [i915]
[22423.948544]  i915_driver_preclose+0x28/0x50 [i915]
[22423.948620]  drm_release+0x175/0x690 [drm]
[22423.948681]  ? fcntl_setlk+0x5e0/0x5e0
[22423.948746]  __fput+0x17d/0x300
[22423.948807]  ____fput+0x9/0x10
[22423.948859]  task_work_run+0xa7/0xe0
[22423.948924]  do_exit+0x4d2/0x13e0
[22423.948986]  ? mm_update_next_owner+0x320/0x320
[22423.949051]  ? __do_page_fault+0x209/0x5c0
[22423.949110]  ? mark_held_locks+0x23/0xc0
[22423.949166]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[22423.949232]  do_group_exit+0x93/0x160
[22423.949289]  SyS_exit_group+0x18/0x20
[22423.949350]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[22423.949403] RIP: 0033:0x7f9cc2e154c8
[22423.949484] RSP: 002b:00007ffd7e81b448 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[22423.949557] RAX: ffffffffffffffda RBX: ffffffff810ef1f0 RCX: 00007f9cc2e154c8
[22423.949617] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[22423.949677] RBP: ffff880367e9ff98 R08: 00000000000000e7 R09: ffffffffffffff88
[22423.949741] R10: 00007f9cc1d5c000 R11: 0000000000000246 R12: 00007f9cc30f6c30
[22423.949798] R13: 0000000000000000 R14: 00007f9cc30f6c20 R15: 0000000000000003
[22423.949868]  ? trace_hardirqs_off_caller+0xc0/0x110

v2: Move the rpm check down a layer so that we still perform the
vma/fence update required for the deferred mmio write on resume.
v3: Don't touch i915_gem_object_set_cache_level() and leave the rpm to
the low level routines (such as i915_vma_put_fence).
v4: vma may be null in fence_write, so extract drm_i915_private from
fence->i915

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170306092916.11623-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
TiejunChina pushed a commit to TiejunChina/linux that referenced this issue Feb 2, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> raspberrypi#3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> raspberrypi#2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ raspberrypi#101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit to TiejunChina/linux that referenced this issue Feb 14, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> raspberrypi#3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> raspberrypi#2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ raspberrypi#101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit to TiejunChina/linux that referenced this issue Feb 23, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> raspberrypi#3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> raspberrypi#2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ raspberrypi#101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit to TiejunChina/linux that referenced this issue Feb 24, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> raspberrypi#3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> raspberrypi#2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ raspberrypi#101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Mar 6, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Mar 31, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Apr 14, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue May 12, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue May 28, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Jul 6, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Aug 2, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
popcornmix pushed a commit that referenced this issue Aug 7, 2018
[ Upstream commit eae5f70 ]

__printf is useful to verify format and arguments. Fix arg mismatch
reported by gcc, remove the following warnings (with W=1):

  arch/powerpc/kernel/prom_init.c:1467:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1471:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1504:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1505:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1506:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1507:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1508:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1509:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1975:39: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’
  arch/powerpc/kernel/prom_init.c:1986:27: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2567:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2567:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2569:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2569:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’

The patch also include arg mismatch fix for case with #define DEBUG_PROM
(warning not listed here).

This patch fix also the following warnings revealed by checkpatch:

  WARNING: Prefer using '"%s...", __func__' to using 'alloc_up', this function's name, in a string
  #101: FILE: arch/powerpc/kernel/prom_init.c:1235:
  + prom_debug("alloc_up(%lx, %lx)\n", size, align);

and

  WARNING: Prefer using '"%s...", __func__' to using 'alloc_down', this function's name, in a string
  #138: FILE: arch/powerpc/kernel/prom_init.c:1278:
  + prom_debug("alloc_down(%lx, %lx, %s)\n", size, align,

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Aug 7, 2018
[ Upstream commit eae5f70 ]

__printf is useful to verify format and arguments. Fix arg mismatch
reported by gcc, remove the following warnings (with W=1):

  arch/powerpc/kernel/prom_init.c:1467:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1471:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1504:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1505:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1506:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1507:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1508:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1509:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:1975:39: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’
  arch/powerpc/kernel/prom_init.c:1986:27: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2567:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2567:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2569:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
  arch/powerpc/kernel/prom_init.c:2569:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’

The patch also include arg mismatch fix for case with #define DEBUG_PROM
(warning not listed here).

This patch fix also the following warnings revealed by checkpatch:

  WARNING: Prefer using '"%s...", __func__' to using 'alloc_up', this function's name, in a string
  #101: FILE: arch/powerpc/kernel/prom_init.c:1235:
  + prom_debug("alloc_up(%lx, %lx)\n", size, align);

and

  WARNING: Prefer using '"%s...", __func__' to using 'alloc_down', this function's name, in a string
  #138: FILE: arch/powerpc/kernel/prom_init.c:1278:
  + prom_debug("alloc_down(%lx, %lx, %s)\n", size, align,

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TiejunChina pushed a commit that referenced this issue Aug 27, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Oct 10, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TiejunChina pushed a commit that referenced this issue Nov 19, 2018
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
popcornmix pushed a commit that referenced this issue Dec 24, 2018
redundant copy_from_user in rds_sendmsg system call expose rds
to issue where rds_rdma_extra_size walk the rds iovec and and
calculate the number pf pages (sgs) it need to add to the tail of
rds message and later rds_cmsg_rdma_args copy the rds iovec again
and re calculate the same number and get different result causing
WARN_ON in rds_message_alloc_sgs.

fix this by doing the copy_from_user only once per rds_sendmsg
system call.

When issue occur the below dump is seen:

WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 panic+0x2ad/0x55c kernel/panic.c:188
 __warn.cold.8+0x20/0x45 kernel/panic.c:540
 report_bug+0x254/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
 rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
 rds_cmsg_send net/rds/send.c:971 [inline]
 rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
 __sys_sendmsg+0x11d/0x280 net/socket.c:2155
 __do_sys_sendmsg net/socket.c:2164 [inline]
 __se_sys_sendmsg net/socket.c:2162 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44a859
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
Kernel Offset: disabled
Rebooting in 86400 seconds..

Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TiejunChina pushed a commit that referenced this issue Jan 7, 2019
On RT we run into circular locking with pendingb_lock (workqueue),
port_lock_key (uart) and the primary_crng (random):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 -------------------------------------------------------
 irq/4-serial/512 is trying to acquire lock:
  ((pendingb_lock).lock){+.+...}, at: [<ffffffff8108d48d>] queue_work_on+0x5d/0x190

 but task is already holding lock:
  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&port_lock_key){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        serial8250_console_write+0x211/0x220
        univ8250_console_write+0x1c/0x20
        console_unlock+0x563/0x5c0
        vprintk_emit+0x277/0x320
        vprintk_default+0x1a/0x20
        vprintk_func+0x20/0x80
        printk+0x3e/0x46
        crng_fast_load+0xde/0xe0
        add_interrupt_randomness+0x16c/0x1a0
        irq_thread+0x15c/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 -> #2 (primary_crng.lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        _extract_crng+0x39/0xa0
        extract_crng+0x3a/0x40
        get_random_u32+0x120/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        __kmalloc+0x254/0x3a0
        pcpu_mem_zalloc+0x3a/0x70
        percpu_init_late+0x4f/0x8a
        start_kernel+0x1ec/0x3b8
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x13d/0x14c
        verify_cpu+0x0/0xfc

 -> #1 ((batched_entropy_u32_lock).lock){+.+...}:
        lock_acquire+0xac/0x240
        rt_spin_lock__no_mg+0x41/0x50
        get_random_u32+0x64/0x190
        new_slab+0x1d6/0x7c0
        ___slab_alloc+0x30b/0x6f0
        __slab_alloc.isra.78+0x6c/0xc0
        kmem_cache_alloc+0x26a/0x370
        __debug_object_init+0x325/0x460
        debug_object_activate+0x11c/0x1f0
        __queue_work+0x2c/0x770
        queue_work_on+0x12a/0x190
        serio_queue_event+0xd3/0x140
        __serio_register_port+0x17e/0x1a0
        i8042_probe+0x623/0x687
        platform_drv_probe+0x36/0x90
        driver_probe_device+0x1f8/0x2e0
        __driver_attach+0x96/0xa0
        bus_for_each_dev+0x5d/0x90
        driver_attach+0x19/0x20
        bus_add_driver+0x125/0x220
        driver_register+0x5b/0xd0
        __platform_driver_probe+0x5b/0x120
        __platform_create_bundle+0xaa/0xd0
        i8042_init+0x3f1/0x430
        do_one_initcall+0x3e/0x180
        kernel_init_freeable+0x212/0x295
        kernel_init+0x9/0x100
        ret_from_fork+0x31/0x40

 -> #0 ((pendingb_lock).lock){+.+...}:
        __lock_acquire+0x11b4/0x1320
        lock_acquire+0xac/0x240
        rt_spin_lock+0x46/0x50
        queue_work_on+0x5d/0x190
        tty_flip_buffer_push+0x26/0x30
        serial8250_rx_chars+0x120/0x1f0
        serial8250_handle_irq.part.27+0x58/0xb0
        serial8250_default_handle_irq+0x4b/0x60
        serial8250_interrupt+0x5f/0xd0
        irq_forced_thread_fn+0x1e/0x70
        irq_thread+0x137/0x1e0
        kthread+0x112/0x150
        ret_from_fork+0x31/0x40

 other info that might help us debug this:

 Chain exists of:
   (pendingb_lock).lock --> primary_crng.lock --> &port_lock_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&port_lock_key);
                                lock(primary_crng.lock);
                                lock(&port_lock_key);
   lock((pendingb_lock).lock);

  *** DEADLOCK ***

 2 locks held by irq/4-serial/512:
  #0:  (&i->lock){+.+...}, at: [<ffffffff815b0400>] serial8250_interrupt+0x30/0xd0
  #1:  (&port_lock_key){+.+...}, at: [<ffffffff815b4bb6>] serial8250_handle_irq.part.27+0x16/0xb0

 stack backtrace:
 CPU: 4 PID: 512 Comm: irq/4-serial Not tainted 4.11.3-rt0+ #101
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x86/0xc1
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x11b4/0x1320
  lock_acquire+0xac/0x240
  rt_spin_lock+0x46/0x50
  queue_work_on+0x5d/0x190
  tty_flip_buffer_push+0x26/0x30
  serial8250_rx_chars+0x120/0x1f0
  serial8250_handle_irq.part.27+0x58/0xb0
  serial8250_default_handle_irq+0x4b/0x60
  serial8250_interrupt+0x5f/0xd0
  irq_forced_thread_fn+0x1e/0x70
  irq_thread+0x137/0x1e0
  kthread+0x112/0x150
  ret_from_fork+0x31/0x40

It should work if we delay that printk after dropping the lock but we
also could skip it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
popcornmix pushed a commit that referenced this issue Apr 23, 2019
[ Upstream commit ea01007 ]

redundant copy_from_user in rds_sendmsg system call expose rds
to issue where rds_rdma_extra_size walk the rds iovec and and
calculate the number pf pages (sgs) it need to add to the tail of
rds message and later rds_cmsg_rdma_args copy the rds iovec again
and re calculate the same number and get different result causing
WARN_ON in rds_message_alloc_sgs.

fix this by doing the copy_from_user only once per rds_sendmsg
system call.

When issue occur the below dump is seen:

WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 panic+0x2ad/0x55c kernel/panic.c:188
 __warn.cold.8+0x20/0x45 kernel/panic.c:540
 report_bug+0x254/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
 rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
 rds_cmsg_send net/rds/send.c:971 [inline]
 rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
 __sys_sendmsg+0x11d/0x280 net/socket.c:2155
 __do_sys_sendmsg net/socket.c:2164 [inline]
 __se_sys_sendmsg net/socket.c:2162 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44a859
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
Kernel Offset: disabled
Rebooting in 86400 seconds..

Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue Aug 28, 2019
[ Upstream commit 32d3182 ]

packet_sendmsg() checks tx_ring.pg_vec to decide
if it must call tpacket_snd().

Problem is that the check is lockless, meaning another thread
can issue a concurrent setsockopt(PACKET_TX_RING ) to flip
tx_ring.pg_vec back to NULL.

Given that tpacket_snd() grabs pg_vec_lock mutex, we can
perform the check again to solve the race.

syzbot reported :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 11429 Comm: syz-executor394 Not tainted 5.3.0-rc4+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:packet_lookup_frame+0x8d/0x270 net/packet/af_packet.c:474
Code: c1 ee 03 f7 73 0c 80 3c 0e 00 0f 85 cb 01 00 00 48 8b 0b 89 c0 4c 8d 24 c1 48 b8 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 <80> 3c 01 00 0f 85 94 01 00 00 48 8d 7b 10 4d 8b 3c 24 48 b8 00 00
RSP: 0018:ffff88809f82f7b8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8880a45c7030 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffff110148b8e06 RDI: ffff8880a45c703c
RBP: ffff88809f82f7e8 R08: ffff888087aea200 R09: fffffbfff134ae50
R10: fffffbfff134ae4f R11: ffffffff89a5727f R12: 0000000000000000
R13: 0000000000000001 R14: ffff8880a45c6ac0 R15: 0000000000000000
FS:  00007fa04716f700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa04716edb8 CR3: 0000000091eb4000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 packet_current_frame net/packet/af_packet.c:487 [inline]
 tpacket_snd net/packet/af_packet.c:2667 [inline]
 packet_sendmsg+0x590/0x6250 net/packet/af_packet.c:2975
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 69e3c75 ("net: TX_RING and packet mmap")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Aug 28, 2019
packet_sendmsg() checks tx_ring.pg_vec to decide
if it must call tpacket_snd().

Problem is that the check is lockless, meaning another thread
can issue a concurrent setsockopt(PACKET_TX_RING ) to flip
tx_ring.pg_vec back to NULL.

Given that tpacket_snd() grabs pg_vec_lock mutex, we can
perform the check again to solve the race.

syzbot reported :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 11429 Comm: syz-executor394 Not tainted 5.3.0-rc4+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:packet_lookup_frame+0x8d/0x270 net/packet/af_packet.c:474
Code: c1 ee 03 f7 73 0c 80 3c 0e 00 0f 85 cb 01 00 00 48 8b 0b 89 c0 4c 8d 24 c1 48 b8 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 <80> 3c 01 00 0f 85 94 01 00 00 48 8d 7b 10 4d 8b 3c 24 48 b8 00 00
RSP: 0018:ffff88809f82f7b8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8880a45c7030 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffff110148b8e06 RDI: ffff8880a45c703c
RBP: ffff88809f82f7e8 R08: ffff888087aea200 R09: fffffbfff134ae50
R10: fffffbfff134ae4f R11: ffffffff89a5727f R12: 0000000000000000
R13: 0000000000000001 R14: ffff8880a45c6ac0 R15: 0000000000000000
FS:  00007fa04716f700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa04716edb8 CR3: 0000000091eb4000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 packet_current_frame net/packet/af_packet.c:487 [inline]
 tpacket_snd net/packet/af_packet.c:2667 [inline]
 packet_sendmsg+0x590/0x6250 net/packet/af_packet.c:2975
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 69e3c75 ("net: TX_RING and packet mmap")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
popcornmix pushed a commit that referenced this issue Aug 28, 2019
[ Upstream commit 32d3182 ]

packet_sendmsg() checks tx_ring.pg_vec to decide
if it must call tpacket_snd().

Problem is that the check is lockless, meaning another thread
can issue a concurrent setsockopt(PACKET_TX_RING ) to flip
tx_ring.pg_vec back to NULL.

Given that tpacket_snd() grabs pg_vec_lock mutex, we can
perform the check again to solve the race.

syzbot reported :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 11429 Comm: syz-executor394 Not tainted 5.3.0-rc4+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:packet_lookup_frame+0x8d/0x270 net/packet/af_packet.c:474
Code: c1 ee 03 f7 73 0c 80 3c 0e 00 0f 85 cb 01 00 00 48 8b 0b 89 c0 4c 8d 24 c1 48 b8 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 <80> 3c 01 00 0f 85 94 01 00 00 48 8d 7b 10 4d 8b 3c 24 48 b8 00 00
RSP: 0018:ffff88809f82f7b8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8880a45c7030 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffff110148b8e06 RDI: ffff8880a45c703c
RBP: ffff88809f82f7e8 R08: ffff888087aea200 R09: fffffbfff134ae50
R10: fffffbfff134ae4f R11: ffffffff89a5727f R12: 0000000000000000
R13: 0000000000000001 R14: ffff8880a45c6ac0 R15: 0000000000000000
FS:  00007fa04716f700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa04716edb8 CR3: 0000000091eb4000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 packet_current_frame net/packet/af_packet.c:487 [inline]
 tpacket_snd net/packet/af_packet.c:2667 [inline]
 packet_sendmsg+0x590/0x6250 net/packet/af_packet.c:2975
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 69e3c75 ("net: TX_RING and packet mmap")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
margro pushed a commit to margro/linux that referenced this issue Jan 5, 2020
[ Upstream commit 32d3182 ]

packet_sendmsg() checks tx_ring.pg_vec to decide
if it must call tpacket_snd().

Problem is that the check is lockless, meaning another thread
can issue a concurrent setsockopt(PACKET_TX_RING ) to flip
tx_ring.pg_vec back to NULL.

Given that tpacket_snd() grabs pg_vec_lock mutex, we can
perform the check again to solve the race.

syzbot reported :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [raspberrypi#1] PREEMPT SMP KASAN
CPU: 1 PID: 11429 Comm: syz-executor394 Not tainted 5.3.0-rc4+ raspberrypi#101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:packet_lookup_frame+0x8d/0x270 net/packet/af_packet.c:474
Code: c1 ee 03 f7 73 0c 80 3c 0e 00 0f 85 cb 01 00 00 48 8b 0b 89 c0 4c 8d 24 c1 48 b8 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 <80> 3c 01 00 0f 85 94 01 00 00 48 8d 7b 10 4d 8b 3c 24 48 b8 00 00
RSP: 0018:ffff88809f82f7b8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8880a45c7030 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffff110148b8e06 RDI: ffff8880a45c703c
RBP: ffff88809f82f7e8 R08: ffff888087aea200 R09: fffffbfff134ae50
R10: fffffbfff134ae4f R11: ffffffff89a5727f R12: 0000000000000000
R13: 0000000000000001 R14: ffff8880a45c6ac0 R15: 0000000000000000
FS:  00007fa04716f700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa04716edb8 CR3: 0000000091eb4000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 packet_current_frame net/packet/af_packet.c:487 [inline]
 tpacket_snd net/packet/af_packet.c:2667 [inline]
 packet_sendmsg+0x590/0x6250 net/packet/af_packet.c:2975
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 69e3c75 ("net: TX_RING and packet mmap")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
commit 9ac26e9 upstream.

With current ALU32 subreg handling and retval refine fix from last
patches we see an expected failure in test_verifier. With verbose
verifier state being printed at each step for clarity we have the
following relavent lines [I omit register states that are not
necessarily useful to see failure cause],

#101/p bpf_get_stack return R0 within range FAIL
Failed to load prog 'Success'!
[..]
14: (85) call bpf_get_stack#67
 R0_w=map_value(id=0,off=0,ks=8,vs=48,imm=0)
 R3_w=inv48
15:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
15: (b7) r1 = 0
16:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
16: (bf) r8 = r0
17:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
17: (67) r8 <<= 32
18:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smax_value=9223372032559808512,
               umax_value=18446744069414584320,
               var_off=(0x0; 0xffffffff00000000),
               s32_min_value=0,
               s32_max_value=0,
               u32_max_value=0,
               var32_off=(0x0; 0x0))
18: (c7) r8 s>>= 32
19
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=2147483647,
               var32_off=(0x0; 0xffffffff))
19: (cd) if r1 s< r8 goto pc+16
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=0,
               var32_off=(0x0; 0xffffffff))
20:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=0,
 R9=inv48
20: (1f) r9 -= r8
21: (bf) r2 = r7
22:
 R2_w=map_value(id=0,off=0,ks=8,vs=48,imm=0)
22: (0f) r2 += r8
value -2147483648 makes map_value pointer be out of bounds

After call bpf_get_stack() on line 14 and some moves we have at line 16
an r8 bound with max_value 48 but an unknown min value. This is to be
expected bpf_get_stack call can only return a max of the input size but
is free to return any negative error in the 32-bit register space. The
C helper is returning an int so will use lower 32-bits.

Lines 17 and 18 clear the top 32 bits with a left/right shift but use
ARSH so we still have worst case min bound before line 19 of -2147483648.
At this point the signed check 'r1 s< r8' meant to protect the addition
on line 22 where dst reg is a map_value pointer may very well return
true with a large negative number. Then the final line 22 will detect
this as an invalid operation and fail the program. What we want to do
is proceed only if r8 is positive non-error. So change 'r1 s< r8' to
'r1 s> r8' so that we jump if r8 is negative.

Next we will throw an error because we access past the end of the map
value. The map value size is 48 and sizeof(struct test_val) is 48 so
we walk off the end of the map value on the second call to
get bpf_get_stack(). Fix this by changing sizeof(struct test_val) to
24 by using 'sizeof(struct test_val) / 2'. After this everything passes
as expected.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/158560426019.10843.3285429543232025187.stgit@john-Precision-5820-Tower
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
[ no upstream commit ]

Switch the comparison, so that is_branch_taken() will recognize that below
branch is never taken:

  [...]
  17: [...] R1_w=inv0 [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  17: (67) r8 <<= 32
  18: [...] R8_w=inv(id=0,smax_value=-4294967296,umin_value=9223372036854775808,umax_value=18446744069414584320,var_off=(0x8000000000000000; 0x7fffffff00000000)) [...]
  18: (c7) r8 s>>= 32
  19: [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  19: (6d) if r1 s> r8 goto pc+16
  [...] R1_w=inv0 [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  [...]

Currently we check for is_branch_taken() only if either K is source, or source
is a scalar value that is const. For upstream it would be good to extend this
properly to check whether dst is const and src not.

For the sake of the test_verifier, it is probably not needed here:

  # ./test_verifier 101
  #101/p bpf_get_stack return R0 within range OK
  Summary: 1 PASSED, 0 SKIPPED, 0 FAILED

I haven't seen this issue in test_progs* though, they are passing fine:

  # ./test_progs-no_alu32 -t get_stack
  Switching to flavor 'no_alu32' subdirectory...
  #20 get_stack_raw_tp:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t get_stack
  #20 get_stack_raw_tp:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
commit 9ac26e9 upstream.

With current ALU32 subreg handling and retval refine fix from last
patches we see an expected failure in test_verifier. With verbose
verifier state being printed at each step for clarity we have the
following relavent lines [I omit register states that are not
necessarily useful to see failure cause],

#101/p bpf_get_stack return R0 within range FAIL
Failed to load prog 'Success'!
[..]
14: (85) call bpf_get_stack#67
 R0_w=map_value(id=0,off=0,ks=8,vs=48,imm=0)
 R3_w=inv48
15:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
15: (b7) r1 = 0
16:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
16: (bf) r8 = r0
17:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
17: (67) r8 <<= 32
18:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smax_value=9223372032559808512,
               umax_value=18446744069414584320,
               var_off=(0x0; 0xffffffff00000000),
               s32_min_value=0,
               s32_max_value=0,
               u32_max_value=0,
               var32_off=(0x0; 0x0))
18: (c7) r8 s>>= 32
19
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=2147483647,
               var32_off=(0x0; 0xffffffff))
19: (cd) if r1 s< r8 goto pc+16
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=0,
               var32_off=(0x0; 0xffffffff))
20:
 R0=inv(id=0,smax_value=48,var32_off=(0x0; 0xffffffff))
 R1_w=inv0
 R8_w=inv(id=0,smin_value=-2147483648,
               smax_value=0,
 R9=inv48
20: (1f) r9 -= r8
21: (bf) r2 = r7
22:
 R2_w=map_value(id=0,off=0,ks=8,vs=48,imm=0)
22: (0f) r2 += r8
value -2147483648 makes map_value pointer be out of bounds

After call bpf_get_stack() on line 14 and some moves we have at line 16
an r8 bound with max_value 48 but an unknown min value. This is to be
expected bpf_get_stack call can only return a max of the input size but
is free to return any negative error in the 32-bit register space. The
C helper is returning an int so will use lower 32-bits.

Lines 17 and 18 clear the top 32 bits with a left/right shift but use
ARSH so we still have worst case min bound before line 19 of -2147483648.
At this point the signed check 'r1 s< r8' meant to protect the addition
on line 22 where dst reg is a map_value pointer may very well return
true with a large negative number. Then the final line 22 will detect
this as an invalid operation and fail the program. What we want to do
is proceed only if r8 is positive non-error. So change 'r1 s< r8' to
'r1 s> r8' so that we jump if r8 is negative.

Next we will throw an error because we access past the end of the map
value. The map value size is 48 and sizeof(struct test_val) is 48 so
we walk off the end of the map value on the second call to
get bpf_get_stack(). Fix this by changing sizeof(struct test_val) to
24 by using 'sizeof(struct test_val) / 2'. After this everything passes
as expected.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/158560426019.10843.3285429543232025187.stgit@john-Precision-5820-Tower
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 27, 2020
[ no upstream commit ]

Switch the comparison, so that is_branch_taken() will recognize that below
branch is never taken:

  [...]
  17: [...] R1_w=inv0 [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  17: (67) r8 <<= 32
  18: [...] R8_w=inv(id=0,smax_value=-4294967296,umin_value=9223372036854775808,umax_value=18446744069414584320,var_off=(0x8000000000000000; 0x7fffffff00000000)) [...]
  18: (c7) r8 s>>= 32
  19: [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  19: (6d) if r1 s> r8 goto pc+16
  [...] R1_w=inv0 [...] R8_w=inv(id=0,smin_value=-2147483648,smax_value=-1,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) [...]
  [...]

Currently we check for is_branch_taken() only if either K is source, or source
is a scalar value that is const. For upstream it would be good to extend this
properly to check whether dst is const and src not.

For the sake of the test_verifier, it is probably not needed here:

  # ./test_verifier 101
  #101/p bpf_get_stack return R0 within range OK
  Summary: 1 PASSED, 0 SKIPPED, 0 FAILED

I haven't seen this issue in test_progs* though, they are passing fine:

  # ./test_progs-no_alu32 -t get_stack
  Switching to flavor 'no_alu32' subdirectory...
  #20 get_stack_raw_tp:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t get_stack
  #20 get_stack_raw_tp:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Oct 29, 2020
[ Upstream commit ae284d8 ]

syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, unmounting an
f2fs filesystem could result in the following splat:

  kobject: 'loop5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 250)
  kobject: 'f2fs_xattr_entry-7:5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 750)
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98
  WARNING: CPU: 0 PID: 699 at lib/debugobjects.c:485 debug_print_object+0x180/0x240
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 699 Comm: syz-executor.5 Tainted: G S                5.9.0-rc8+ #101
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   panic+0x360/0x7a0
   __warn+0x244/0x2ec
   report_bug+0x240/0x398
   bug_handler+0x50/0xc0
   call_break_hook+0x160/0x1d8
   brk_handler+0x30/0xc0
   do_debug_exception+0x184/0x340
   el1_dbg+0x48/0xb0
   el1_sync_handler+0x170/0x1c8
   el1_sync+0x80/0x100
   debug_print_object+0x180/0x240
   debug_check_no_obj_freed+0x200/0x430
   slab_free_freelist_hook+0x190/0x210
   kfree+0x13c/0x460
   f2fs_put_super+0x624/0xa58
   generic_shutdown_super+0x120/0x300
   kill_block_super+0x94/0xf8
   kill_f2fs_super+0x244/0x308
   deactivate_locked_super+0x104/0x150
   deactivate_super+0x118/0x148
   cleanup_mnt+0x27c/0x3c0
   __cleanup_mnt+0x28/0x38
   task_work_run+0x10c/0x248
   do_notify_resume+0x9d4/0x1188
   work_pending+0x8/0x34c

Like the error handling for f2fs_register_sysfs(), we need to wait for
the kobject to be destroyed before returning to prevent a potential
use-after-free.

Fixes: bf9e697 ("f2fs: expose features to sysfs entry")
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue Oct 29, 2020
[ Upstream commit ae284d8 ]

syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, unmounting an
f2fs filesystem could result in the following splat:

  kobject: 'loop5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 250)
  kobject: 'f2fs_xattr_entry-7:5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 750)
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98
  WARNING: CPU: 0 PID: 699 at lib/debugobjects.c:485 debug_print_object+0x180/0x240
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 699 Comm: syz-executor.5 Tainted: G S                5.9.0-rc8+ #101
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   panic+0x360/0x7a0
   __warn+0x244/0x2ec
   report_bug+0x240/0x398
   bug_handler+0x50/0xc0
   call_break_hook+0x160/0x1d8
   brk_handler+0x30/0xc0
   do_debug_exception+0x184/0x340
   el1_dbg+0x48/0xb0
   el1_sync_handler+0x170/0x1c8
   el1_sync+0x80/0x100
   debug_print_object+0x180/0x240
   debug_check_no_obj_freed+0x200/0x430
   slab_free_freelist_hook+0x190/0x210
   kfree+0x13c/0x460
   f2fs_put_super+0x624/0xa58
   generic_shutdown_super+0x120/0x300
   kill_block_super+0x94/0xf8
   kill_f2fs_super+0x244/0x308
   deactivate_locked_super+0x104/0x150
   deactivate_super+0x118/0x148
   cleanup_mnt+0x27c/0x3c0
   __cleanup_mnt+0x28/0x38
   task_work_run+0x10c/0x248
   do_notify_resume+0x9d4/0x1188
   work_pending+0x8/0x34c

Like the error handling for f2fs_register_sysfs(), we need to wait for
the kobject to be destroyed before returning to prevent a potential
use-after-free.

Fixes: bf9e697 ("f2fs: expose features to sysfs entry")
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue Nov 4, 2020
[ Upstream commit ae284d8 ]

syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, unmounting an
f2fs filesystem could result in the following splat:

  kobject: 'loop5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 250)
  kobject: 'f2fs_xattr_entry-7:5' ((____ptrval____)): kobject_release, parent 0000000000000000 (delayed 750)
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98
  WARNING: CPU: 0 PID: 699 at lib/debugobjects.c:485 debug_print_object+0x180/0x240
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 699 Comm: syz-executor.5 Tainted: G S                5.9.0-rc8+ #101
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   panic+0x360/0x7a0
   __warn+0x244/0x2ec
   report_bug+0x240/0x398
   bug_handler+0x50/0xc0
   call_break_hook+0x160/0x1d8
   brk_handler+0x30/0xc0
   do_debug_exception+0x184/0x340
   el1_dbg+0x48/0xb0
   el1_sync_handler+0x170/0x1c8
   el1_sync+0x80/0x100
   debug_print_object+0x180/0x240
   debug_check_no_obj_freed+0x200/0x430
   slab_free_freelist_hook+0x190/0x210
   kfree+0x13c/0x460
   f2fs_put_super+0x624/0xa58
   generic_shutdown_super+0x120/0x300
   kill_block_super+0x94/0xf8
   kill_f2fs_super+0x244/0x308
   deactivate_locked_super+0x104/0x150
   deactivate_super+0x118/0x148
   cleanup_mnt+0x27c/0x3c0
   __cleanup_mnt+0x28/0x38
   task_work_run+0x10c/0x248
   do_notify_resume+0x9d4/0x1188
   work_pending+0x8/0x34c

Like the error handling for f2fs_register_sysfs(), we need to wait for
the kobject to be destroyed before returning to prevent a potential
use-after-free.

Fixes: bf9e697 ("f2fs: expose features to sysfs entry")
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
sigmaris pushed a commit to sigmaris/linux that referenced this issue Nov 7, 2020
[ Upstream commit c2a04b0 ]

syzkaller found the following splat with CONFIG_DEBUG_KOBJECT_RELEASE=y:

  Read of size 1 at addr ffff000028e896b8 by task kworker/1:2/228

  CPU: 1 PID: 228 Comm: kworker/1:2 Tainted: G S                5.9.0-rc8+ raspberrypi#101
  Hardware name: linux,dummy-virt (DT)
  Workqueue: events kobject_delayed_cleanup
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   print_address_description.constprop.0+0x5c/0x550
   kasan_report+0x13c/0x1c0
   __asan_report_load1_noabort+0x34/0x60
   memcmp+0xd0/0xd8
   gfs2_uevent+0xc4/0x188
   kobject_uevent_env+0x54c/0x1240
   kobject_uevent+0x2c/0x40
   __kobject_del+0x190/0x1d8
   kobject_delayed_cleanup+0x2bc/0x3b8
   process_one_work+0x96c/0x18c0
   worker_thread+0x3f0/0xc30
   kthread+0x390/0x498
   ret_from_fork+0x10/0x18

  Allocated by task 1110:
   kasan_save_stack+0x28/0x58
   __kasan_kmalloc.isra.0+0xc8/0xe8
   kasan_kmalloc+0x10/0x20
   kmem_cache_alloc_trace+0x1d8/0x2f0
   alloc_super+0x64/0x8c0
   sget_fc+0x110/0x620
   get_tree_bdev+0x190/0x648
   gfs2_get_tree+0x50/0x228
   vfs_get_tree+0x84/0x2e8
   path_mount+0x1134/0x1da8
   do_mount+0x124/0x138
   __arm64_sys_mount+0x164/0x238
   el0_svc_common.constprop.0+0x15c/0x598
   do_el0_svc+0x60/0x150
   el0_svc+0x34/0xb0
   el0_sync_handler+0xc8/0x5b4
   el0_sync+0x15c/0x180

  Freed by task 228:
   kasan_save_stack+0x28/0x58
   kasan_set_track+0x28/0x40
   kasan_set_free_info+0x24/0x48
   __kasan_slab_free+0x118/0x190
   kasan_slab_free+0x14/0x20
   slab_free_freelist_hook+0x6c/0x210
   kfree+0x13c/0x460

Use the same pattern as f2fs + ext4 where the kobject destruction must
complete before allowing the FS itself to be freed.  This means that we
need an explicit free_sbd in the callers.

Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
[Also go to fail_free when init_names fails.]
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue Nov 9, 2020
[ Upstream commit c2a04b0 ]

syzkaller found the following splat with CONFIG_DEBUG_KOBJECT_RELEASE=y:

  Read of size 1 at addr ffff000028e896b8 by task kworker/1:2/228

  CPU: 1 PID: 228 Comm: kworker/1:2 Tainted: G S                5.9.0-rc8+ #101
  Hardware name: linux,dummy-virt (DT)
  Workqueue: events kobject_delayed_cleanup
  Call trace:
   dump_backtrace+0x0/0x4d8
   show_stack+0x34/0x48
   dump_stack+0x174/0x1f8
   print_address_description.constprop.0+0x5c/0x550
   kasan_report+0x13c/0x1c0
   __asan_report_load1_noabort+0x34/0x60
   memcmp+0xd0/0xd8
   gfs2_uevent+0xc4/0x188
   kobject_uevent_env+0x54c/0x1240
   kobject_uevent+0x2c/0x40
   __kobject_del+0x190/0x1d8
   kobject_delayed_cleanup+0x2bc/0x3b8
   process_one_work+0x96c/0x18c0
   worker_thread+0x3f0/0xc30
   kthread+0x390/0x498
   ret_from_fork+0x10/0x18

  Allocated by task 1110:
   kasan_save_stack+0x28/0x58
   __kasan_kmalloc.isra.0+0xc8/0xe8
   kasan_kmalloc+0x10/0x20
   kmem_cache_alloc_trace+0x1d8/0x2f0
   alloc_super+0x64/0x8c0
   sget_fc+0x110/0x620
   get_tree_bdev+0x190/0x648
   gfs2_get_tree+0x50/0x228
   vfs_get_tree+0x84/0x2e8
   path_mount+0x1134/0x1da8
   do_mount+0x124/0x138
   __arm64_sys_mount+0x164/0x238
   el0_svc_common.constprop.0+0x15c/0x598
   do_el0_svc+0x60/0x150
   el0_svc+0x34/0xb0
   el0_sync_handler+0xc8/0x5b4
   el0_sync+0x15c/0x180

  Freed by task 228:
   kasan_save_stack+0x28/0x58
   kasan_set_track+0x28/0x40
   kasan_set_free_info+0x24/0x48
   __kasan_slab_free+0x118/0x190
   kasan_slab_free+0x14/0x20
   slab_free_freelist_hook+0x6c/0x210
   kfree+0x13c/0x460

Use the same pattern as f2fs + ext4 where the kobject destruction must
complete before allowing the FS itself to be freed.  This means that we
need an explicit free_sbd in the callers.

Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
[Also go to fail_free when init_names fails.]
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Apr 19, 2022
Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there are outstanding extents left.

In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the COW case, which releases fewer outstanding extents than expected.

Fix it by calling btrfs_delalloc_release_extents() for the original length.

To reproduce the warning, the filesystem should be 1 GiB.  It's
triggering a short-write, due to not being able to allocate a large
extent and instead allocating a smaller one.

  WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  Modules linked in: btrfs blake2b_generic xor lzo_compress
  lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
  zsmalloc
  CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ raspberrypi#101
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
  RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
  RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
  RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
  RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
  R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
  R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
  FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   destroy_inode+0x33/0x70
   dispose_list+0x43/0x60
   evict_inodes+0x161/0x1b0
   generic_shutdown_super+0x2d/0x110
   kill_anon_super+0xf/0x20
   btrfs_kill_super+0xd/0x20 [btrfs]
   deactivate_locked_super+0x27/0x90
   cleanup_mnt+0x12c/0x180
   task_work_run+0x54/0x80
   exit_to_user_mode_prepare+0x152/0x160
   syscall_exit_to_user_mode+0x12/0x30
   do_syscall_64+0x42/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   RIP: 0033:0x7f854a000fb7

Fixes: f0bfa76 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Apr 20, 2022
commit 6d82ad1 upstream.

Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there are outstanding extents left.

In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the COW case, which releases fewer outstanding extents than expected.

Fix it by calling btrfs_delalloc_release_extents() for the original length.

To reproduce the warning, the filesystem should be 1 GiB.  It's
triggering a short-write, due to not being able to allocate a large
extent and instead allocating a smaller one.

  WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  Modules linked in: btrfs blake2b_generic xor lzo_compress
  lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
  zsmalloc
  CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ raspberrypi#101
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
  RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
  RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
  RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
  RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
  R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
  R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
  FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   destroy_inode+0x33/0x70
   dispose_list+0x43/0x60
   evict_inodes+0x161/0x1b0
   generic_shutdown_super+0x2d/0x110
   kill_anon_super+0xf/0x20
   btrfs_kill_super+0xd/0x20 [btrfs]
   deactivate_locked_super+0x27/0x90
   cleanup_mnt+0x12c/0x180
   task_work_run+0x54/0x80
   exit_to_user_mode_prepare+0x152/0x160
   syscall_exit_to_user_mode+0x12/0x30
   do_syscall_64+0x42/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   RIP: 0033:0x7f854a000fb7

Fixes: f0bfa76 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 26, 2022
commit 6d82ad1 upstream.

Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there are outstanding extents left.

In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the COW case, which releases fewer outstanding extents than expected.

Fix it by calling btrfs_delalloc_release_extents() for the original length.

To reproduce the warning, the filesystem should be 1 GiB.  It's
triggering a short-write, due to not being able to allocate a large
extent and instead allocating a smaller one.

  WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  Modules linked in: btrfs blake2b_generic xor lzo_compress
  lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
  zsmalloc
  CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
  RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
  RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
  RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
  RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
  R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
  R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
  FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   destroy_inode+0x33/0x70
   dispose_list+0x43/0x60
   evict_inodes+0x161/0x1b0
   generic_shutdown_super+0x2d/0x110
   kill_anon_super+0xf/0x20
   btrfs_kill_super+0xd/0x20 [btrfs]
   deactivate_locked_super+0x27/0x90
   cleanup_mnt+0x12c/0x180
   task_work_run+0x54/0x80
   exit_to_user_mode_prepare+0x152/0x160
   syscall_exit_to_user_mode+0x12/0x30
   do_syscall_64+0x42/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   RIP: 0033:0x7f854a000fb7

Fixes: f0bfa76 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xukuohai pushed a commit to xukuohai/linux-raspberry-pi that referenced this issue May 9, 2022
Add bpf trampoline support for arm64. Most of the logic is the same as
x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl  <bpf_trampoline>

Tested on qemu, result:
 raspberrypi#18  bpf_tcp_ca:OK
 raspberrypi#51  dummy_st_ops:OK
 raspberrypi#55  fentry_fexit:OK
 raspberrypi#56  fentry_test:OK
 raspberrypi#57  fexit_bpf2bpf:OK
 raspberrypi#58  fexit_sleep:OK
 raspberrypi#59  fexit_stress:OK
 raspberrypi#60  fexit_test:OK
 raspberrypi#67  get_func_args_test:OK
 raspberrypi#68  get_func_ip_test:OK
 raspberrypi#101 modify_return:OK
 raspberrypi#233 xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
xukuohai pushed a commit to xukuohai/linux-raspberry-pi that referenced this issue May 12, 2022
Add bpf trampoline support for arm64. Most of the logic is the same as
x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl  <bpf_trampoline>

Tested on qemu, result:
 raspberrypi#18  bpf_tcp_ca:OK
 raspberrypi#51  dummy_st_ops:OK
 raspberrypi#55  fentry_fexit:OK
 raspberrypi#56  fentry_test:OK
 raspberrypi#57  fexit_bpf2bpf:OK
 raspberrypi#58  fexit_sleep:OK
 raspberrypi#59  fexit_stress:OK
 raspberrypi#60  fexit_test:OK
 raspberrypi#67  get_func_args_test:OK
 raspberrypi#68  get_func_ip_test:OK
 raspberrypi#101 modify_return:OK
 raspberrypi#233 xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
popcornmix pushed a commit that referenced this issue Nov 7, 2022
For leaf dir, In most cases, there should be as many bestfree slots
as the dir data blocks that can fit under i_size (except for [1]).

Root cause is we don't examin the number bestfree slots, when the slots
number less than dir data blocks, if we need to allocate new dir data
block and update the bestfree array, we will use the dir block number as
index to assign bestfree array, while we did not check the leaf buf
boundary which may cause UAF or other memory access problem. This issue
can also triggered with test cases xfs/473 from fstests.

According to Dave Chinner & Darrick's suggestion, adding buffer verifier
to detect this abnormal situation in time.
Simplify the testcase for fstest xfs/554 [1]

The error log is shown as follows:
==================================================================
BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0
Write of size 2 at addr ffff88810168b000 by task touch/1552
CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ #101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report.cold+0xf6/0x691
 kasan_report+0xa8/0x120
 xfs_dir2_leaf_addname+0x1995/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000
 </TASK>

The buggy address belongs to the physical page:
page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x10168b
flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000
raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78
XDD3[S5........x
XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file
fs/xfs/libxfs/xfs_dir2_data.c.  Caller
xfs_dir2_data_use_free+0x28a/0xeb0
CPU: 5 PID: 1552 Comm: touch Tainted: G    B              6.0.0-rc3+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 xfs_corruption_error+0x132/0x150
 xfs_dir2_data_use_free+0x198/0xeb0
 xfs_dir2_leaf_addname+0xa59/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000
 </TASK>
XFS (sdb): Corruption detected. Unmount and run xfs_repair

[1] https://lore.kernel.org/all/20220928095355.2074025-1-guoxuenan@huawei.com/
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
popcornmix pushed a commit that referenced this issue Jan 11, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 #101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Jan 11, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 #101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Jan 22, 2024
On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 #101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
0lxb pushed a commit to 0lxb/rpi_linux that referenced this issue Jan 30, 2024
sched_ext: fix race in scx_move_task() with exiting tasks
comeillfoo pushed a commit to comeillfoo/linux-rpi that referenced this issue Feb 3, 2024
commit 1036f69 upstream.

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
deferral of the vqmmc-supply regulator:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 raspberrypi#101
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __run_timers.part.0+0x1d0/0x1e8
     ra : __run_timers.part.0+0x134/0x1e8
    epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
     gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
     t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
     s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
     a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
     a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
     s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
     s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
     s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
     s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
     t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
    status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
    [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
    [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
    [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
    [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
    [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
    [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
    ---[ end trace 0000000000000000 ]---

What happens?

    renesas_sdhi_probe()
    {
    	tmio_mmc_host_alloc()
	    mmc_alloc_host()
		INIT_DELAYED_WORK(&host->detect, mmc_rescan);

	devm_request_irq(tmio_mmc_irq);

	/*
	 * After this, the interrupt handler may be invoked at any time
	 *
	 *  tmio_mmc_irq()
	 *  {
	 *	__tmio_mmc_card_detect_irq()
	 *	    mmc_detect_change()
	 *		_mmc_detect_change()
	 *		    mmc_schedule_delayed_work(&host->detect, delay);
	 *  }
	 */

	tmio_mmc_host_probe()
	    tmio_mmc_init_ocr()
		-EPROBE_DEFER

	tmio_mmc_host_free()
	    mmc_free_host()
    }

When expire_timers() runs later, it warns because the MMC host structure
containing the delayed work was freed, and now contains an invalid work
function pointer.

Fix this by cancelling any pending delayed work before releasing the
MMC host structure.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 16, 2024
[ Upstream commit 601429c ]

Why:
    The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
    caused system hang.
    [  557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
    [  557.373718] [drm] PCIE GART of 512M enabled.
    [  557.373722] [drm] PTB located at 0x0000031FED700000
    [  557.373788] [drm] VRAM is lost due to GPU reset!
    [  557.373789] [drm] PSP is resuming...
    [  557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
    [  557.547067] [drm] PCI error: detected callback, state(1)!!
    [  557.547069] [drm] No support for XGMI hive yet...
    [  557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
    [  557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
    [  557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
    [  557.610492] [drm] PCI error: slot reset callback!!
    ...
    [  560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
    [  560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
    [  560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G           OE     5.15.0-91-generic #101-Ubuntu
    [  560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
    [  560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
    [  560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
    [  560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
    [  560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
    [  560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
    [  560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
    [  560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
    [  560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
    [  560.803889] FS:  0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
    [  560.812973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
    [  560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [  560.843444] PKRU: 55555554
    [  560.846480] Call Trace:
    [  560.849225]  <TASK>
    [  560.851580]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.856488]  ? show_trace_log_lvl+0x1d6/0x2ea
    [  560.861379]  ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.867778]  ? show_regs.part.0+0x23/0x29
    [  560.872293]  ? __die_body.cold+0x8/0xd
    [  560.876502]  ? die_addr+0x3e/0x60
    [  560.880238]  ? exc_general_protection+0x1c5/0x410
    [  560.885532]  ? asm_exc_general_protection+0x27/0x30
    [  560.891025]  ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
    [  560.898323]  amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
    [  560.904520]  process_one_work+0x228/0x3d0
How:
    In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
    all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant