Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dwc_otg: more robust interrupt handling #330

Merged
merged 2 commits into from
Jul 16, 2013

Conversation

P33M
Copy link
Contributor

@P33M P33M commented Jul 13, 2013

This PR fixes issues such as the error message

ERROR::handle_hc_chhltd_intr_dma:2537: handle_hc_chhltd_intr_dma: Channel 7, DMA Mode -- ChHltd set, but reason for halting is unknown, hcint 0x00000002, intsts 0x04400001

popping up in the log since
db4fad7.
I would prefer a round of testing by people that have seen this issue before merging for general consumption, I will advertise as necessary in the relevant places.

P33M added 2 commits July 13, 2013 20:41
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
@P33M
Copy link
Contributor Author

P33M commented Jul 14, 2013

I can say that this patch definitely squashes the issues that I've seen at least - I can artificially create endpoints that are very lossy (via the world's worst 10-port hub and several pendrives) which completely breaks USB with plenty of the above error message, and will continuously reset/probe/error/reset devices (as it should do for dodgy connections) after the patch.

@popcornmix
Copy link
Collaborator

I'm testing now, and no problems so far. I'll keep you updated.

@popcornmix
Copy link
Collaborator

I just got this:

<4>[17013.296896] ERROR::handle_hc_chhltd_intr_dma:2574: handle_hc_chhltd_intr_dma: Channel 6, DMA1
<5>[17024.229527] nfs: server 10.177.71.57 not responding, still trying
<5>[17059.541002] nfs: server 10.177.71.57 not responding, still trying

http://pastebin.com/NNB78K9K

$ uname -a
Linux raspberrypi 3.6.11+ #492 PREEMPT Sun Jul 14 00:04:59 BST 2013 armv6l GNU/Linux

I'd booted from nfs and had xbmc running in the background, but idle.

I also get these failures without this PR.

@P33M
Copy link
Contributor Author

P33M commented Jul 15, 2013

The pastebin chopped off the registers in that error which is the important bit. Can you still get at the message log?

How frequent are these errors now vs before?
There's also a bluetooth dongle in there - can I get a lsusb -v for that. It may be a full-speed device which could result in interesting things for the same sort of error happening on split transactions.

@popcornmix
Copy link
Collaborator

No sign of "handle_hc_chhltd_intr_dma" in /var/log/*
I'll catch it next time.

I would say it used to occur every couple of hours (with work setup).
It's happened once today (although it's not been heavily used). No evidence that it is worse.

No bluetooth - this is what I had (and still have) connected:

$ lsusb -v

Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp.
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         2 TT per port
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0x9512
  bcdDevice            2.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      1 Single TT
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      2 TT per port
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            3.06
  iManufacturer           3
  iProduct                2
  iSerial                 1
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12

Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp.
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0
  bDeviceProtocol         1
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0xec00
  bcdDevice            2.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0
      bInterfaceProtocol    255
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               4

Bus 001 Device 004: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         2 TT per port
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0x2514 USB 2.0 Hub
  bcdDevice            0.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      1 Single TT
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      2 TT per port
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12

Bus 001 Device 005: ID 413c:2107 Dell Computer Corp.
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x413c Dell Computer Corp.
  idProduct          0x2107
  bcdDevice            5.02
  iManufacturer           1
  iProduct                2
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           34
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               70mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      1 Boot Interface Subclass
      bInterfaceProtocol      1 Keyboard
      iInterface              0
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.10
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      65
         Report Descriptors:
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval              10

@P33M
Copy link
Contributor Author

P33M commented Jul 15, 2013

The only alternative could be something similar happening with a split transaction.

I did specifically test this before and after with a FS device known to produce 2 data toggle errors on the first transaction after a reset, result was negative for this error before and after which led me to believe that split transactions weren't affected. Data toggle errors halt the channel in split transactions and they are now handled exactly like HS transaction errors, so I used this as an example repeatable test.

@P33M
Copy link
Contributor Author

P33M commented Jul 16, 2013

3 positives from forum users:

http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=49223&start=25
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=49902

I have now tested split transactions using the same tactic as I generated multiple errors for USB2.0 devices - I have not seen any FS/LS devices cause the error condition to appear.

@popcornmix
Copy link
Collaborator

I've not had the handle_hc_chhltd_intr_dma message repeat.
I believe I was seeing the issue more frequently without the patch.
I'm seeing no new problems.

I'm happy to accept this PR, and produce updated firmware for wider testing.
Agree?

@ghollingworth
Copy link

Yep,

It's clearly an improvement

Gordon

popcornmix added a commit that referenced this pull request Jul 16, 2013
dwc_otg: more robust interrupt handling
@popcornmix popcornmix merged commit 30500d9 into raspberrypi:rpi-3.6.y Jul 16, 2013
@popcornmix
Copy link
Collaborator

I caught it again:

<4>[  200.089014] ERROR::handle_hc_chhltd_intr_dma:2579: handle_hc_chhltd_intr_dma: Channel 4, DMA Mode -- ChHltd set, but reason for halting is unknown, hcint 0x00000002, intsts 0x04600021
<5>[  209.324850] nfs: server 192.168.4.41 not responding, still trying
<5>[  213.734972] nfs: server 192.168.4.41 not responding, still trying
<5>[  218.145112] nfs: server 192.168.4.41 not responding, still trying
<5>[  222.555243] nfs: server 192.168.4.41 not responding, still trying
<5>[  226.965369] nfs: server 192.168.4.41 not responding, still trying
<5>[  231.375507] nfs: server 192.168.4.41 not responding, still trying
<6>[  237.180411] SysRq : DEBUG

Running on latest "next" kernel:

Linux version 3.9.10+ (dc4@dc4-arm-01) (gcc version 4.7.2 20120731 (prerelease) (crosstool-NG linaro-1.13.1+bzr2458 - Linaro GCC 2012.08) ) #508 PREEMPT Fri Jul 19 20:55:21 BST 2013

@P33M
Copy link
Contributor Author

P33M commented Jul 20, 2013

Hm

I suspect that what's happening in this case is the delay between ack/xfercomp and chhltd is becoming large. I suspect that the dwc DMA controller is having a lot of work to do because of an NFS root which delays the halt - provoking the race between the FIQ and the IRQ.

Given that the bogus channel halt isn't really a disaster, I should probably add a retry mechanism similar to what I did for split data toggle errors...

popcornmix pushed a commit that referenced this pull request Jun 8, 2014
…iable declaration

Networking prefers this style, so warn when it's not used.

Networking uses:

    void foo(int bar)
    {
        int baz;

        code...
   }

not

    void foo(int bar)
    {
        int baz;
        code...
    }

There are a limited number of false positives when using macros to
declare variables like:

  WARNING: networking uses a blank line after declarations
  #330: FILE: net/ipv4/inet_hashtables.c:330:
  +	int dif = sk->sk_bound_dev_if;
  +	INET_ADDR_COOKIE(acookie, saddr, daddr)

Signed-off-by: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
chinasaur pushed a commit to chinasaur/firmware that referenced this pull request Aug 3, 2014
See: raspberrypi/linux#330

kernel: fix for Oops in dwc_otg_dump_spram.
See raspberrypi#21

firmware: fix, part 2, for audo sync when pcm decoder combines multiple packets.
See http://forum.xbmc.org/showthread.php?tid=140518&pid=1463506#pid1463506
giraldeau pushed a commit to giraldeau/linux that referenced this pull request Apr 12, 2016
With completion using swait and so rawlocks we don't need this anymore.
Further, bisect thinks this patch is responsible for:

|BUG: unable to handle kernel NULL pointer dereference at           (null)
|IP: [<ffffffff81082123>] sched_cpu_active+0x53/0x70
|PGD 0
|Oops: 0000 [raspberrypi#1] PREEMPT SMP DEBUG_PAGEALLOC
|Dumping ftrace buffer:
|   (ftrace buffer empty)
|Modules linked in:
|CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.1+ raspberrypi#330
|Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
|task: ffff88013ae64b00 ti: ffff88013ae74000 task.ti: ffff88013ae74000
|RIP: 0010:[<ffffffff81082123>]  [<ffffffff81082123>] sched_cpu_active+0x53/0x70
|RSP: 0000:ffff88013ae77eb8  EFLAGS: 00010082
|RAX: 0000000000000001 RBX: ffffffff81c2cf20 RCX: 0000001050fb52fb
|RDX: 0000001050fb52fb RSI: 000000105117ca1e RDI: 00000000001c7723
|RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
|R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffff
|R13: ffffffff81c2cee0 R14: 0000000000000000 R15: 0000000000000001
|FS:  0000000000000000(0000) GS:ffff88013b200000(0000) knlGS:0000000000000000
|CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
|CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000006e0
|Stack:
| ffffffff810c446d ffff88013ae77f00 ffffffff8107d8dd 000000000000000a
| 0000000000000001 0000000000000000 0000000000000000 0000000000000000
| 0000000000000000 ffff88013ae77f10 ffffffff8107d90e ffff88013ae77f20
|Call Trace:
| [<ffffffff810c446d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
| [<ffffffff8107d8dd>] ? notifier_call_chain+0x5d/0x80
| [<ffffffff8107d90e>] ? __raw_notifier_call_chain+0xe/0x10
| [<ffffffff810598a3>] ? cpu_notify+0x23/0x40
| [<ffffffff8105a7b8>] ? notify_cpu_starting+0x28/0x30

during hotplug. The rawlocks need to remain however.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
mlyle pushed a commit to d-ronin/linux that referenced this pull request Aug 21, 2016
With completion using swait and so rawlocks we don't need this anymore.
Further, bisect thinks this patch is responsible for:

|BUG: unable to handle kernel NULL pointer dereference at           (null)
|IP: [<ffffffff81082123>] sched_cpu_active+0x53/0x70
|PGD 0
|Oops: 0000 [raspberrypi#1] PREEMPT SMP DEBUG_PAGEALLOC
|Dumping ftrace buffer:
|   (ftrace buffer empty)
|Modules linked in:
|CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.1+ raspberrypi#330
|Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
|task: ffff88013ae64b00 ti: ffff88013ae74000 task.ti: ffff88013ae74000
|RIP: 0010:[<ffffffff81082123>]  [<ffffffff81082123>] sched_cpu_active+0x53/0x70
|RSP: 0000:ffff88013ae77eb8  EFLAGS: 00010082
|RAX: 0000000000000001 RBX: ffffffff81c2cf20 RCX: 0000001050fb52fb
|RDX: 0000001050fb52fb RSI: 000000105117ca1e RDI: 00000000001c7723
|RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
|R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffff
|R13: ffffffff81c2cee0 R14: 0000000000000000 R15: 0000000000000001
|FS:  0000000000000000(0000) GS:ffff88013b200000(0000) knlGS:0000000000000000
|CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
|CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000006e0
|Stack:
| ffffffff810c446d ffff88013ae77f00 ffffffff8107d8dd 000000000000000a
| 0000000000000001 0000000000000000 0000000000000000 0000000000000000
| 0000000000000000 ffff88013ae77f10 ffffffff8107d90e ffff88013ae77f20
|Call Trace:
| [<ffffffff810c446d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
| [<ffffffff8107d8dd>] ? notifier_call_chain+0x5d/0x80
| [<ffffffff8107d90e>] ? __raw_notifier_call_chain+0xe/0x10
| [<ffffffff810598a3>] ? cpu_notify+0x23/0x40
| [<ffffffff8105a7b8>] ? notify_cpu_starting+0x28/0x30

during hotplug. The rawlocks need to remain however.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
neuschaefer pushed a commit to neuschaefer/raspi-binary-firmware that referenced this pull request Feb 27, 2017
See: raspberrypi/linux#330

kernel: fix for Oops in dwc_otg_dump_spram.
See raspberrypi#21

firmware: fix, part 2, for audo sync when pcm decoder combines multiple packets.
See http://forum.xbmc.org/showthread.php?tid=140518&pid=1463506#pid1463506
anholt pushed a commit to anholt/linux that referenced this pull request Apr 8, 2017
Geminilake can output two pixels per clock, and that affects the maximum
scaling factor for its scalers. Take that into account and avoid the
following warning:

WARNING: CPU: 1 PID: 593 at drivers/gpu/drm/i915/intel_display.c:13223 skl_max_scale.part.129+0x78/0x80 [i915]
WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock)
Modules linked in: x86_pkg_temp_thermal i915 coretemp kvm_intel kvm i2c_algo_bit drm_kms_helper irqbypass crct10dif_pclmul prime_numbers crc32_pclmul drm ghash_clmulni_intel shpchp tpm_tis tpm_tis_core tpm nfsd authw
CPU: 1 PID: 593 Comm: kworker/u8:3 Tainted: G        W       4.10.0-rc8ander+ raspberrypi#330
Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0035.B33.1702150552 02/15/2017
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 dump_stack+0x86/0xc3
 __warn+0xcb/0xf0
 warn_slowpath_fmt+0x5f/0x80
 skl_max_scale.part.129+0x78/0x80 [i915]
 intel_check_primary_plane+0xa6/0xc0 [i915]
 intel_plane_atomic_check_with_state+0xd1/0x1a0 [i915]
 ? drm_printk+0xb5/0xc0 [drm]
 intel_plane_atomic_check+0x3d/0x80 [i915]
 drm_atomic_helper_check_planes+0x7c/0x200 [drm_kms_helper]
 intel_atomic_check+0xa5b/0x11a0 [i915]
 drm_atomic_check_only+0x353/0x600 [drm]
 ? drm_atomic_add_affected_connectors+0x10c/0x120 [drm]
 drm_atomic_commit+0x18/0x50 [drm]
 restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
 drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
 drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
 intel_fbdev_set_par+0x1a/0x70 [i915]
 fbcon_init+0x582/0x610
 visual_init+0xd6/0x130
 do_bind_con_driver+0x1da/0x3c0
 do_take_over_console+0x116/0x180
 do_fbcon_takeover+0x5c/0xb0
 fbcon_event_notify+0x772/0x8a0
 ? __blocking_notifier_call_chain+0x35/0x70
 notifier_call_chain+0x4a/0x70
 __blocking_notifier_call_chain+0x4d/0x70
 blocking_notifier_call_chain+0x16/0x20
 fb_notifier_call_chain+0x1b/0x20
 register_framebuffer+0x278/0x360
 drm_fb_helper_initial_config+0x253/0x440 [drm_kms_helper]
 intel_fbdev_initial_config+0x18/0x30 [i915]
 async_run_entry_fn+0x39/0x170
 process_one_work+0x212/0x670
 ? process_one_work+0x197/0x670
 worker_thread+0x4e/0x490
 kthread+0x101/0x140
 ? process_one_work+0x670/0x670
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x31/0x40

v2: s/max_pixclk/max_dotclk/ (Ville)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223071600.14356-3-ander.conselvan.de.oliveira@intel.com
popcornmix pushed a commit that referenced this pull request Oct 4, 2024
commit 44d1745 upstream.

Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
on x86 due to a chain of locks and SRCU synchronizations.  Translating the
below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                     lock(&vcpu->mutex);
3                                                     lock(&kvm->srcu);
4                            lock(cpu_hotplug_lock);
5                            lock(kvm_lock);
6                            lock(&kvm->slots_lock);
7                                                     lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same
pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
__kvmclock_cpufreq_notifier():

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()
              |
              -> cpufreq_freq_transition_begin()
                 |
                 -> cpufreq_notify_transition()
                    |
                    -> ... __kvmclock_cpufreq_notifier()

But, actually triggering such deadlocks is beyond rare due to the
combination of dependencies and timings involved.  E.g. the cpufreq
notifier is only used on older CPUs without a constant TSC, mucking with
the NX hugepage mitigation while VMs are running is very uncommon, and
doing so while also onlining/offlining a CPU (necessary to generate
contention on cpu_hotplug_lock) would be even more unusual.

The most robust solution to the general cpu_hotplug_lock issue is likely
to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq
notifier doesn't to take kvm_lock.  For now, settle for fixing the most
blatant deadlock, as switching to an RCU-protected list is a much more
involved change, but add a comment in locking.rst to call out that care
needs to be taken when walking holding kvm_lock and walking vm_list.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S         O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <chao.gao@intel.com>
Fixes: 0bf5049 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: stable@vger.kernel.org
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this pull request Oct 10, 2024
commit 44d1745 upstream.

Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
on x86 due to a chain of locks and SRCU synchronizations.  Translating the
below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                     lock(&vcpu->mutex);
3                                                     lock(&kvm->srcu);
4                            lock(cpu_hotplug_lock);
5                            lock(kvm_lock);
6                            lock(&kvm->slots_lock);
7                                                     lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same
pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
__kvmclock_cpufreq_notifier():

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()
              |
              -> cpufreq_freq_transition_begin()
                 |
                 -> cpufreq_notify_transition()
                    |
                    -> ... __kvmclock_cpufreq_notifier()

But, actually triggering such deadlocks is beyond rare due to the
combination of dependencies and timings involved.  E.g. the cpufreq
notifier is only used on older CPUs without a constant TSC, mucking with
the NX hugepage mitigation while VMs are running is very uncommon, and
doing so while also onlining/offlining a CPU (necessary to generate
contention on cpu_hotplug_lock) would be even more unusual.

The most robust solution to the general cpu_hotplug_lock issue is likely
to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq
notifier doesn't to take kvm_lock.  For now, settle for fixing the most
blatant deadlock, as switching to an RCU-protected list is a much more
involved change, but add a comment in locking.rst to call out that care
needs to be taken when walking holding kvm_lock and walking vm_list.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S         O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <chao.gao@intel.com>
Fixes: 0bf5049 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: stable@vger.kernel.org
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this pull request Oct 10, 2024
commit 44d1745 upstream.

Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
on x86 due to a chain of locks and SRCU synchronizations.  Translating the
below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                     lock(&vcpu->mutex);
3                                                     lock(&kvm->srcu);
4                            lock(cpu_hotplug_lock);
5                            lock(kvm_lock);
6                            lock(&kvm->slots_lock);
7                                                     lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same
pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
__kvmclock_cpufreq_notifier():

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()
              |
              -> cpufreq_freq_transition_begin()
                 |
                 -> cpufreq_notify_transition()
                    |
                    -> ... __kvmclock_cpufreq_notifier()

But, actually triggering such deadlocks is beyond rare due to the
combination of dependencies and timings involved.  E.g. the cpufreq
notifier is only used on older CPUs without a constant TSC, mucking with
the NX hugepage mitigation while VMs are running is very uncommon, and
doing so while also onlining/offlining a CPU (necessary to generate
contention on cpu_hotplug_lock) would be even more unusual.

The most robust solution to the general cpu_hotplug_lock issue is likely
to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq
notifier doesn't to take kvm_lock.  For now, settle for fixing the most
blatant deadlock, as switching to an RCU-protected list is a much more
involved change, but add a comment in locking.rst to call out that care
needs to be taken when walking holding kvm_lock and walking vm_list.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S         O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <chao.gao@intel.com>
Fixes: 0bf5049 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: stable@vger.kernel.org
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants