Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid CRC in USB packets on Pi Zero #2024

Closed
yoctopuce opened this issue May 19, 2017 · 27 comments
Closed

Invalid CRC in USB packets on Pi Zero #2024

yoctopuce opened this issue May 19, 2017 · 27 comments

Comments

@yoctopuce
Copy link

We have found a bug in the USB stack of the Raspberry PI image.

Using an USB analyser we have found that sometime the Raspberry Pi Zero send USB packets with Invalid CRC. The strange thing is that this issue affects only Raspberry Pi Zero with Raspbian image after March 2016. The same image work perfectly on a Raspberry Pi 2 or 3.

We have publish on GitHub a complete description of the problem and a program that exhibits this issue:
https://github.com/yoctopuce-examples/raspberry_pi_zero_bug

@JamesH65
Copy link
Contributor

@P33M One for you I think?

@yoctopuce Can you confirm March 2016 as the date it appears to start going wrong.

@JamesH65 JamesH65 added the Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team label May 19, 2017
@P33M
Copy link
Contributor

P33M commented May 19, 2017

There's two ways to interpret the statement. Has the Pi Zero ever been reliable on any version of firmware? If so, what was the last good firmware revision/distro?

@lategoodbye
Copy link
Contributor

What is the behavior on a Raspberry Pi 1?
Is it possible to reproduce the issue without a Yoctopuce device?

@P33M
Copy link
Contributor

P33M commented May 22, 2017

To narrow down if a specific firmware version broke functionality, can you test firmware revisions retrieved via rpi-update:

https://github.com/Hexxeh/rpi-firmware/commits/master

Running sudo rpi-update <hash> will get the firmware revision corresponding to the hash in the repo above. A rudimentary pseudo-bisect should get to the first broken revision relatively quickly.

@yoctopuce
Copy link
Author

Hi all,

Sorry to have took so long to answers, but we are quite busy these days.

@JamesH65 Official image of 18 March 2016 is the last one that *IS WORKGING". Wee log bellow:

pi@raspberrypi:~/raspberry_pi_zero_bug $ sudo ./bug_pizero
this is a simple program that exhibit a bug in the USB stack
of the Raspberry PI Zero and any Yoctopuce device
Use libUSB v1.0.19.10903
List all Yoctopuce devices present on this host:
 - USB dev: RELAYHI1-85D28 (24E0:D:0)
1 Yoctopuce devices are present
\ Test device RELAYHI1-85D28 (24E0:D)
+ Send USB pkt no 0
- rd_callback for RELAYHI1-85D28 : response 0 is valid (len=64)
+ Send USB pkt no 1
- rd_callback for RELAYHI1-85D28 : response 1 is valid (len=64)
+ Send USB pkt no 2
- rd_callback for RELAYHI1-85D28 : response 2 is valid (len=64)
+ Send USB pkt no 3
- rd_callback for RELAYHI1-85D28 : response 3 is valid (len=64)
+ Send USB pkt no 4
- rd_callback for RELAYHI1-85D28 : response 4 is valid (len=64)
+ Send USB pkt no 5
- rd_callback for RELAYHI1-85D28 : response 5 is valid (len=64)
+ Send USB pkt no 6
- rd_callback for RELAYHI1-85D28 : response 6 is valid (len=64)
+ Send USB pkt no 7
- rd_callback for RELAYHI1-85D28 : response 7 is valid (len=64)
+ Send USB pkt no 8
- rd_callback for RELAYHI1-85D28 : response 8 is valid (len=64)
+ Send USB pkt no 9
- rd_callback for RELAYHI1-85D28 : response 9 is valid (len=64)
+ Send USB pkt no 10
- rd_callback for RELAYHI1-85D28 : response 10 is valid (len=64)
+ Send USB pkt no 11
- rd_callback for RELAYHI1-85D28 : response 11 is valid (len=64)
+ Send USB pkt no 12
- rd_callback for RELAYHI1-85D28 : response 12 is valid (len=64)
+ Send USB pkt no 13
- rd_callback for RELAYHI1-85D28 : response 13 is valid (len=64)
+ Send USB pkt no 14
- rd_callback for RELAYHI1-85D28 : response 14 is valid (len=64)
= result: 15 pkt sent vs 15 pkt received
Cancel and free all pending transactions
pi@raspberrypi:~/raspberry_pi_zero_bug $ uname -a
Linux raspberrypi 4.1.19+ #858 Tue Mar 15 15:52:03 GMT 2016 armv6l GNU/Linux

@lategoodbye: I will double check but the Raspberry Pi 1 does not have the same issue.
I've not been able to reproduce this issue without Yoctopuce devices. But if you need to do
some test we can send you an device or we can setup an remote access to our test bench.

I'm still running the remaining tests and should have all the result this evening.

@lategoodbye
Copy link
Contributor

@yoctopuce Thanks for for the offer, but i think it's to soon to send me test devices. Btw i'm not member of the Pi Foundation.

According to the output the good release was on March 15th not 18th. Could you please narrow down to the very last good firmware version as P33M suggested and also name the first bad firmware version?
Can you estimate in how many cases the CRC is invalid in case of a Raspberry Pi Zero?
Are you able to provide some invalid CRC values and their corresponding valid ones?
Did you try to use the dwc2 USB driver instead of dwc_otg?

@yoctopuce
Copy link
Author

Hi,

I've done some regression testing and I've found that the CRC issue has been injected the 11 April 2016 with
the following changelist: Hexxeh/rpi-firmware@c05b9e6

Before this the USB stack is working correctly.

@P33M
Copy link
Contributor

P33M commented May 26, 2017

It looks like this is the first 4.4 kernel we shipped. This will have a lot of kernel code churn which is going to be difficult to sort through

@yoctopuce can you try testing BRANCH=next firmware revisions? These are preview versions that we build prior to a kernel branch change or new firmware feature. It also gives us greater granularity of kernel updates.

https://github.com/Hexxeh/rpi-firmware/commits/next?after=f4cb84915874f32d88e2bfe793620d7afe25c755+69

You'll probably want to start around the time that master was broken - f4e7f4731f25ae289d2f983157024bc83e88cf02

and work backwards until it starts working again. If it's the move to 4.4 that caused the breakage then I would expect the last good commit to be cbb6fb920da75da94c1a86f8694f2b3137113011

sudo BRANCH=next rpi-update <hash>

@P33M P33M added Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator. and removed Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team labels May 30, 2017
@phishstang65
Copy link

Any progress on this? I update to the latest kernel with rpi-update on my Zero W and it still does not recognize these devices.

@lategoodbye
Copy link
Contributor

@phishstang65 Any help that narrow down the issue regarding to the affecting commit is welcome.

@yoctopuce
Copy link
Author

HI,

I've done some regression testing and the problem has been introduce in the changelist : Hexxeh/rpi-firmware@2ef7dc7 (kernel: BCM270X_DT: Remove explicit claiming of UART pins).

I've look to the changes made in this revision and I've found the culprit : the hack for Split Interrupt transactions has been enabled by default (7956536).

If I disable this hack in the /boot/cmdline.txt file everything work as expected.

dwc_otg.fiq_fsm_mask=7 dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait

@P33M
Copy link
Contributor

P33M commented Jun 7, 2017

Interesting. In the failing case, what is the output of sudo lsusb -v? That code should have no effect if the Zero is plugged directly into a FS device.

Edit: One possible cause: the blurb associated with the capture traces says that the issue only occurs with Zero products, i.e. if multiple devices are connected then a third-party hub must be used. It's possible that some hubs are not as forgiving as others when it comes to munging packets...

@yoctopuce
Copy link
Author

Here is the output of sudo lsusb -v on the Pi Zero with the firmware Hexxeh/rpi-firmware@2ef7dc7 with no modification
to the commandline.txt.

Bus 001 Device 004: ID 05ac:1402 Apple, Inc. Ethernet Adapter [A1277]
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x05ac Apple, Inc.
  idProduct          0x1402 Ethernet Adapter [A1277]
  bcdDevice            0.01
  iManufacturer           1 Apple Inc.      
  iProduct                2 Apple USB Ethernet Adapter
  iSerial                 3 12E1F6
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower              250mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol      0 
      iInterface              7 0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval              11
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 003: ID 24e0:000d  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x24e0 
  idProduct          0x000d 
  bcdDevice            0.01
  iManufacturer           1 Yoctopuce
  iProduct                2 Yocto-PowerRelay
  iSerial                 3 RELAYHI1-85D28
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 No Subclass
      bInterfaceProtocol      0 None
      iInterface              0 
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.11
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      29
         Report Descriptors: 
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 002: ID 1a40:0101 Terminus Technology Inc. 4-Port HUB
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1a40 Terminus Technology Inc.
  idProduct          0x0101 4-Port HUB
  bcdDevice            1.11
  iManufacturer           0 
  iProduct                1 USB 2.0 Hub
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x0000
    Ganged power switching
    Ganged overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent    100 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0103 power enable connect
   Port 2: 0000.0100 power
   Port 3: 0000.0503 highspeed power enable connect
   Port 4: 0000.0100 power
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            4.04
  iManufacturer           3 Linux 4.4.6+ dwc_otg_hcd
  iProduct                2 DWC OTG Controller
  iSerial                 1 20980000.usb
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             1
  wHubCharacteristic 0x0008
    Ganged power switching
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
Device Status:     0x0001
  Self Powered

  Self Powered

I've also done some test with a Raspberry PI Zero W. If the Yoctopuce device is plugged directly on the PI everything is working but if the Yoctopuce device is plugged on a USB hub I've got IO error. So the problem seems to be the handling of USB Split Transactions.

@P33M
Copy link
Contributor

P33M commented Jun 7, 2017

Does your USB analyzer work at high-speed? Can you capture a trace of a (failed) transfer when snooping the bus between the Pi and the hub?

Note that if you have the Ethernet device active, your trace is likely to be full of network packet spam. Running the test via the UART would be recommended.

@yoctopuce
Copy link
Author

Hi,

I've manage to capture the High Speed trafic between the Raspberry Pi Zero and the USB Hub (http://www.uugear.com/product/zero4u/). USB traces are available here: https://www.yoctopuce.com/tmp/bug_split_transaction.zip and can be viewed with the Visual USB Software from Ellisys (http://www.ellisys.com/products/usbex200/download.php).

The capture file usb_hub_trafic_mask_f.ufo is the one that failed. The cmdline.txt is untouched and
the hack for Split Interrupt transactions is enabled.

The capture file usb_hub_trafic_mask_7.ufo is the one that work. I've added dwc_otg.fiq_fsm_mask=7 to the cmdline.txt and the hack for Split Interrupt transactions is disabled.

Both capture have been done with the same hardware and the same USB devices with the latest offical image:
Linux raspberrypi 4.4.50+ #970 Mon Feb 20 19:12:50 GMT 2017 armv6l GNU/Linux

@P33M
Copy link
Contributor

P33M commented Jun 12, 2017

OK. That's a sufficiently confusing result. The first SSPLIT OUT is accepted (ACK) and the host subsequently tries to queue another SSPLIT IN. This is rejected (NAK) meaning the TT has no space available. We then try to complete the first OUT transfer with a CSPLIT and get... no response.

That's a common hub chipset, so I would expect that if there were a general issue with the hack + this hub chipset then everyone would be screaming at once. There has to be some sort of interaction between the behaviour of the device + hub (or how much data we're sending).

I think to investigate this further, I'll need one of these devices and the hub in question.

@P33M
Copy link
Contributor

P33M commented Jun 12, 2017

@yoctopuce do you have an e-mail address where we can continue the discussion?

@yoctopuce
Copy link
Author

You can contact support@yoctopuce.com

@P33M P33M added confirmed Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team and removed Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator. labels Jun 13, 2017
@P33M
Copy link
Contributor

P33M commented Jun 19, 2017

I can confirm that the bug only happens with a specific type of hub. From our selection of (cheap) hub devices, the Terminus Tech single-TT hub (FE1.1s) appears to be the only chip that exhibits this behaviour.

There are two main failure modes. The first is as reported previously, where an IN start-split is sent, followed by an OUT start-split. The second is NAK'd (no TT buffer space) but subsequent transfers to the TT all result in transfer errors. On the downstream port, the bus ends up suspended (no traffic for 3ms) and the device is inaccessible until it's reconnected. It seems that Linux will issue a hub reset under certain conditions, so the device will reconnect after a while.

The second is where the hub reports a STALL response for the IN transfer, but this does not result in a suspend state on the downstream port.

@P33M
Copy link
Contributor

P33M commented Jun 19, 2017

trace-terminus-broke

This is the first failure mode - note the timestamps matching between the two windows.

@P33M
Copy link
Contributor

P33M commented Jun 19, 2017

terminus-2nd-failure

In the second failure mode that results in a STALL handshake, the only thing that is different is the ordering of complete-splits. The second SSPLIT is issued before the first CSPLIT, which results in a stall. The second CSPLIT results in an ACK. The STALL response in the first transfer indicates that the TT has thrown the response away (should have been a NAK) with the issue of a new start-split.

I believe the hub only has a single TT buffer for non-periodic transfers. Or alternatively, it cannot differentiate between inbound/outbound transfers for non-periodic endpoints with the same address and endpoint number.

@yoctopuce an easy test would be if one of the endpoints could be configured with a different number. Is it possible to get a testing firmware that I can load that changes one of the endpoints to be e.g. EP 2 IN?

@yoctopuce
Copy link
Author

@P33M yes. I will send you that firmware by email.

@P33M
Copy link
Contributor

P33M commented Jun 19, 2017

With the custom firmware, this is confirmed to be an issue with how the hub is responding to multiple transfers to the same device/endpoint. Both the in direction and the out direction are considered to be the same pipe as far as the hub is concerned.

The response when the endpoints are different follows a more sane flow - the CSPLIT for the OUT transfer receives a NYET response instead of everything ending up broken.

terminus-2ep-working

I think that as a software workaround, the TT exclusivity criteria need to be extended to block multiple simultaneous transfers to/from non-periodic endpoints that have the same endpoint number.

@P33M
Copy link
Contributor

P33M commented Jun 20, 2017

Thinking a bit wider, this hub limitation could be tripped over if any combination of non-periodic endpoint types (or Interrupt munged into Control types) share endpoint number on a device.

I have a nominal fix for the issue, will make a PR shortly.

popcornmix added a commit that referenced this issue Sep 10, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
popcornmix added a commit that referenced this issue Sep 12, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
popcornmix added a commit that referenced this issue Sep 16, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 2, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 2, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
popcornmix added a commit that referenced this issue Oct 7, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 10, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 10, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
popcornmix added a commit that referenced this issue Oct 14, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 14, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 17, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 21, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 23, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Oct 28, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 1, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 5, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 6, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 8, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 11, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 18, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 18, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 18, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 25, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Nov 25, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 7, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 9, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 10, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 16, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 16, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
popcornmix added a commit that referenced this issue Dec 20, 2024
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes #903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes #1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes #1709

dwc_otg: fix split transaction data toggle handling around dequeues

See #1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: #2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: #2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: #2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: #2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes #2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See #2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in #2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: #2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: #3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 #3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in #3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>

drivers: usb: dwc_otg: fix reference passing when checking bandwidth

The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.

It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.

Rationalise (delete) a variable while we're at it.

See #5189

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: stop GCC from patching FIQ functions

Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.

Add CFLAGS to stop this happening in FIQ code.

Also catch one function where notrace wasn't specified.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Avoid the use of align_buf for short packets

Recent kernels (from 6.5) fail to boot on Pi0-3.

This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);

returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).

As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.

Signed-off-by: Dom Cobley <popcornmix@gmail.com>

drivers: dwc_otg: use C11 style variable array declarations

The kernel C standard changed in 5.18.

Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.

Also remove a pointless fiq_state initialisation loop.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants