-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
USB (VCP) communication issues #1709
Comments
You will have a better chance of getting to a solution if you cut your application down to a small test program that still exhibits the problem and upload the source somewhere - a Gist, Pastbin, Google Drive etc. - together with any instructions you think are relevant. An ideal test would involve looping back the USB serial port to the onboard UART, with a small application to emulate your device. |
It would also be useful if we had an indication of how often this problem occurs - is it 1 time in 10? 10000? Does it recur within seconds or hours? In addition to cutting the application down to minimal source/test harness, finding out what provokes the bug more reliably (e.g. repeatedly opening/closing the device in a naive fashion) would make the likelihood of finding it a lot greater. Edit: you say this is reproducible ~20% of the time. That's useful - if we can get a test case that gives us a broken state after ~1 minute then we can iterate. |
Hi, @P33M I've already given that info in my post. It is completely random, but it usually happen 1 time in 5-10. What it is very important and I forgot to add is (I'll edit the post):
@pelwell I'll discuss this with my boss and I'll see what I can do. I guess that I can post something, since the code is very similar to many Serial port communication codes for C/C++ in Linux, that can be found on the net. I guess that you would need a similar setup, but this is delicate. I guess that I would be allowed to send a demo version of the hardware with an NDA to somebody at the Raspbeery PI foundation. |
If the problem is in the USB serial port side as you claim then a simple test harness that opens a serial port on the Pi (/dev/serial0), waits for the greeting and sends the response, ought to be adequate - just attach the pins on the FTDI to header pins 6, 8 and 10 (you don't say if you are using flow control). |
If you suspect this is a USB issue, you might want to see if you can get a trace with usbmon. It's a free driver that's included with Linux that let's you snoop the USB information going into and out of the USB driver. Also, if your company have a few $$$ at your disposal, you can try buying a USB bus analyzer. These can be on the pricey side($1,000) but can be very useful for these kinds of issues. Some standard logic analyzer's have some USB support, but they are very limited compared to an actual bus analyzer. Finally, I think some of FTDI's products are garbage even though they seem to be very, very popular. I tried using them 15 year ago for something and I was having all kinds of issues similar to yours. I think they have improved somewhat since then though. |
@carmarri Does dmesg show anything in case the issue occurs? |
Hi, First of all, sorry for the delay, but have been out of the country for the last week on a work trip. @pelwell As I said on the initial message, I already tested communication through the physical UART ports using the very same software/settings and it is working fine. The problem is on the USB. My device is USB based (FTDI) so it is connected to the Raspberry PI using a USB cable. @Electron752 Thanks for the suggestion. I'll try usbmon. I don't think my boss will be happy to buy a USB Analyzer to debug this issue. Anyway I think this is not related to FTDI, since, as I said in my initial message, I've tried different serial to USB converters with the very same result (Cypress, SiLabs and another Chinesse product). FTDI is probably not the best solution (I don't have much experience with others) but it has been reliably working for the last 8 years. Do you have any suggestion to use a different brand? @lategoodbye dmesg does not show any message when the error occurs. The messages available are from the boot sequence: *** As a sumary of the issue: My device can be connected to the Raspberry PI using either USB or serial cable. With Serial connected to the physical UART pins of the PI, no error detected. When connected using USB the problem occurs. The problem does occurs on board v1, v2 and v3, but not on other boards like BeagleBone, UDOO, etc. I've monitored the serial lines in my device (before the FTDI chip) and when the issue appears the command is received from the computer and a response is sent to it, but an empty buffer is reported on the C++ code. Regards, |
@carmarri I think you have misunderstood. I am suggesting that you can use a working "physical" UART to emulate the greeting handshake in your device, which you then connect to the FTDI USB UART:
|
Hi @pelwell sorry but I indeed missunderstood what you meant. I guess that you want to confirm that this is not a hardware problem. Well, I can confirm you that it is not. since it has been tested with other FTDI-based product and the same problem occurs, while in other boards does not occur. Also tested this hardware in x86 Laptop/Desktop PCs and the problem does not occur either. |
No, that isn't what I want. I believe that you are experiencing a real problem, I am assuming for now that it isn't caused be a faulty device, and I would like to be able to reproduce the problem here. Your options are:
I assumed that option 3 would be easier all round, so that is what I have been advocating. |
Hi @pelwell I'll arrange some test code for you. Please, send me a p.m. with an email address. This is a little bit embarrassing, but my boss is also asking for Name, work address, or some ID that could identify you. @Electron752 I've tested usbmon, but I'm not sure how to understand what is going on. I've noticed different set of messages between both situations (port opened and communicating vs opened and no communication); the main thing I've noticed is that when everything goes fine there is some kind of periodic control messages on the USB bus, no matter if its transmiting useful data or not (this does not happen when the issue occurs). So I guess, that for some reasone, the USB communication is not being started, but no error is reported to the kernel, so open() returns "no error". |
GitHub doesn't do PM, but if you email me (phil@raspberrypi.org) saying what information you need I'll get back to you. |
Thanks @pelwell I'll contact you tomorrow. I really appreciate your time and patience :-) |
With the help of the supplied test code we've been able to reproduce the problem, capture traces using a USB analyzer and the kernel usbmon facility. Both show long periods where the device isn't polled for data, and if we don't poll it can't send anything. These gaps correspond to the stalls in the data flow as seen by the test client, so there is clear evidence of a bug. The resident expert on the USB stack (@P33M) is looking forward to opening that can of worms again. |
@carmarri |
I now have both usbmon and analyzer traces for each side of the interface (host driver vs actual hardware). It appears that a bulk URB is submitted to the HCD for the IN endpoint of the FTDI device, but the USB transfer doesn't take place for upwards of 700ms. The completion callback is then triggered. Prior to this, a pile of transfers were submitted to the control endpoint which presumably form part of the usb-serial driver's initialisation of the device. There's only a few places in the driver where nak_holdoff can alter the state flow so this narrows down the scope of investigation considerably. |
The cause of the issue has been identified - our nak_holdoff mechanism doesn't take into account URBs being removed from the queue as a trigger to reset the qh->nak_frame count. The hold-off records the last frame number that a NAK response was received, plus a configurable interval. The queue of subsequent transfers to the endpoint would then be ignored by the driver until (frame_number + interval) had expired. Long story short, this could cause delays of up to 2 seconds in getting data from a bulk or control full- or low-speed IN endpoint as a stale qh->nak_frame value would erroneously throttle the endpoint by skipping the endpoint queue for a large interval until the frame counter wrapped back around to the nak_frame value. Setting nak_holdoff=0 avoided this path as queues are unconditionally executed. I've got a 2-line fix that appears to resolve the issue in the test program. Will push this in a PR shortly. |
Hi! I'm really sorry for the delay in my reply but I'm just back to the office after having taken some days off. Phil @P33M, do I need to change the value "dwc_otg.nak_holdoff=0" in my system or will that be internally fixed in the next system update? Based on your words I guess this was only a patch to bypass the issue. |
Setting |
I've got a fix for the first instance of the problem (high latency when requesting data on an open() and read() call) and I think I'm now seeing
in the test program, i.e. you open the port, send a command, the server sees the command and responds (via the USB uart) but you never get any data out of your read() call. This seems to be happening once every 40-50 times the port is opened, rather than the deterministic behaviour seen previously. Edit: It appears that in this specific failure mode, the first data received is dropped on the floor. If I subsequently do "echo steve > /dev/ttyAMA0", "steve" pops out in the USB test program and subsequent reads return data, but I never see the first response from the remote server. |
If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes raspberrypi#1709
I've seen an instance of the correct response being sent across the wire after the first message is sent after an open(), but the first URB returned to usb-serial after an open() is just the FTDI 2-byte status packet. Further investigation required. Edit: the failure still occurs even if nak_holdoff=0. |
kernel: dwc_otg: make nak_holdoff work as intended with empty queues See: raspberrypi/linux#1709 kernel: vc4_fkms: Apply firmware overscan offset to hardware cursor See: raspberrypi/linux#1960
kernel: dwc_otg: make nak_holdoff work as intended with empty queues See: raspberrypi/linux#1709 kernel: vc4_fkms: Apply firmware overscan offset to hardware cursor See: raspberrypi/linux#1960
I believe I now have a fix for the failure on open(). The issue was caused by In b) the driver didn't save the endpoint data toggle state correctly. Thus the next transfer in c) would get dropped on the floor, and if it contained actual serial port data then that data got lost. I also found an issue regarding uninitialised packet counts in the host channel structure, where stale hc->start_pkt_count values would result in bogus lengths being passed back to the ftdi_sio driver on transfers that had data toggle errors. The test program now runs for >10mins with no errors. I will create a PR shortly. |
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the *unallocated* struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in c4564d4). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementaion that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: raspberrypi/firmware#21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unneccesary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds. This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially- formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in. dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with eb1b482. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GPF_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected. - Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to c8edb23. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few bounday bugs. Fix them. Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every *microframe* in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit 6ce0d20 changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes #903 fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes #1842 dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes #1709 dwc_otg: fix split transaction data toggle handling around dequeues See #1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc() dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings. dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held. dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted. dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: #2020 dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check. dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect). dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: #2024 dwc_otg: Fixup change to DRIVER_ATTR interface dwc_otg: Fix compilation warnings Signed-off-by: Phil Elwell <phil@raspberrypi.org> USB_DWCOTG: Disable building dwc_otg as a module (#2265) When dwc_otg is built as a module, build will fail with the following error: ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined! scripts/Makefile.modpost:91: recipe for target '__modpost' failed make[1]: *** [__modpost] Error 1 Makefile:1199: recipe for target 'modules' failed make: *** [modules] Error 2 Even if the error is solved by including the missing DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading dwc_otg. As a workaround, simply prevent user from building dwc_otg as a module as the current kernel does not support it. See: #2258 Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com> dwc_otg: New timer API dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init() Add the ability to disable force_host_mode for those that want to use dwc_otg in both device and host modes. dwc_otg: Fix a regression when dequeueing isochronous transfers In 282bed9 (dwc_otg: make nak_holdoff work as intended with empty queues) the dequeue mechanism was changed to leave FIQ-enabled transfers to run to completion - to avoid leaving hub TT buffers with stale packets lying around. This broke FIQ-accelerated isochronous transfers, as this then meant that dozens of transfers were performed after the dequeue function returned. Restore the state machine fence for isochronous transfers. fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288) See: #2140 dwc_otg: add smp_mb() to prevent driver state corruption on boot Occasional crashes have been seen where the FIQ code dereferences invalid/random pointers immediately after being set up, leading to panic on boot. The crash occurs as the FIQ code races against hcd_init_fiq() and the hcd_init_fiq() code races against the outstanding memory stores from dwc_otg_hcd_init(). Use explicit barriers after touching driver state. usb: dwc_otg: fix memory corruption in dwc_otg driver [Upstream commit 51b1b64] The move from the staging tree to the main tree exposed a longstanding memory corruption bug in the dwc2 driver. The reordering of the driver initialization caused the dwc2 driver to corrupt the initialization data of the sdhci driver on the Raspberry Pi platform, which made the bug show up. The error is in calling to_usb_device(hsotg->dev), since ->dev is not a member of struct usb_device. The easiest fix is to just remove the offending code, since it is not really needed. Thanks to Stephen Warren for tracking down the cause of this. Reported-by: Andre Heider <a.heider@gmail.com> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lukas: port from upstream dwc2 to out-of-tree dwc_otg driver] Signed-off-by: Lukas Wunner <lukas@wunner.de> usb: dwb_otg: Fix unreachable switch statement warning This warning appears with GCC 7.3.0 from toolchains.bootlin.com: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’: ../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable] st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps; ~~~~~~~~~~~~~~~~~^~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation Rationalise the offset and update all call sites. Fixes #2408 dwc_otg: fix bug with port_addr assignment for single-TT hubs See #2734 The "Hub Port" field in the split transaction packet was always set to 1 for single-TT hubs. The majority of single-TT hub products apparently ignore this field and broadcast to all downstream enabled ports, which masked the issue. A subset of hub devices apparently need the port number to be exact or split transactions will fail. usb: dwc_otg: Clean up build warnings on 64bit kernels No functional changes. Almost all are changes to logging lines. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> usb: dwc_otg: Use dma allocation for mphi dummy_send buffer The FIQ driver used a kzalloc'ed buffer for dummy_send, passing a kernel virtual address to the hardware block. The buffer is only ever used for a dummy read, so it should be harmless, but there is the chance that it will cause exceptions. Use a dma allocation so that we have a genuine bus address, and read from that. Free the allocation when done for good measure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> dwc_otg: only do_split when we actually need to do a split The previous test would fail if the root port was in fullspeed mode and there was a hub between the FS device and the root port. While the transfer worked, the schedule mangling performed for high-speed split transfers would break leading to an 8ms polling interval. dwc_otg: fix locking around dequeueing and killing URBs kill_urbs_in_qh_list() is practically only ever called with the fiq lock already held, so don't spinlock twice in the case where we need to cancel an isochronous transfer. Also fix up a case where the global interrupt register could be read with the fiq lock not held. Fixes the deadlock seen in #2907 ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not current implemented. As a workaround, reqular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher then typically, but supported on my speakers). 2. sftp transferring large files through the buildin ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net> usb: dwc_otg: Clean up interrupt claiming code The FIQ/IRQ interrupt number identification code is scattered through the dwc_otg driver. Rationalise it, simplifying the code and solving an existing issue. See: #2612 Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: Choose appropriate IRQ handover strategy 2711 has no MPHI peripheral, but the ARM Control block can fake interrupts. Use the size of the DTB "mphi" reg block to determine which is required. Signed-off-by: Phil Elwell <phil@raspberrypi.org> usb: host: dwc_otg: fix compiling in separate directory The dwc_otg Makefile does not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> dwc_otg: use align_buf for small IN control transfers (#3150) The hardware will do a 4-byte write to memory on any IN packet received that is between 1 and 3 bytes long. This tramples memory in the uvcvideo driver, as it uses a sequence of 1- and 2-byte control transfers to query the min/max/range/step of each individual camera control and gives us buffers that are offsets into a struct. Catch small control transfers in the data phase and use the align_buf to bounce the correct number of bytes into the URB's buffer. In general, short packets on non-control endpoints should be OK as URBs should have enough buffer space for a wMaxPacket size transfer. See: #3148 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: Declare DMA capability with HCD_DMA flag Following [1], USB controllers have to declare DMA capabilities in order for them to be used by adding the HCD_DMA flag to their hc_driver struct. [1] 7b81cb6 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Signed-off-by: Phil Elwell <phil@raspberrypi.org> dwc_otg: checking the urb->transfer_buffer too early (#3332) After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't work well on Pi2/3 boards with 1G physical ram. Users experience the failure when copying a file of 600M size to the USB stick. And at the same time, the dmesg shows: usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0 When this happens, the sg_buf sent to the driver is located in the highmem region, the usb_sg_init() in the core/message.c will leave transfer_buffer to NULL if the sg_buf is in highmem, but in the dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer is NULL. The driver can handle the situation of buffer to be NULL, if it is in DMA mode, it will convert an address from transfer_dma. But if the conversion fails or it is in the PIO mode, we should check buffer and return -EINVAL if it is NULL. BugLink: https://bugs.launchpad.net/bugs/1852510 Signed-off-by: Hui Wang <hui.wang@canonical.com> dwc_otg: constrain endpoint max packet and transfer size on split IN The hcd would unconditionally set the transfer length to the endpoint packet size for non-isoc IN transfers. If the remaining buffer length was less than the length of returned data, random memory would get scribbled over, with bad effects if it crossed a page boundary. Force a babble error if this happens by limiting the max transfer size to the available buffer space. DMA will stop writing to memory on a babble condition. The hardware expects xfersize to be an integer multiple of maxpacket size, so override hcchar.b.mps as well. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: pause when cancelling split transactions Non-periodic splits will DMA to/from the driver-provided transfer_buffer, which may be freed immediately after the dequeue call returns. Block until we know the transfer is complete. A similar delay is needed when cleaning up disconnects, as the FIQ could have started a periodic transfer in the previous microframe to the one that triggered a disconnect. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s) On BCM2835, there is no hardware guarantee that multiple outstanding reads to different peripherals will complete in-order. The FIQ code uses peripheral reads without barriers for performance, so in the case where a read to a slow peripheral was issued immediately prior to FIQ entry, the first peripheral read that the FIQ did could end up with wrong read data returned. Add dsb(sy) on entry so that all outstanding reads are retired. The FIQ only issues reads to the dwc_otg core, so per-read barriers in the handler itself are not required. On BCM2836 and BCM2837 the barrier is not strictly required due to differences in how the peripheral bus is implemented, but having arch-specific handlers that introduce different latencies is risky. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> dwc_otg: whitelist_table is now productlist_table dwc_otg: initialise sched_frame for periodic QHs that were parked If a periodic QH has no remaining QTDs, then it is removed from all periodic schedules. When re-adding, initialise the sched_frame and start_split_frame from the current value of the frame counter. See https://bugs.launchpad.net/raspbian/+bug/1819560 and #3883 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Minimise header and fix build warnings Delete a large amount of unused declaration from "usb.h", some of which were causing build warnings, and get the module building cleanly. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc-otg: fix clang -Wignored-attributes warning warning: attribute declaration must precede definition dwc-otg: fix clang -Wsometimes-uninitialized warning warning: variable 'retval' is used uninitialized whenever 'if' condition is false dwc-otg: fix clang -Wpointer-bool-conversion warning warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true' The wMaxPacketSize field is actually a two element array which content should be accessed via the UGETW macro. dwc_otg: fix an undeclared variable Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before. Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn> dwc_otg: Update NetBSD usb.h header licence NetBSD have changed their licensing requirements such that the 2-clause licence is preferred. Update usb.h in the downstream dwc_otg code accordingly. See https://www.netbsd.org/about/redistribution.html for more information. Signed-off-by: Phil Elwell <phil@raspberrypi.com> dwc_otg: pay attention to qh->interval when rescheduling periodic queues A regression introduced in #3887 meant that if the newly scheduled transfer immediately returned data, and the driver resubmitted a single URB after every transfer, then the effective polling interval would end up being approx 1ms. Use the larger of SCHEDULE_SLOP or the configured endpoint interval. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: Fix fallthrough warnings Signed-off-by: Alexander Winkowski <dereference23@outlook.com> drivers: usb: dwc_otg: fix reference passing when checking bandwidth The pointer (struct usb_host_endpoint *)->hcpriv should contain a reference to dwc_otg_qh_t if the driver has already seen a URB submitted to this endpoint. It then checks whether the qh exists and is already in a schedule in order to decide whether to allocate periodic bandwidth or not. Passing a pointer to an offset inside of struct usb_host_endpoint instead of just the pointer means it dereferences bogus addresses. Rationalise (delete) a variable while we're at it. See #5189 Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> drivers: dwc_otg: stop GCC from patching FIQ functions Configuring GCC to use task stack protector canaries means it will insert calls to check functions in FIQ code. This is bad, as a) the FIQ's stack is banked and b) the failure invokes __stack_chk_fail which eventually tries to call printk(). Printing to the console inside the FIQ is generally fatal. Add CFLAGS to stop this happening in FIQ code. Also catch one function where notrace wasn't specified. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com> dwc_otg: Avoid the use of align_buf for short packets Recent kernels (from 6.5) fail to boot on Pi0-3. This has been tracked down to the call to: ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus); returning garbage in hubstatus (it gets the uninitialised contents of a kmalloc buffer that is not overwritten as expected). As we don't have strong evidence that this code path has ever worked, and it is causing a clear problem currently, lets disable it to allow wider use of newer kernels. Signed-off-by: Dom Cobley <popcornmix@gmail.com> drivers: dwc_otg: use C11 style variable array declarations The kernel C standard changed in 5.18. Remove a layer of indirection around the FIQ bounce buffers, be consistent with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping of DMA addresses. Also remove a pointless fiq_state initialisation loop. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Hello,
I've noticed some issues in the USB communication. I'll try to give us much information as possible
More information (data, pictures, etc) on previous discussion on Raspberry PI forum: https://www.raspberrypi.org/forums/viewtopic.php?f=33&t=161922&p=1062581#p1062581
Introduction
I'm developing a SO library (C/C++) for one of my company products, using codeblocks and cross-compiling with GCC from Ubuntu Mate. This hardware product uses FTDI and is connected to the Raspberry PI using a USB cable. At the software layer, the communication is performed using the Virtual Comm Port, since I want to allow the Library to detect any port, no matter if its VCP or a physical port. In order to handle the communications I use the kernel library commands open(), close(), read(), write(), ioctl(), termios structures, etc.
The Library has been tested from the raspberry PI through Lazarus (free-pascal) and Python, having both the same behaviour/issue.
Issue
If I open the port, send and receive data to/from my device, the communication runs perfect until I close the port. So what I've noticed is that, randomly (maybe once every 5 times), when I try to open the port, the port is succesfully opened (all functions return succeed) but no communication can be performed. I mean, Once the port is open, I send a fixed ASCII command to our product and I expect a fixed response (product detection through an ID). The command is sent sucessfully, in fact, no errors are reported by the write() function, then I've a timeout in the response and I don't get any data. The command should be received within 20-30ms and I've a timeout of 1 second, which I've experimentally increased during testing up to 10 seconds. The bytes available in the buffer/ports are zero. Then, no matter if I send more commands or not, the communication still fails.
So if I open the port and I'm able to communicate, everything goes fine until I close the port; but if the first command fails, then the communication keeps failing until I close the port.
I've monitored the TTL serial lines after the FTDI in my device (see link) and I've noticed that when the communication fails at the software layer, it is actually being performed at the hardware layer.
I'm using a Raspbian Lite Image to which I've installed LXDE and dependencies, since the application is targeting an embedded touch-screen system.
Testing/Debug
I'm not sure what could be the problem but maybe there is a problem while handling the USB interrupts (kernel issue); I initially though that could be something related to UART syncronization but this does not make sense in USB Virtual Comm Ports, since everything is software, handled by the USB driver.
It took me one month of testing before I had enough data to report this, so please, take this seriously.
Since I work indirectly for the automotive industry and there are many companies all over the globe pretending to develop projects for the Raspberry PI, it will be a pitty that the Raspberry PI is discarted or not recommended if they start noticing this kind of issues.
Please, feel free to ask for more data if you need it.
Thank you very much in advance.
The text was updated successfully, but these errors were encountered: