Changing SPI frequency/speed in fbtft driver causes usb based audio to play at the wrong speed #1051
Have you determined the relationship between the "speed" parameter and the audio playback rate? |
With speed=25000000 it works OK.
I currently have rpi-fbcp running at 6 fps to replicate the display onto the fbtft device. |
OK, the distortion and slowness only occur when fbcp is running (I am testing with additional noise). My guess is that this is because fbcp puts traffic onto the SPI bus.
This seems less severe now that it is the only thing running on the system. |
On board audio is unaffected. |
This might be related: #971 |
seems to work so should
My reading is that this should allocate 152? Which is 4k aligned (if not explicitly allocated). |
Because (int)(152/1024) = 0? |
Yes, the code does not expect a buffer less than 1 KiB:
    if (par->txbuf.buf)
        sprintf(text1, ", %zu KiB %sbuffer memory",
                par->txbuf.len >> 10, par->txbuf.dma ? "DMA " : "");
Anyways, you won't get DMA in the official kernel since the in-kernel SPI driver doesn't support it yet: #1036 |
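To illustrate why a 152-byte buffer is reported as 0 KiB, here is a minimal standalone sketch of the same shift. The function names are hypothetical and not part of fbtft; the rounded-up variant is only one possible way to avoid printing 0, not something the driver does.

```c
#include <stddef.h>

/* Whole KiB contained in len, computed the same way as the log message
 * quoted above (len >> 10). Anything below 1024 bytes yields 0. */
size_t whole_kib(size_t len)
{
    return len >> 10;
}

/* Hypothetical alternative: round up, so a small non-empty buffer
 * is shown as 1 KiB instead of 0 KiB. */
size_t kib_rounded_up(size_t len)
{
    return (len + 1023) >> 10;
}
```

Calling whole_kib(152) returns 0, which matches the "0 KiB" seen in the log even though a buffer was allocated.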
I thought the txbuflen parameter was in KiB, from #971. Brain failure; I will have a play with other values and report back. |
From that thread:
|
@pelwell yup brain failure on my part |
@notro what limitations and effects does lowering txbuf have? Do we effectively stall updates once the tx buffer is full? |
txbuflen=152k: failure, clear stutter.
@32 MHz: txbuflen=1024 seems good, 2k fails.
So dropping txbuflen makes the USB audio work with an SPI frequency of 42 MHz. The distortion is related to system load and SPI activity, so why? We have no DMA; is the SPI txbuf copy taking too long when clocking at 42 MHz? And have I negated the advantage of upclocking the SPI bus, since my buffer transfers per cycle are now smaller and therefore more of them are needed? |
Revised: @32 MHz, txbuflen=512 is OK. So 512 at both 42 and 32 MHz; coincidence... |
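The trade-off asked about here (smaller txbuflen means more transfers per update) can be put into numbers with a small sketch. This is not fbtft code; `plan_chunks` and `struct chunk_plan` are hypothetical names, and the 153600-byte frame used in the note below assumes a 320x240 RGB565 display, which the thread does not state.

```c
#include <stddef.h>

struct chunk_plan {
    size_t count;     /* number of SPI transfers needed for one update */
    size_t last_len;  /* length of the final transfer in bytes */
};

/* Split total_len bytes into transfers of at most txbuflen bytes.
 * A zero txbuflen is treated here as "nothing to plan"; in the driver
 * a missing buffer means an unbuffered write instead. */
struct chunk_plan plan_chunks(size_t total_len, size_t txbuflen)
{
    struct chunk_plan p = {0, 0};

    if (total_len == 0 || txbuflen == 0)
        return p;

    p.count = (total_len + txbuflen - 1) / txbuflen;        /* ceiling division */
    p.last_len = total_len - (p.count - 1) * txbuflen;      /* remainder chunk */
    return p;
}
```

Under that assumed frame size, txbuflen=4096 needs 38 transfers per full update, txbuflen=1024 needs 150, and txbuflen=512 needs 300, so the per-transfer overhead is paid far more often with the small buffers.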
@notro does the current rpi-update give DMA? |
The tx buffer is needed because the SPI controller driver does not support 16-bit words (RGB565 pixels), so we have to swap the bytes (little to big endian) before transferring them to the display controller.
No, not yet. See #1036. In the Adafruit forum where this sort of problem surfaced for the first time, an out-of-tree SPI DMA driver was used. I believe Adafruit still uses that driver in their kernels. I also used that driver in my FBTFT kernels (discontinued). |
This would be my guess, given that the problem appears once the system is loaded and audio is one of the few time-sensitive applications. Is that happening in the interrupt handler or similar? Does the byte swapping occur before this in a kernel thread? |
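As a rough userspace illustration of the byte swapping described above, the following sketch copies RGB565 pixels into a byte buffer in big-endian order, one tx-buffer-sized chunk at a time. It is an assumption-laden stand-in, not the fbtft implementation: `swap_pixels_to_txbuf` is a hypothetical name, and the kernel code operates on its own buffer types.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy as many pixels as fit into txbuf, high byte first (big endian),
 * regardless of host byte order. Returns the number of pixels copied so
 * the caller can send txbuf over SPI and continue with the rest. */
size_t swap_pixels_to_txbuf(const uint16_t *vmem16, size_t npixels,
                            uint8_t *txbuf, size_t txbuf_len)
{
    size_t max_pixels = txbuf_len / 2;   /* two bytes per RGB565 pixel */
    size_t n = npixels < max_pixels ? npixels : max_pixels;

    for (size_t i = 0; i < n; i++) {
        txbuf[2 * i]     = (uint8_t)(vmem16[i] >> 8);    /* high byte */
        txbuf[2 * i + 1] = (uint8_t)(vmem16[i] & 0xff);  /* low byte */
    }
    return n;
}
```

The point of the sketch is that every pixel is touched by the CPU before it reaches the SPI controller, so the copy cost scales with the amount of framebuffer data being updated.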
Here's a writeup of what happens: When someone writes to /dev/fb1, fb_deferred_io_mkwrite is called which schedules delayed work to happen in a kernel thread. I haven't calculated when a polling transfer kicks in at your speeds. Note: the issue I linked to links to an Adafruit forum thread. That kernel used a DMA SPI driver, so the operation would differ from this one. No polling mode on small transfers for instance.
Not on this display. |
A thought on the above: does reducing txbuf mean we end up in the short-transfer path? Is there any other testing I can do to shed light on this? |
I have calculated the max transfer length for polled transfers:
@msperl If the FIFO is 120 bytes, shouldn't we always use polled mode for len<120?
If you can build the kernel yourself, you can try to disable the transfer buffer, and thus the copying, to see if that changes anything. You will get strange colors.
    //#ifdef __LITTLE_ENDIAN
    //	if ((!txbuflen) && (bpp > 8))
    //		txbuflen = PAGE_SIZE; /* need buffer for byteswapping */
    //#endif
This will lead to writing the framebuffer memory directly over SPI:
    /* 16 bit pixel over 8-bit databus */
    int fbtft_write_vmem16_bus8(struct fbtft_par *par, size_t offset, size_t len)
    {
    ...
        /* non buffered write */
        if (!par->txbuf.buf)
            return par->fbtftops.write(par, vmem16, len);
Note: this codepath has hardly seen any testing. |
@notro: you asked about the SPI FIFO: it is 64 bytes in size. But what I would guess is that you have transfers like this: 4096, 4096,...,4096,64. What I have seen is that I get "glitches" on USB audio on upstream kernels (without USB fiq) reliably. I am still on a business trip, but when I get back I will try to cherry-pick the DMA patch (which is in 4.2rc already) for a 4.1 kernel, and then that may change the picture... |
Hmm, OK. From my testing it doesn't seem related to clocking, other than the clock not being 25 MHz. I have set txbuflen to 600 on 2x Pis and it's OK; this is well above the polling limit. I will mod up a kernel this week and rule it out completely. |
Wondering if some of the issues that have come to light in #1077 might be relevant. I am emerging from my SDL hacking (it now does the job of fbcp in an accelerated fb driver for the Pi); what should I test? |
The latest rpi-update kernel has SPI DMA support. Have you tried that?
|
I have now tried 4.0 with no DMA.
This results in distortion; the distortion only occurs when updating the framebuffer.
Adding the above, rebooting, and conducting the same test fixes the issue. |
Well, that would point to latencies/delays during interrupts which would explain the situation. But there is another option - especially with a rpi model 1 (single CPU):
All of those consume CPU, and if there is any time with lots of activity, not enough CPU may be made available to speaker-test because everything is spent handling interrupts. So the sound output may stall just because of that... Here again DMA reduces the number of interrupts, but it would still be a rare event that gets you into such a situation, so it may not be as "recognizable" as the situation without DMA. |
OK, one thought: at no point does ALSA indicate a buffer underrun from the sound card (this happens on the internal sound occasionally). Of course, the USB driver and/or hardware (tested with 3 different USB sound chips) could be broken. |
When I am back at my logic-analyzer I can try to make some measurements again measuring spi and the analog wave-form (and maybe instrument the interrupt handlers). But what I have seen on an upstream kernel (without fast irq) was that there were situations when we have been inside an interrupt for a few ms. And 1ms is 40 samples so with 8-bit stereo 80 bytes that are consumed. So obviously there is some buffer (say 512bytes) on the device, but eventually it will drain and if no new data comes in to fill up the data, then there will be a glitch in the audio signal. And I think I have seen some times when we ran in interrupt-handlers for 32ms (sic), which would mean 2560 bytes getting taken from the buffer in the meantime... So the USB latency seems really important here, but fast-irq should reduce this. |
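The buffer arithmetic in this comment can be checked with a few lines of C. This is only a sketch using the figures quoted above (40 samples per ms, 8-bit stereo, a guessed 512-byte device buffer); the function names are hypothetical and no real USB audio device was measured for it.

```c
#include <stdint.h>

/* Bytes of audio the device consumes per millisecond. */
uint32_t bytes_per_ms(uint32_t samples_per_ms, uint32_t bytes_per_sample,
                      uint32_t channels)
{
    return samples_per_ms * bytes_per_sample * channels;
}

/* Bytes drained from the device buffer while the host is stalled. */
uint32_t bytes_drained(uint32_t rate_bytes_per_ms, uint32_t stall_ms)
{
    return rate_bytes_per_ms * stall_ms;
}

/* Microseconds until a buffer of buffer_bytes runs empty with no refill. */
uint32_t drain_time_us(uint32_t buffer_bytes, uint32_t rate_bytes_per_ms)
{
    if (rate_bytes_per_ms == 0)
        return 0;
    return buffer_bytes * 1000u / rate_bytes_per_ms;
}
```

With those figures the device consumes 80 bytes per ms, a 32 ms stall drains 2560 bytes, and a 512-byte buffer would last about 6.4 ms, so a multi-millisecond stay in interrupt context is enough to cause a glitch if the buffer guess is anywhere near right.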
OK - a quick initial test shows that when I run: Assuming that it is 16 bit stereo this means 48 samples. This means that there needs to be a data-packet sent every 0.001s or the sound will stop (assuming no buffering). So with SPI-Interrupts consuming CPU this will change. I still need to reproduce the exact situation with DMA disabled for SPI to prove that this is the case... |
On BCM2835 you have a single core which has to handle all interrupts. The USB fiq was implemented because of various hardware requirements meaning any interrupt latency over 125uS caused problems. In this case, the maximum acceptable latency then turns into ~500uS for this application - approximately half the USB device update rate. Very few commonly-encountered kernel operations disable interrupts for this length of time. If the SPI driver is spending a lot of time with interrupts disabled or in a HardIRQ context, then this is a bad behaviour. Using DMA is a solution for this (one interrupt on completion of DMA). On BCM2836, there is no vectored interrupt controller. There is simply a mux that selects which core GPU IRQ or GPU FIQ interrupts go to - which is then chained into the core-local interrupt handling. One possibility that I have yet to implement (had working, but then a kernel version update broke it) is to (ab)use a CPU local mailbox such that USB interrupts occur on a single designated core, but this will not be possible to implement for any other interrupt source. |
Actually the SPI driver just does a complete in the irq context (plus a few register writes): But the big problem is that this interrupt may happen in very short intervals (about every 64 bytes so depending on SPI-frequency it may trigger every 0.0000005s), so it is essentially consuming CPU handling IRQ (including overhead) as well as context switching leaving no time for anything else when you have to transfer 128000 bytes resulting in 2000 interrupts in 0.0005s (assuming 125MHz)... That is 400 interrupts per 1 ms - and with the 20us per interrupt overhead (measured a few months ago) we are at 0.8ms/ms that we just spend in handling interrupts (without any completion) and this stalls USB sound... Obviously DMA solves that, but only to some extent. Those IPI2 interrupts seem to move the interrupts to CPU 0, so maybe a bit more logic could get applied so that instead of rescheduling the interrupt some of them are handled locally... |
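A back-of-the-envelope version of this interrupt-load argument can be sketched as follows. It assumes one interrupt per 64 bytes and 20 us of overhead per interrupt, both taken from the comment above, plus my own assumption of 8 SPI clock cycles per byte with no gaps between bytes. The function names are hypothetical and nothing here is read from the actual SPI driver.

```c
#include <stdint.h>

/* Nanoseconds between interrupts if one fires every bytes_per_irq bytes
 * and each byte takes 8 SPI clock cycles (no inter-byte gaps assumed). */
uint64_t irq_interval_ns(uint64_t spi_hz, uint64_t bytes_per_irq)
{
    if (spi_hz == 0)
        return 0;
    return bytes_per_irq * 8ULL * 1000000000ULL / spi_hz;
}

/* CPU time spent on interrupt overhead, in parts per thousand.
 * A value of 1000 or more means the CPU cannot keep up. */
uint64_t irq_load_permille(uint64_t interval_ns, uint64_t overhead_ns)
{
    if (interval_ns == 0)
        return 0;
    return overhead_ns * 1000ULL / interval_ns;
}
```

Under these assumptions a 32 MHz clock gives 16 us between interrupts, less than the assumed 20 us cost of handling one, while 25 MHz gives about 20.5 us. Treat this as an order-of-magnitude check only; real transfers have gaps and the per-interrupt cost will vary.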
IPIs don't move interrupts off cores. There's no sane way you can cause GPU IRQs to be routed to arbitrary cores. There's a single line coming from the old ARMCTRL block and this can be selected to a single core only. I briefly considered what would happen if you automatically changed the mux setting mid-IRQ handler but that grows into a horrendous monstrosity of handling code very quickly and is not worth doing. Unless they come from the per-CPU local interrupt sources on 2836, all interrupts are handled by a single core. |
OK that IPI2 explains why the total IPI2 counts mostly correlate with interrupts - most interrupts (at least the ones I use) all make use of complete() to wake a kernel-thread and that is what IPI2 essentially does... |
@pssc has your issue been resolved? If so, please close this issue. Thanks. |
@Ruffio I will retest and take appropriate action |
SPI has had DMA enabled by default for ages - closing. |
I am seeing this on multiple Pis of differing types and kernel versions; setting dtparam to anything other than 25 MHz causes issues.