-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DMA #13
Comments
I just discovered that slave_id is set from Device Tree:
https://www.kernel.org/doc/Documentation/devicetree/bindings/dma/bcm2835-dma.txt So this should probably be correct:
|
How far down my list? The first 4 patches are out, but not all are merged. The next will be the optimization reducing the interrupt count:
Then I will try to argue for some polling for short transfers (we got a lot of irq overhead and saving here could just as well be spent polling...) Then we can discuss dma for big transfers. I still doubt that we will ever get the fully dma pipelined version into upstream... Martin |
having a look it looks promising what you have started. |
@notro: Things I want to understand from you and your use-case:
|
In the meantime I found your wiki - will try to test it tomorrow. |
Actually i see that the device actually supports 9 bit so we might also try to investigate that one - but only if you want to help with the fb driver side making things possible... |
I do single transfers, but in some experimental code, I have tried sending a control byte as one transfer and the payload in the next, in one SPI message, but that crashes spi-bcm2078 DMA. The pixels in the framebuffer are 16-bit rgb565 little endian on the Pi. Here's a test I did 2 years ago with different transmit buffer lengths. Numbers are full frame update time in ms.
SPI controllers that support 16-bit width (BBB), will get the same buffer treatment.
That depends. With a high fps= value and video playback, it runs almost continously.
I reuse the DMA buffer and it's mapping, but not the SPI message.
If you run with the stock spi-bcm2708, it uses 9-bit natively. So a kernel auquire with: sudo rpi-update, will give you support for the display using the 9-bit mode. I think we should just drop LoSSI. It's not generic 9-bit and there is this other SPI controller available on the header now which supports 2-32 bits (but lacks a driver). About DMA buffer size: Noralf. |
Thanks for all that info. I was actually assuming you would be doing a full frame-transfer, which would make you require more transfer more bytes in one go. At 30fps that would be 300 spi transfers/second and a corresponding amount of interrupts. As for usb latencies with audio: I guess it would again be related to interrupt latencies that block/delay usb transmits. But note again that the irq handler has about 10us overhead (assuming cache-miss for the interrupt code) so with 300 spi transfers we would have 3ms spent only with irq overhead. But imo the irq story is still "ugly", so I am still surprised that the time-critical usb-irqhandling has not been moved to the rtos that is essentially the firmware. Why? Because the videocore has a vectored irq-table, which is something that the arm does not have - it just has 2 interrupts and has to figure out what was actually triggering the issue. Then it has to mask that source to do its work, and afterwards unmask it. If I understood correctly they now think of splitting the usb interrupts into separate fast interrupt handler and configure on each core a different handler, which really makes things faster... One final thing: the fb modules also work in the 4.0rcX, so that I can test it there - we need that for upstreaming! So in the end I shall be applying my logic analyzer against the lines and see what shows up... |
This is strange, because lowering the transmit buffer size to 256 bytes solved the problem, but this means 4 times as many SPI interrupts as with a 4K buffer.
Clearly this processor was never meant to be used in a general purpose computer :-)
Nice to be able to get that ultimate proof, takes away a lot of guessing. Here's some performance numbers: https://github.com/notro/fbtft/wiki/Performance |
So I misread your last message. But let us first get to a situation where we have again something we can use - also with the rpi2... I will try to figure out the dt for my display device and then I can try to start the dma for real. |
BTW: with this test you have been mentioning that introduced "jitter" on USB audio - did that user also have the HDMI output running? I wonder because the way I understand it the HDMI output is driven via DMA as well (just by the firmware). So these multiple sources may produce congestion on the connection to SRAM, which would impact all code-paths that are not in L1/L2 hence the "latencies". Gathering more details would be helpfull for the future to run these kinds of tests... |
I have heard of only two instances of this. See: raspberrypi/linux#888 (comment) |
With very small, couple of bytes transfers, would it be faster to use polling instead of a completion interrupt? Some details about a full frame display update: First set_addr_win is called. Then the framebuffer is sent in 4k chunks with DC=1
can_dma won't work in mainline because bcm2835-dma lacks device_prep_slave_sg. bcm2708-dmaengine has this. I'm hoping to get together a PR for bcm2708-dmaengine next week to make it hand out DMA channels from Device Tree. |
yes - for very short request (total transfer time <10us) it is beneficial to run the driver in polling mode - the interrupt overhead is in the order of 10us, so avoiding that helps speeding up responses... |
Note that I have got now also patches for polling and filling the FIFO prior to enabling interrupts cutting down on latencies. I have also figured out that the spi_sync now is quite optimized and does no longer wake the spi-worker-thread in case that there is nothing in the queue - so it became more efficient than with 3.12 making those "spi_async" no longer an absolute necessity for high-performance drivers... It still misses the consolidation of handling all transfers in interrupt with a wakeup ONLY for the end of message - not for the end of EACH spi_transfer inside a spi_message. |
one other thing: would it not make sense to run the Command and Data as separate SPI-devices, where the core would schedule an extra GPIO up/down? |
No, almost all of the transfer time is for pixel data and the gpio stays high. |
I have just received an USB Sound-card yesterday for testing the latency issues and I have to tell you that the card works with no jitter with the foundation kernel but lots of minimal jitter with upstream kernel which runs 8k interrupts/s on the USB side. Note that there is no accounting for fast-interrupts, so I can not say if fiq are running 8000 times/second. Tests were done using an identical ModelB+ with SPI drivers not installed - only mpg123 was running. To me it looks as if this is related to interrupt-latencies, that become more pronounced with the upstream kernel. Maybe one interpretation why this could impact us with big DMA-transfers negatively: Also the USB interrupts are also evicting SPI interrupt code from cache and vice versa. So the shorter the DMA the more likely the irq-codepath is still in cache and the shorter the response-time. Now let us get to the DMA case and then we can look at it in detail how it affects the interrupt latencies with my instrumented kernel pulling GPIOs low/high. |
The mainline usb driver doesn't use fiq's last time I checked. |
It is (at least on the RPI2 with 3.18.8-v7+):
|
Most other stuff got merged, so we can start address this - as soon as we reach 64 bytes this starts to make sense to use it... |
Ok, i have started now on the dma implementation and have found a few things that may be of interest: A) my basic setup code is similar to what you have posted, but there is one thing that is strange: I only can get both the rx and tx channels on first load of the module a reload only gives me the first channel requested. Need to investigate why this happens, because a reboot is not really acceptable... @notro: have you seen something like this? |
I haven't done things like this except for the patch at the start of this issue, so no I haven't seen it.
Yes, in the case where the client has already mapped it. Is it possible to support both cases? can_dma() checks the flag and returns false if is_dma_mapped. |
If/when you have tested my dmaengine patch, it would be great if you replied with a Tested-by. |
Had been busy elsewhere - now I have hit an issue during registration. In principle it is possible to support both cases - for the is_dma_mapped you only have to create the corresponding scatter/gather list with that premapped entry... As for spidev: the biggest thing here is that spidev still does a copy to/from userspace into a bounce-buffer of (by default) 4096 bytes. So it would be better served by adding is_dma_mapped to that code instead... |
seems as if that issue is resolved - now continuing to work on the DMA implementation itself. |
I didn't look into the why. You can run at any speed with nothing connected to test the software side.
|
As for the time taken - I thought: first get the basics right (which lots of people will be using) before we continue the DMA portion. |
quick test: 48MHz:
64MHz:
64MHz did not show any image on the display (but It might be too fast for that display) in both cases I first ran a fbi to display an image. So it seems to me as if the overhead for DMA becomes expensive!
For now I want to remove that memory remapping overhead and report spi statistics via sysfs... |
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "1 = Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "1 = Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "1 = Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Make use of the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de> Link: https://lore.kernel.org/r/b2286c904408745192e4beb3de3c88f73e4a7210.1568187525.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de> Link: https://lore.kernel.org/r/b2286c904408745192e4beb3de3c88f73e4a7210.1568187525.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
The BCM2835 DMA controller is capable of synthesizing zeroes instead of copying them from a source address. The feature is enabled by setting the SRC_IGNORE bit in the Transfer Information field of a Control Block: "Do not perform source reads. In addition, destination writes will zero all the write strobes. This is used for fast cache fill operations." https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf The feature is only available on 8 of the 16 channels. The others are so-called "lite" channels with a limited feature set and performance. Enable the feature if a cyclic transaction copies from the zero page. This reduces traffic on the memory bus. A forthcoming use case is the BCM2835 SPI driver, which will cyclically copy from the zero page to the TX FIFO. The idea to use SRC_IGNORE was taken from an ancient GitHub conversation between Martin and Noralf: msperl/spi-bcm2835#13 (comment) Tested-by: Nuno Sá <nuno.sa@analog.com> Tested-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Kauer <florian.kauer@koalo.de>
I have started to look at spi-bcm2835 and DMA. My FBTFT drivers are now included in raspberrypi/linux, but I need a DMA SPI driver there as well, so I can stop releasing my custom kernel.
To get DMA channels from Device Tree, I had to make a change to drivers/dma/bcm2708-dmaengine.c first. With that in place this patch gives me channels and the subsystem now calls into can_dma()
I used this commit to get me going: raspberrypi/linux@f62cacc
This is also helpful: https://github.com/raspberrypi/linux/blob/rpi-3.18.y/drivers/mmc/host/bcm2835-mmc.c
There's of course a lot missing here, but since you are much more versed in SPI and DMA, I'm wondering how far down on your list DMA is now?
bcm2708-dmaengine.c is in need of a cleanup + patching to get this to work, and bcm2835-dma.c will need a patch to get slave config support, so there's a bit of work here before we are home with SPI DMA in mainline.
The nice thing about the can_dma feature, is that the subsystem does the DMA mapping, which means we should be able to get DMA transfer with spidev as well. Since all transfers are checked to see if they are eligble for DMA. http://lxr.free-electrons.com/ident?i=spi_map_buf
This will also fix my problem of sending a SPI message with a 1 byte transfer and a 4k transfer.
This currently fails in spi-bcm2708 dma (crashes driver), but when transfers are looked at individually I guess it will work. Currently I work around this byte prepending by copying to a new buffer.
The text was updated successfully, but these errors were encountered: