32 MHz of SPI #103

Closed
rafistolique opened this issue Aug 9, 2019 · 7 comments

@rafistolique

Hello and thank you already for your efforts. I have so many questions, but I will keep only two.

1: https://www.mantec.be/en/finders/46445-tactile-screen-tft-3-5-spi--480x320--for-raspbe-1435700000002.html?search_query=raspberry&results=162
The listing advertises a maximum SPI speed of 32 MHz.
It is a 3.5 inch display (I know that is far from ideal ...) running on a Raspberry Pi 3B.

But I managed to make it work thanks to Juj. Thank you.

But after several tests I find that I cannot push it beyond 23 MHz (SPI) without graphical glitches.
Is it normal that I cannot get close to 32 MHz? Why? Is the 32 MHz figure a lie from the manufacturer, am I doing something wrong, or is this normal? At a 400 MHz core clock I reach a divisor of 18, i.e. 22.222 MHz SPI. (Above 23 MHz it always crashes.)

After two days of learning how to use a console (PuTTY), basic commands, cmake, make -j, the RetroPie directory tree and so on, and after hours of reading, I finally managed to get an image on the screen.

I used the driver option -DWAVESHARE35B_ILI9486=ON (Waveshare 3.5" 480x320).
(What is strange is that on Recalbox the only driver that worked was the 35a, not the B.)
So my question: how do I know whether it is a "copy" of a Waveshare board? Should I look at other drivers?
Is this the right one? (No drivers were supplied with the display.)

Sorry, three days ago I had never touched a command line. Thanks for reading.

@juj
Owner

juj commented Aug 9, 2019

But after several tests I find that I cannot push it beyond 23 MHz (SPI) without graphical glitches.
Is it normal that I cannot get close to 32 MHz?

Yes, it can be normal. Assuming that the display is indeed an ILI9486, the manufacturer's spec sheet only promises speeds up to 15.15 MHz. This can be computed from the timings listed in the ILI9486 spec sheet, section 17.3.3, "Display Serial Interface Timing Characteristics (4-line SPI system)" (page 301):

[Screenshot: timing table from section 17.3.3 of the ILI9486 spec sheet]

There the Serial Clock Cycle must take at least 66 nanoseconds. 1 second = 10^9 nanoseconds, so 10^9 ns / 66 ns = 15151515.1515 Hz, or 15.15 MHz. That is the maximum speed at which the ILI9486 manufacturer guarantees the display will work. Anything higher than that comes down to good luck. If you reach 22 MHz, that should be considered fairly good. There was another thread at #92 where the author was also only able to use CDIV 18, while CDIV 16 and smaller would break, so this seems to be quite common, depending on manufacturing characteristics, batch quality, etc.
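As a rough illustration of the arithmetic above, here is a minimal C++ sketch. The function names are hypothetical and not part of fbcp-ili9341, and it assumes the clock divisor must be an even number. It derives the guaranteed maximum clock from the 66 ns cycle time and finds the smallest divisor that stays within a given limit.

```cpp
#include <cstdint>

// Maximum SPI clock in Hz implied by a minimum serial clock cycle time in ns.
constexpr std::uint64_t MaxClockHz(std::uint64_t minCycleNs) {
  return 1000000000ULL / minCycleNs;
}

// SPI bus clock in Hz for a given core clock and clock divisor (CDIV).
constexpr std::uint64_t SpiClockHz(std::uint64_t coreHz, unsigned cdiv) {
  return coreHz / cdiv;
}

// Integer division rounded up.
constexpr std::uint64_t CeilDiv(std::uint64_t a, std::uint64_t b) {
  return (a + b - 1) / b;
}

// Smallest even divisor that keeps the bus at or below maxHz.
// Assumption: the divisor must be even; coreHz and maxHz are non-zero.
constexpr unsigned SmallestEvenDivisor(std::uint64_t coreHz, std::uint64_t maxHz) {
  return static_cast<unsigned>(CeilDiv(coreHz, maxHz) + (CeilDiv(coreHz, maxHz) & 1));
}

// 66 ns per cycle gives the 15.15 MHz figure quoted from the spec sheet.
static_assert(MaxClockHz(66) == 15151515, "66 ns cycle -> 15.15 MHz");
// A 400 MHz core with CDIV 18 gives the 22.2 MHz reported in this thread.
static_assert(SpiClockHz(400000000, 18) == 22222222, "400 MHz / 18");
```

For example, `SmallestEvenDivisor(400000000, MaxClockHz(66))` evaluates to 28, i.e. about 14.3 MHz, which is what staying strictly within the spec sheet would require at a 400 MHz core clock.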

how do I know whether it is a "copy" of a Waveshare board? Should I look at other drivers?

If -DWAVESHARE35B_ILI9486=ON worked for the display, then to my knowledge it is guaranteed to be an ILI9486. I have not seen any other display controller that is not an ILI9486 but happens to work with the -DWAVESHARE35B_ILI9486=ON option; the driver is specific to the ILI9486.

If you want to be 100% sure, I do not know of another reliable way to identify it, other than contacting the manufacturer, if possible, to ask what hardware they put in the display.

Hope you have fun with it!

@rafistolique
Author

rafistolique commented Aug 9, 2019

Yes I have a lot of fun with this little gem.

But I am especially impressed by the speed of your answers. Your responsiveness is unmatched; you are always kind and helpful, and your answers are always very complete. For that I thank you very much, it is really nice. Thank you.

Coming from the official 7-inch screen, which works without drivers, I wrongly assumed this one would be just as easy (in Recalbox it is handled right away). Which makes me wonder, since I already have two bartop cabinets: does the official 7-inch screen support the fbcp-ili9341 driver? (If I can gain another two or three FPS, I will not say no :D)

At the store (see link) the 3.5 inch options in stock were this screen and another, even cheaper one from the Velleman brand (which did not inspire confidence).
On the data sheet (see link; switch the page to French for the details of the screen), it is listed as "SPI 32 MHz max". I wrongly thought that was a guarantee of "quality", even though at that moment I had only a poor idea of what SPI was.
I have only tested Quake 3; I am going to test less demanding games. Quake is probably not representative. But I am still considering changing the screen for my project (to be confirmed after more complete testing).
I cannot fall back on an HDMI version: I need to mount the Raspberry Pi a few centimeters away from the screen, and the GPIO connector with a small flat cable is perfect for that.

One last question (sorry for telling my life story and taking your time): which would you choose, 370/16 = 23 or 256/12 = 23?

Thank you again, and good evening. Sorry for my amateurism and my poor English. Thanks also to Google Translate.

@juj
Owner

juj commented Aug 10, 2019

Thank you for being kind as well. Google translate works well here, no problems at all.

Coming from the official 7-inch screen, which works without drivers, I wrongly assumed this one would be just as easy (in Recalbox it is handled right away). Which makes me wonder, since I already have two bartop cabinets: does the official 7-inch screen support the fbcp-ili9341 driver? (If I can gain another two or three FPS, I will not say no :D)

The official Raspberry Pi 7 inch screen uses the special MIPI-DSI connector, which only supports the official 7 inch screen. The support is implemented directly in the GPU firmware, which is why it does not need drivers, but unfortunately it is a closed specification, so no other displays can utilize it. The fbcp driver therefore does not apply to the 7 inch screen, since that screen is driven straight from the GPU.

On the data sheet (see link; switch the page to French for the details of the screen), it is listed as "SPI 32 MHz max". I wrongly thought that was a guarantee of "quality", even though at that moment I had only a poor idea of what SPI was.

Oh indeed, now I notice that when I change the language to French, the page lists the specifications in English, and they do say that 32 MHz is supported. This means that either fbcp-ili9341 is unable to get the timings exactly right for the display to run properly at 32 MHz, or the display does not perform as advertised, which can be grounds to return it as defective.

Things you can try:

  • if the manufacturer provides a driver, use that to see what performance is like. Use a hardware scope to analyze what bus speed is actually achieved with that driver.
  • use the upstream fbtft/fbcp driver (it may be that the manufacturer does not provide a driver of their own and simply uses this one), and set its speed to 32MHz. Note that due to the heuristic way that fbtft computes the bus speed, it ends up rounding down the actually achieved speed compared to what is specified in /boot/config.txt. So setting 32MHz might not yield that speed in practice, which is why using a scope to analyze the bus is definitely needed in this case.
  • try to debug fbcp-ili9341 to see if there may be something off with its timings. Theoretically, there are three places that I can think of in fbcp-ili9341 that affect timings. One is the idle 9th clock cycle after each transmitted byte. Fbcp-ili9341 removes that to speed up the display, but perhaps that can cause synchronization issues. You can try modifying the code as follows:
diff --git spi.cpp
@@ -98,9 +98,10 @@ void SetRealtimeThreadPriority()
 // Errata to BCM2835 behavior: documentation states that the SPI0 DLEN register is only used for DMA. However, even when DMA is not being utilized, setting it from
 // a value != 0 or 1 gets rid of an excess idle clock cycle that is present when transmitting each byte. (by default in Polled SPI Mode each 8 bits transfer in 9 clocks)
 // With DLEN=2 each byte is clocked to the bus in 8 cycles, observed to improve max throughput from 56.8mbps to 63.3mbps (+11.4%, quite close to the theoretical +12.5%)
 // https://www.raspberrypi.org/forums/viewtopic.php?f=44&t=181154
-#define UNLOCK_FAST_8_CLOCKS_SPI() (spi->dlen = 2)
+#define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0)
 
 #ifdef ALL_TASKS_SHOULD_DMA
 bool previousTaskWasSPI = true;
 #endif

Remove the line marked with a - and add the line marked with a +. Then rebuild and retest to see whether faster CDIV settings now work.

  • a second place that affects timings is the ALL_TASKS_SHOULD_DMA option. Edit
diff --git config.h
@@ -91,7 +91,7 @@
 
 // If defined, DMA usage is foremost used to save power consumption and CPU usage. If not defined,
 // DMA usage is tailored towards maximum performance.
-// #define ALL_TASKS_SHOULD_DMA
+#define ALL_TASKS_SHOULD_DMA

and rebuild, and see if that can help performance.

  • a third possible way to affect timings is to disable DMA by setting -DUSE_DMA_TRANSFERS=OFF in CMake command line. DMA and polled mode have different timing characteristics.

I have not seen in practice that changing any of the above three items affects the achievable CDIV value, but if I were in doubt and double-checking, I would start with those.

If you do find that changing some of the above can result in improved total bus speed, I'd be curious to know.

which would you choose, 370/16 = 23 or 256/12 = 23?

Definitely the first option: 370MHz / 16 is better, since it gives a faster Pi CPU core speed. However, if you are running a really light load (e.g. NES game emulation or a mostly idle command line terminal), then due to issue raspberrypi/firmware#992 (see also [1]), that 370MHz will drop down to 200MHz when the CPU is under low load, resulting in a 200/16 = 12.5MHz bus speed.

If you are mostly running very idle tasks, using 256 MHz / 12 can yield better overall bus speed, since when the Pi idles the bus speed will go down to 200 MHz / 12 = 16.67 MHz, which is faster than 12.5MHz.

If you are forcing turbo speed, or are running moderately CPU intensive tasks, e.g. games, then 370/16 will definitely be better.
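The trade-off between the two options can be sketched in a few lines of C++. This is only an illustration: the names are hypothetical, and the 200 MHz idle core clock is taken from the discussion above rather than measured. Note that 256 / 12 works out to about 21.3 MHz rather than 23 MHz.

```cpp
#include <cstdint>

// Core clock the firmware is said to fall back to under low load (from this thread).
constexpr std::uint64_t kIdleCoreHz = 200000000;

// SPI bus clock in Hz for a given core clock and divisor (CDIV).
constexpr std::uint64_t BusClockHz(std::uint64_t coreHz, unsigned cdiv) {
  return coreHz / cdiv;
}

// Option A: 370 MHz core, CDIV 16.
static_assert(BusClockHz(370000000, 16) == 23125000, "under load: 23.125 MHz");
static_assert(BusClockHz(kIdleCoreHz, 16) == 12500000, "idle: 12.5 MHz");

// Option B: 256 MHz core, CDIV 12.
static_assert(BusClockHz(256000000, 12) == 21333333, "under load: about 21.33 MHz");
static_assert(BusClockHz(kIdleCoreHz, 12) == 16666666, "idle: about 16.67 MHz");
```

Under these assumptions option A is faster whenever the core runs at its configured speed, while option B keeps more bus bandwidth when the core has dropped to its idle clock.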

@rafistolique
Author

Hello and thank you for your answer.
Regarding the official 7 inch screen: OK, now I am a little less stupid.

I must confess I am not sure I understand everything, especially the oscilloscope part... but I do understand the three proposed tests, which I am going to explore now.
Regarding the supplier, I think it is this one: www.haitronic.com
But they do not provide anything: I cannot find exactly the same screen as mine, and their Chinese site looks "basic", to put it politely. That avenue is a dead end.

I will not go into detail for the moment. I will modify the code to test the suggested leads, then come back and explain the results as best I can.

Before the tests I have already noted two or three small points that arouse my curiosity, but I will keep those questions for later; maybe the tests will answer them by themselves.

Thanks again for all this information, which makes us a little less stupid day after day.

@rafistolique
Author

rafistolique commented Aug 11, 2019

Hello, here I am after multiple tests, and the results are, we can say, interesting.

I tried a lot of things, too many to explain, so I will just give the final result.

In summary: by changing the macro to #define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0) I managed to improve the SPI speed. My record without glitches:

360/10 = 36 MHz SPI!!!! And yes, that changed everything.
At first I thought, wrongly, that I had to combine #define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0) AND -DUSE_DMA_TRANSFERS=OFF. But if you confirm that, to drop the -DUSE_DMA_TRANSFERS=OFF option, it is enough to rebuild without it, then I can say that only the #define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0) change is needed to unlock the SPI.

In summary, I think I made the following changes:
#define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0) and -DWAVESHARE35B_ILI9486=ON -DSPI_BUS_CLOCK_DIVISOR=10

As for the modification in config.h, it seems incompatible with the -DUSE_DMA_TRANSFERS=OFF option. I confess I did not look further, since 36 MHz SPI had already been reached.

I should mention that both before and after the modification I always had force_turbo=1 enabled (but it also works well with that setting disabled).

My impressions: before the modification the FPS was not at all constant, and ghosting was more noticeable. Now the frame rate is perhaps lower at times, but it is more constant and the image looks better (almost no ghosting). I was able to re-enable the interlacing feature (message #83).
Before the modification it was no longer active.
The FPS counter is now displayed in red almost all the time. It seems to me that it was more often white before the changes (but in practice it works well). I do not know what the red values mean in the FPS display.

That said, it was late and I ran so many tests that I am not entirely sure of myself. (I will come back to this.)
What I do know is that I like the current configuration a lot, and it is nothing like what I had at the beginning.

I am going to do a clean reinstallation. I have a small problem that may have nothing to do with the change: the shutdown and reboot commands no longer work properly and make RetroPie freeze. I cannot turn it off cleanly.

There it is. Thanks for the leads, they are effective. I do not know exactly what is going on, but I know you are right.

If you have questions, tests for me to run, or anything else, do not hesitate. One more reinstallation is no trouble for me.

Thanks again.

Edit:

I forgot:
With or without the modification, I have a recurring symptom. I have the impression that, whatever the game, and also in the RetroPie menu, it takes a moment for the machine to reach its full power. It is hard to explain, but if I navigate the RetroPie menu slowly, I get fewer frames per second than when I flick through the menu in all directions.
The same happens in Sonic, Earthworm Jim and Ghouls N Ghosts, where it is most visible, but it happens everywhere: the more animation there is, the better it runs. I exaggerate a bit, but it is to make myself clear. Earthworm Jim gets more FPS when I fire heavy weapons in all directions, and in Sonic, when the hedgehog is launched by a piston and flies quickly in all directions, it runs better than when I jump on the spot with no enemies around me, or after I have stayed still for a while.

It is not very bothersome, but if perfection is not far off, why go without... I am already VERY happy with this progress. I no longer regret buying this screen.

    • 40/50 fps in F-Zero with an 8BitDo ... I'm going crazy ^^ Thanks ^^

EDIT 2
After reinstallation I cannot get a good image at 36 MHz SPI unless I use -DUSE_DMA_TRANSFERS=OFF again.

@juj
Owner

juj commented Aug 11, 2019

360/10 = 36 MHz SPI

Wow, that is absolutely great! Really interesting to know that the idle 9th bit is required for that ILI9486 display. I think I'll need to make it a build option so people know to test both ways.
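To put this result in perspective, here is a rough C++ sketch that compares payload throughput with and without the idle 9th clock cycle. The function name is hypothetical and not part of fbcp-ili9341, and the model assumes every byte costs exactly 8 or 9 bus clocks and ignores all other overhead.

```cpp
#include <cstdint>

// Payload bits per second on a bus running at busHz when each 8-bit byte
// takes clocksPerByte bus clocks (8, or 9 with the idle cycle).
constexpr std::uint64_t PayloadBitsPerSecond(std::uint64_t busHz, unsigned clocksPerByte) {
  return busHz * 8 / clocksPerByte;
}

// 360 MHz / 10 = 36 MHz with the idle 9th clock: 32 Mbit/s of payload.
static_assert(PayloadBitsPerSecond(36000000, 9) == 32000000, "36 MHz, 9 clocks per byte");
// 400 MHz / 18 = 22.22 MHz without the idle clock: about 22.2 Mbit/s of payload.
static_assert(PayloadBitsPerSecond(22222222, 8) == 22222222, "22.2 MHz, 8 clocks per byte");
```

Under these assumptions, 36 MHz with the idle cycle still moves roughly 44% more payload than 22.2 MHz without it, so giving up the 8-clock optimization is a net gain on this display.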

Can you confirm that if you set -DUSE_DMA_TRANSFERS=OFF but keep #define UNLOCK_FAST_8_CLOCKS_SPI() (spi->dlen = 2) enabled, that configuration does not work, i.e. that both disabling DMA and 8 clocks SPI are definitely required?

I do not know what the red values mean in the FPS display.

Red values mean that some rendered frames were dropped and not output on the display at all, due to the SPI bus speed limit.
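A back-of-the-envelope C++ sketch shows why dropped frames are to be expected on this kind of display. It assumes 16 bits per pixel (RGB565) and that every frame is sent in full; both are assumptions for illustration, and the function names are hypothetical.

```cpp
#include <cstdint>

// Number of bits in one full frame.
constexpr std::uint64_t FrameBits(unsigned width, unsigned height, unsigned bitsPerPixel) {
  return static_cast<std::uint64_t>(width) * height * bitsPerPixel;
}

// Upper bound on full-frame updates per second for a given payload bit rate.
constexpr double MaxFullFramesPerSecond(std::uint64_t payloadBitsPerSecond,
                                        unsigned width, unsigned height,
                                        unsigned bitsPerPixel) {
  return static_cast<double>(payloadBitsPerSecond) /
         static_cast<double>(FrameBits(width, height, bitsPerPixel));
}

// A 480x320 frame at 16 bits per pixel is 2,457,600 bits.
static_assert(FrameBits(480, 320, 16) == 2457600, "480x320 at 16 bpp");
```

With 32 Mbit/s of payload (a 36 MHz bus at 9 clocks per byte), `MaxFullFramesPerSecond(32000000, 480, 320, 16)` comes out at about 13 full frames per second. Higher frame rates are only reachable when less than the whole screen has to be sent for each frame, which is presumably where options such as interlacing come in.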

to drop the -DUSE_DMA_TRANSFERS=OFF option, it is enough to rebuild without it

EDIT 2
After reinstallation I cannot get a good image at 36 MHz SPI unless I use -DUSE_DMA_TRANSFERS=OFF again.

This confusing observation may be due to the way CMake works. If you first specify -DUSE_DMA_TRANSFERS=OFF and then rerun CMake without the directive set at all, it does not actually turn DMA transfers back on, because the previous value is remembered in CMakeCache.txt. You must either explicitly specify -DUSE_DMA_TRANSFERS=ON or delete the CMakeCache.txt file in between.

it runs better than when I jump on the spot with no enemies around me, or after I have stayed still for a while.

This may be due to the interlacing option or the battery saving option. Try removing

#define SAVE_BATTERY_BY_SLEEPING_WHEN_IDLE
or uncommenting
// #define NO_INTERLACING

Overall, really nice findings here, thanks for the investigations, learned something myself today.

@rafistolique
Author

rafistolique commented Aug 12, 2019

Can you confirm that if you set -DUSE_DMA_TRANSFERS=OFF but keep #define UNLOCK_FAST_8_CLOCKS_SPI() (spi->dlen = 2) enabled, that configuration does not work, i.e. that both disabling DMA and 8 clocks SPI are definitely required?

Yes I confirm.

After a fresh install I started the Raspberry Pi with everything set for 36 MHz SPI plus -DUSE_DMA_TRANSFERS=OFF, and it did not work. I replaced the line:

#define UNLOCK_FAST_8_CLOCKS_SPI() (spi->dlen = 2)
with this one:
#define UNLOCK_FAST_8_CLOCKS_SPI() ((void)0)

and it worked.
Several times I reverted the change and it stopped working; several times I reapplied it and it worked again.
It is clear that this change is mandatory, as is the -DUSE_DMA_TRANSFERS=OFF parameter.

I have not yet tested anything for my other little problem. In any case, thank you for your help.

Overall, really nice findings here, thanks for the investigations, learned something myself today.

My pleasure. If I can make myself useful while having fun, what more could I ask for? Do not hesitate to ask.

@juj juj closed this as completed Dec 8, 2020