GPIO UART(S) dropping characters #1017

Open
ghmpi opened this issue Jul 10, 2018 · 31 comments

@ghmpi

ghmpi commented Jul 10, 2018

Later versions of Raspbian (around 4.14.34-v7+ and beyond) seem to drop received characters from the serial UART, approximately 4 characters every second or two at 19200 baud. The LITE version has NO PROBLEMS; the problems show up only on the full desktop versions. Dropping GPU memory to 16 MB or below seems to solve the problem as well. More details at this post:

https://www.raspberrypi.org/forums/viewtopic.php?t=217702

Thank you!

@pelwell
Contributor

pelwell commented Jul 11, 2018

The fact that this seems to affect both UARTs equally makes me suspect that this isn't a UART hardware or driver problem. Remember that both UARTs have been extensively tested as part of the Bluetooth support on the WiFi Pis - I found and fixed at least two bugs in the PL011/ttyAMA0 driver.

If you drop the GPU RAM to 16MB you will be selecting the start_cd (cutdown) variant which lacks 3D, codec and camera support, but that shouldn't make any difference unless you are trying to use those features.

  1. What is the transmitter in your setup?

  2. Which application is receiving the data?

  3. Are you using RTS/CTS flow control on pins 16 and 17? You will need to find a way to set the pin functions - raspi-gpio, or one of the many other GPIO manipulation utilities and libraries - you want Alt function 3 for UART0/PL011/ttyAMA0, and Alt function 5 for UART1/8250/ttyS0.

Start your testing with the best UART - the PL011 that appears as ttyAMA0. On a WiFi Pi you will need to first relieve it of its Bluetooth duties by adding dtoverlay=pi3-miniuart-bt to config.txt (and rebooting). 19200 baud to the PL011 should give you an interrupt latency tolerance of about 1ms before data is lost - this ought to be plenty of time.
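
As a rough check on the latency figure above, the sketch below works out how long one character takes on the wire and how long the receiver has before a given number of free FIFO slots fills up. The 10-bit frame (8N1) and the free-slot counts are assumptions for illustration; the thread does not state which FIFO trigger level is in use.

```python
def char_time_ms(baud, bits_per_char=10):
    """Time on the wire for one character; 10 bits assumes 8N1 framing."""
    return 1000.0 * bits_per_char / baud


def headroom_ms(baud, free_slots, bits_per_char=10):
    """Time until `free_slots` more characters arrive and the FIFO overflows."""
    return free_slots * char_time_ms(baud, bits_per_char)


if __name__ == "__main__":
    for baud in (19200, 115200, 921600):
        # Hypothetical cases: 2 or 8 slots still free when the interrupt fires.
        print(baud,
              round(char_time_ms(baud), 4),
              round(headroom_ms(baud, 2), 4),
              round(headroom_ms(baud, 8), 4))
```

Under these assumptions one character lasts about 0.52 ms at 19200 baud, so two free slots give roughly 1 ms of tolerance; at 921600 baud the same two slots give only about 0.02 ms.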

You can also narrow down the contributing factors by running the full Raspbian in console mode - see the Boot options in the raspi-config utility and corresponding "Preferences -> Raspberry Pi Configuration" GUI application.

@ghmpi
Author

ghmpi commented Jul 11, 2018

I've tried dtoverlay=pi3-miniuart-bt (the other UART) as well, no change. I've run full Raspbian in console mode as well, no change. The transmitter was originally an Atmel MCU, but to take that out of the loop I've since been using a plain old jumper between the TX and RX lines and watching the characters come back.

The original application was using python, but since using the loopback echo I've been using minicom and screen to access the port, both similar results. Here are some tests I completed just pasting a line and watching it echo back.. the number of characters lost is very predictable, but curiously changes between using minicom and screen. I can tell you I could not get any character loss at 2400! Here are the results. Open it with a fixed font editor, the first line of each test is the original line sent followed by lines where characters were missing.

rpi-uart-tests.txt

Note that I have swapped between a 3 and 3+ RPI as well, no change. I've swapped multiple power supplies, even went as far as unplugging all USB and even the HDMI screen, in case something was interfering. The few things I've been able to narrow down are that I've never seen an issue with any version of STRETCH LITE OR any FULL STRETCH version prior to 2018-04-18-raspbian-stretch. It all seems to start with the full version of 2018-04-18-raspbian-stretch. I actually ran through and tested ALL FULL versions of STRETCH from the first 2017-08-16 till current. Also, an older version that does not have the problem, starts having the problem as soon as an update is done.

It still could be my setup, sure, but I'm running out of things to test.

Thank you for any help you can provide.

@ghmpi
Author

ghmpi commented Jul 11, 2018

An update.. on the newest kernel I was able to get the full uart to work reliably by adding..

gpu_mem=8
arm_freq=600
arm_freq_min=600

to /boot/config.txt

Changing GPU memory to 64 but keeping the 600 arm_freq and arm_freq_min caused it to fail again. I switched back and forth between these configs several times to confirm the results. This is on a 3+.

detailed results...

Linux raspberrypi 4.14.52-v7+ #1123 SMP Wed Jun 27 17:35:49 BST 2018 armv7l GNU/Linux
/boot/config.txt settings comparison

#NOT dropping characters, using full uart and locking arm_freq, 8 MEG GPU
enable_uart=1
gpu_mem=8
dtoverlay=pi3-miniuart-bt
arm_freq=600
arm_freq_min=600

#dropping characters, using miniuart and locking arm_freq, 8 MEG GPU
enable_uart=1
gpu_mem=8
#dtoverlay=pi3-miniuart-bt
arm_freq=600
arm_freq_min=600

#dropping characters, using full uart but free arm_freq, 8 MEG GPU
enable_uart=1
gpu_mem=8
dtoverlay=pi3-miniuart-bt
#arm_freq=600
#arm_freq_min=600

#dropping characters, using full uart, locked arm_freq, but 64MEG on GPU
enable_uart=1
gpu_mem=64
dtoverlay=pi3-miniuart-bt
arm_freq=600
arm_freq_min=600

@pelwell
Contributor

pelwell commented Jul 12, 2018

You didn't answer my question about flow control, but I suspect the answer would be no.

I spend quite a bit of time controlling a Pi (usually a 3 or 3+) via the UART at 115200 baud, and I haven't noticed any corruption, so I constructed a text file containing many thousands of lines of your test string and, running in a shell over the UART I typed:

$ cat > foo.txt

then pasted the text from my PC's clipboard. Once it had finished dribbling across I analysed the results:

pi@raspberrypi:~$ sort foo.txt  | uniq -c
  21760 |__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__END

i.e. all 21760 lines were received correctly.

Next (OK - if I'm honest, the second time around, after it failed the first time) I disabled the console on ttyAMA0. Then I connected a patch cable from TX to RX. In one GUI terminal window I ran:

pi@raspberrypi:~ $ stty -F /dev/ttyAMA0 115200 raw -echo
pi@raspberrypi:~ $ cat /dev/ttyAMA0 > foo2.txt

and in another I ran:

pi@raspberrypi:~ $ cat foo.txt >/dev/ttyAMA0

After the second, outbound cat completed I stopped the first with Ctrl-C. Comparing the two files I found they were identical:

pi@raspberrypi:~ $ diff foo.txt foo2.txt
pi@raspberrypi:~ $

Repeating the test at 921600 baud resulted in a single cluster of 8 dropped bytes, which isn't very surprising without flow control.

At this point I suggest you repeat my tests with cat - leave out minicom - and see how your results compare. If you do get a difference, the sort command above is useful for summarising the corruptions.
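
For anyone repeating the cat test, here is a minimal Python sketch that builds the test text and summarises a received copy in the same spirit as `sort | uniq -c`. The helper names are made up for this example, and the simulated 8-character loss is only there to show what the summary looks like.

```python
from collections import Counter

# Test string as it appears intact in the `sort | uniq -c` output above.
UNIT = "|__--==~~==--__"
LINE = UNIT * 7 + "END"


def make_test_text(n_lines):
    """Build the text to send: n_lines copies of LINE, newline-terminated."""
    return (LINE + "\n") * n_lines


def summarise(received_text):
    """Return (good_count, Counter of corrupted lines)."""
    counts = Counter(received_text.splitlines())
    good = counts.pop(LINE, 0)
    return good, counts


if __name__ == "__main__":
    sent = make_test_text(5)
    # Simulate a loss: drop 8 characters from the first line of the stream.
    received = sent[:100] + sent[108:]
    good, bad = summarise(received)
    print("good lines:", good)
    for line, n in bad.items():
        print(n, repr(line))
```

To use it on real data, write `make_test_text(20000)` to foo.txt, run the loopback, then pass the contents of foo2.txt to `summarise`.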

@ghmpi
Author

ghmpi commented Jul 12, 2018

Correct, no flow control was used.

I repeated with your cat tests and I still see dropped characters. I even re-imaged a disk with a brand new 2018-06-27-raspbian-stretch and repeated, making only minimal changes. I repeated it again with the wifi off, in case that changed anything: similar results. 115200 baud, as you did above.

You are using a full desktop release of Raspbian, correct? Because I don't have any issues with any version of the lite non-desktop versions, I repeated your test on a stretch lite copy as well, results were identical files with no dropped characters.

=======================================================

enable_uart=1
gpu_mem=64
dtoverlay=pi3-miniuart-bt
#arm_freq=600
#arm_freq_min=600

pi@raspberrypi:~/uart-test $ sort foo-should-fail-2.txt | uniq -c
1 |=~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
-==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|-=~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==----====--|--====--|--====--|--====--|--====--END
1 |
--==
==--|--====--|--====|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|__--==~~==--
|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--==--|--====--END
1 |
--====--|--====--|--====--|--====--|--====--|_=--|__--====--END
1 |
--====--|--====--|--====--|--====--|--====--|--====--|--====-END
1 |
--==
==--|--====--|--====--|--====--|--===|--====--|--====--END
1 |
--====--|--====--|_--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
-
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--__END
1 _--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
19811 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|__--==~~==--__END
pi@raspberrypi:~/uart-test $

===================================

brand new image of 2018-06-27 Stretch full
modifications made:
custom wpa_supplicant.conf copied to boot
ssh file in boot
remove serial console from cmdline.txt
and added the following 3 lines in /boot/config.txt

enable_uart=1
gpu_mem=64
dtoverlay=pi3-miniuart-bt

pi@raspberrypi:~/uart-test-new $ sort newtest.txt | uniq -c
1 |=~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--=~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==--|--====--|--====--|--====--|--====--|--====--END
1 |
--==
==--|--====--__--====--|--====--|--====--|--====--|--====--END
1 |
--====--|--====--__
==--|--====--|--====--|--====--|--====--END
1 |
--====--|--====--|--|--====--|--====--|--====--|--====--END
1 |
--====--|--====--|--====--|===--|--====--|--====--|--====--END
1 |
--==
==--|--====--|--====--|--===--|--====--|--====--|--====--END
1 |
--==
==--|--====--|--====--|--====-|__--====--|--====--|--====--END
1 |
--====--|--====--|--====--|--====--===--|--====--|--====--END
1 |
--==
==--|--====--|--====--|--====--|--====--|--===--|--====--END
1 |
--==
==--|--====--|--====--|--====--|--====--|--==~~==--__
==--END
1 |
--====--|--====--|--====--|--====--|--====--|--====--|_--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~=|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==----====--|--====--END
1 |
--====--|--====--|--====--|--====--|--==_--====--|--====--END
1 |
--==
==--|--====--|--====--|_--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~=|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==_--====--|--==~~==--|--==~~==--|--==~~==--|--====--END
1 |
--==
==--
|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~D
18223 |
--====--|--====--|--====--|--====--|--====--|--====--|--====--END
1 |
--==
==--|--====--|--====--|--====--|--====--|--====--|--====--END|--====--|--====--|--====--|--====--|--====--|--====--|--====--END
1 |
--==
==--|--====--|--====--|--====--|--====--|--====--|--====--END--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|--==~~==--|_--==~~==--__END
pi@raspberrypi:~/uart-test-new $

===================================

@pelwell
Contributor

pelwell commented Jul 12, 2018

3B+, WiFi enabled, full clean Raspbian 2018-06-27, booted to desktop, same modifications as you, 115200 and 921600 - all give correct results. If I stress it enough it will drop data - that's what flow control is for - but I don't see anything untoward.

@burtyb

burtyb commented Jul 12, 2018

I'm seeing this too, sending even one byte more than the hardware buffer size (8/16 bytes) to the Pi seems to be hit/miss with current Raspbian with desktop kernel.

@pelwell
Contributor

pelwell commented Jul 12, 2018

You may well be seeing a problem, but there is no such thing as a "desktop kernel" - all configurations of Raspbian use the same kernel (the one that matches the hardware). If you have a test script that provokes the data loss then please post it here.

@ghmpi
Author

ghmpi commented Jul 12, 2018

Any lite version has no problems.
Any desktop version 2018-04-18-raspbian-stretch or later has problems unless I add this to config.txt:
gpu_mem=8
dtoverlay=pi3-miniuart-bt
arm_freq=600
arm_freq_min=600

I've tried everything with my hardware I can imagine, 3 power supplies, 2 different Pi, USB boot, SD card boot, removing HDMI and all USB peripherals, turning off wifi. I'm running out of ideas on my own.
There must be some sort of interaction somewhere, I can't explain it, but lite works so consistently and desktop does not (after a certain version), I don't know what to make of it.

I have no script. I repeated the tests with your simple cat commands on a brand new 2018-06-27-raspbian-stretch with minimal changes as stated in my comment above, that is as close to a repeatable/simple test I have done.

I guess if there is nothing else to try we just wait and see if anyone else reports anything. We have one more so far (burtyb).

Thanks!

@ghollingworth
Contributor

Have you tried switching to powersave mode to make sure the clocks are not causing a problem?

just need to:

echo "powersave" | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

@ghmpi
Author

ghmpi commented Jul 12, 2018

Gordon,

I just tried the powersave, no change on my side. Thanks for the suggestion.

Regards,
-Moses

@ghollingworth
Contributor

OK, Can you list exactly the minimum sequence of steps you need to reproduce the problem assuming you just loop back the UART on the GPIO connector?

List which pins you connect, anything you change in config.txt from the clean Raspbian download (including a link). Are you using stty to set the baud rate and parameters?

Gordon

@ghmpi
Author

ghmpi commented Jul 13, 2018

OS: 2018-06-27-raspbian-stretch.img
modifications made:
- install a jumper on GPIO TX and RX pins
- basic wpa_supplicant.conf copied to boot
- "ssh" file created in boot
- remove serial console from cmdline.txt
- add the following 3 lines to /boot/config.txt
enable_uart=1
gpu_mem=64
dtoverlay=pi3-miniuart-bt

created a file named foo.txt with 20,000 or so lines of this..
|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__|__--==~~==--__END

IN ONE TERMINAL RUN...
pi@raspberrypi:~ $ stty -F /dev/ttyAMA0 115200 raw -echo
pi@raspberrypi:~ $ cat /dev/ttyAMA0 > foo2.txt

IN ANOTHER TERMINAL RUN...
pi@raspberrypi:~ $ cat foo.txt >/dev/ttyAMA0
Wait for a bit

the following works good for seeing bad lines..
pi@raspberrypi:~ $ sort foo2.txt | uniq -c

Hopefully you guys can confirm this.. otherwise I'm packing up, driving 100 miles east and trying this again.

Regards,
-Moses

@ghollingworth
Contributor

Is that the minimum set of changes required to reproduce the problem?

Do you really need to change gpu_mem to trigger the problem?
Why are you enabling ssh?
So you just have a jumper between pins 8 and 10 (not a wire)?
Why do you switch to the miniuart, will you still reproduce without this line?

@pelwell
Contributor

pelwell commented Jul 13, 2018

The dtoverlay line switches Bluetooth to the mini-UART, making the more capable UART available for other applications. dtoverlay=pi3-disable-bt is another option, freeing up both UARTs.

@ghollingworth
Contributor

Although I don't think that's necessary to reproduce the problem...

@pelwell
Contributor

pelwell commented Jul 13, 2018

We want to test ttyAMA0 because it should be the UART least likely to drop data, and we can't test ttyAMA0 if Bluetooth is using it, so using one of the -bt overlays is a necessary step to give this issue the highest priority.

@ghmpi
Author

ghmpi commented Jul 13, 2018

Gordon,

The setup above was my last minimal test, not necessarily the most minimal test. I forced gpu_mem just to give it a value; stock fails too. I enabled ssh because I use it, and I don't believe it has any effect on the results. I've likely tried both a plain jumper and a jumper wire, but I don't remember which for specific tests. We started testing with the miniuart overlay for whatever reason and just went with it; both UARTs seem to behave the same.

burtyb, Can you tell us more about your issue? Does it match what I'm seeing where the problem goes away with any of the lite versions?

Regards,
-Moses

@burtyb

burtyb commented Jul 14, 2018

Meh, I meant the current kernel on desktop Raspbian but as the stance seems to be use hardware control (where I agree it works) I'll just remove the connector from future revisions of my HATs.

@pelwell
Contributor

pelwell commented Jul 14, 2018

We're not saying there isn't a problem here, but I followed the instructions and didn't see it myself. Hardware flow control is the only sure way to get reliable transfers over a UART, although the -rt kernel may do better under moderate load.

@maxnet

maxnet commented Jul 15, 2018

Quoting ghmpi: "I've run full Raspbian in console mode as well, no change"

Full Raspbian does install extra daemons that also run in console mode, such as bluealsa.
May want to try disabling every systemd service you do not need, to see if it makes any difference.

@bg3mdo

bg3mdo commented Jul 6, 2019

I have the same issue. Running at 921k, USB can do it very well, but AMA0 loses data.

@pa7lim

pa7lim commented Jul 9, 2019

I also have this issue. I shorted GPIO14/15 with a jumper on an RPi 3 and used this serial test tool: https://github.com/cbrake/linux-serial-test

./linux-serial-test -s -e -p /dev/ttyAMA0 -b 921600
Linux serial test app
Error, count: 26, expected 1a, got 22
Error, count: 3931, expected 63, got 6b
Error, count: 73746, expected 22, got 23
Error, count: 73747, expected 24, got 2a
Error, count: 93148, expected f3, got fb
Error, count: 108664, expected 97, got 9f
Error, count: 112551, expected ce, got d6
Error, count: 166872, expected 07, got 0f
Error, count: 170751, expected 36, got 3e
Error, count: 182391, expected b6, got be
Error, count: 248328, expected 4f, got 57
Error, count: 256089, expected a8, got b0
Error, count: 271602, expected 49, got 51
Error, count: 306512, expected af, got b7
Error, count: 376334, expected 75, got 7d
Error, count: 395729, expected 40, got 48
Error, count: 399612, expected 73, got 7b
Error, count: 419003, expected 3a, got 42
Error, count: 469454, expected 55, got 5c
Error, count: 477209, expected a7, got af
Error, count: 488843, expected 21, got 29
Error, count: 504360, expected c6, got ce
Error, count: 539267, expected 29, got 31
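
One way to read a log like this is to look at the offset between the expected and received byte rather than at the individual bits. The sketch below does that for the first five lines of the log, assuming the test tool sends an incrementing byte counter (an assumption; the thread does not confirm it). Under that assumption, an offset of N suggests N bytes went missing.

```python
import re
from collections import Counter

# First five lines of the log above, copied verbatim.
LOG = """\
Error, count: 26, expected 1a, got 22
Error, count: 3931, expected 63, got 6b
Error, count: 73746, expected 22, got 23
Error, count: 73747, expected 24, got 2a
Error, count: 93148, expected f3, got fb
"""

PATTERN = re.compile(r"expected ([0-9a-f]{2}), got ([0-9a-f]{2})")


def offsets(log_text):
    """Return (got - expected) mod 256 for every error line."""
    return [(int(got, 16) - int(exp, 16)) % 256
            for exp, got in PATTERN.findall(log_text)]


if __name__ == "__main__":
    print(offsets(LOG))
    print(Counter(offsets(LOG)))
```

For these five lines the offsets are 8, 8, 1, 6 and 8. An offset of 8 would be consistent with a cluster of eight missing bytes rather than random corruption.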

@pelwell
Contributor

pelwell commented Jul 10, 2019

This is a very high error rate. At 921600 baud with no flow control I get an error (usually a cluster of 8 dropped bytes) every few million iterations on 3B+ and 4B (3B should be no different). Are you sure you don't have "console=ttyAMA0" or "console=serial0" in /boot/cmdline.txt?

@pelwell
Contributor

pelwell commented Jul 10, 2019

I have noticed that the error rate increases when the CPU gets hot, but the mechanism is currently unknown.

@pelwell
Contributor

pelwell commented Jul 10, 2019

Locking the ARM clock at a lower speed:

sudo sh -c "echo powersave >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"

has a greater effect on the error rate, so the increase in errors/dropped bytes appears to be caused by a reduced ARM speed rather than an increased temperature.

@pelwell
Contributor

pelwell commented Jul 10, 2019

The output has a very stable baudrate, and there are no data corruptions, so any losses are in the emptying (or not) of the RX FIFO.

@bg3mdo

bg3mdo commented Jul 10, 2019

Would anyone like to try a real-time kernel? Is thermal throttling producing the problem? Does a sudden drop in CPU speed leave too little time to transfer data out of the FIFO, so that data is lost?

Looking at the errors from the above, something we need to note:

Error, count: 3931, expected 63, got 6b

63 0110 0011
6B 0110 1011

Error, count: 73746, expected 22, got 23

22 0010 0010
23 0010 0011

........

Data was captured by the IO, but with a lot of bit flips. I am interested in capturing the data line waveform to do a further analysis : )

@pelwell
Contributor

pelwell commented Jul 10, 2019

The error turns out to be that data is being written into a full TX FIFO, causing it to be dropped. There is some optimising logic in there that assumes, not unreasonably, that it is always safe to write into the FIFO after the TX interrupt has fired. For some reason this appears to not be the case.

The simple fix is to disable the optimisation and check the FIFO level before every write - I've left a board running this test overnight to see how effective it is - but we ought to be able to do better.

@pelwell
Contributor

pelwell commented Jul 11, 2019

The overnight test lost 16 bytes (or possibly pairs of bytes) in total out of nearly 5 billion bytes sent, so writing to a full FIFO is definitely the cause of the data loss. This morning I found the failure mechanism - the RX interrupt handler releases the lock, allowing a transmit thread on another core to jump in and fill the FIFO before the TX interrupt handler tries to write its half-a-FIFO of data (assuming there is any left to write).

A fix has been pushed to rpi-4.19.y - you can read the details here: raspberrypi/linux@9bf5cd2
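
The failure mechanism described above can be illustrated with a toy model. This is not the driver code: the FIFO depth, the class and the function names are all assumptions made for the sketch. It only shows why writing half a FIFO of data without checking the level loses bytes once another writer has already filled the FIFO, and why checking before every write avoids the loss.

```python
FIFO_DEPTH = 16  # assumed depth for this toy model


class TxFifo:
    """Toy TX FIFO: bytes written while it is full are silently dropped."""

    def __init__(self, depth=FIFO_DEPTH):
        self.depth = depth
        self.slots = []
        self.dropped = 0

    def is_full(self):
        return len(self.slots) >= self.depth

    def write(self, byte):
        if self.is_full():
            self.dropped += 1  # models the lost byte
        else:
            self.slots.append(byte)


def tx_irq_handler(fifo, pending, check_level):
    """Write up to half a FIFO of pending bytes; return what is still pending."""
    burst = fifo.depth // 2
    sent = 0
    while pending and sent < burst:
        if check_level and fifo.is_full():
            break  # conservative path: stop and keep the data queued
        fifo.write(pending.pop(0))
        sent += 1
    return pending


def run(check_level):
    """Return (bytes dropped, bytes still pending) for one raced interrupt."""
    fifo = TxFifo()
    fifo.slots = [0] * (fifo.depth // 2)  # half empty: the TX interrupt fires
    # Race: another writer fills the FIFO before the handler gets to run.
    while not fifo.is_full():
        fifo.write(0xFF)
    pending = tx_irq_handler(fifo, list(range(8)), check_level)
    return fifo.dropped, len(pending)


if __name__ == "__main__":
    print("optimised (no check):", run(check_level=False))
    print("check before write:  ", run(check_level=True))
```

In this model the unchecked handler drops all eight bytes of its burst, while the checked handler drops none and leaves the eight bytes queued for the next interrupt.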

popcornmix added a commit that referenced this issue Jul 15, 2019
kernel: i2c: bcm2835: Set clock-stretch timeout to 35ms
See: raspberrypi/linux#3064

kernel: xhci: add quirk for host controllers that don't update endpoint DCS
See: raspberrypi/linux#3060

kernel: tty: amba-pl011: Make TX optimisation conditional
See: #1017

kernel: overlays: Add real parameters to the rpi-poe overlay
kernel: overlays: Correct gpio-fan gpio flags for 4.19
See: raspberrypi/linux#2715

kernel: overlays: i2c-gpio: Fix the bus parameter
See: raspberrypi/linux#3062

kernel: overlays: Rename pi3- overlays to be less model-specific
See: raspberrypi/linux#3052

firmware: dispmanx: Fix handling of disable_overscan to not disable it totally
See: raspberrypi/linux#3059

firmware: power: Enable/disable H264 and ISP clocks with domain

firmware: arm_loader: arm_64bit=0 should disable loading of kernel8.img

firmware: dt-blob: CM has no activity LED

popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Jul 15, 2019
(same commit message as the commit above)

@pietervandermeer

Quoting pelwell: "The overnight test lost 16 bytes (or possibly pairs of bytes) in total out of nearly 5 billion bytes sent, so writing to a full FIFO is definitely the cause of the data loss. This morning I found the failure mechanism - the RX interrupt handler releases the lock, allowing a transmit thread on another core to jump in and fill the FIFO before the TX interrupt handler tries to write its half-a-FIFO of data (assuming there is any left to write). A fix has been pushed to rpi-4.19.y - you can read the details here: raspberrypi/linux@9bf5cd2"

I ran into RX FIFO overruns recently at 460800 baud, pumping about 16 kByte/s in bursts of 16..23 bytes. I noticed the overruns using ioctl(mcu_uart, TIOCGICOUNT, &icount). But this only happened on 4.19.58 with RT-PREEMPT.
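
For reference, the same counters can be read from Python. This is a minimal sketch, assuming the usual Linux value of TIOCGICOUNT (0x545D) and the layout of struct serial_icounter_struct from linux/serial.h (11 int counters followed by 9 reserved ints). The function names and the default device path are chosen for the example.

```python
import fcntl
import os
import struct

TIOCGICOUNT = 0x545D  # assumed value, as defined in asm-generic/ioctls.h
# struct serial_icounter_struct: 11 int counters, then 9 reserved ints.
FIELDS = ("cts", "dsr", "rng", "dcd", "rx", "tx",
          "frame", "overrun", "parity", "brk", "buf_overrun")
LAYOUT = struct.Struct("20i")


def parse_icount(raw):
    """Decode the raw ioctl buffer into a dict of named counters."""
    values = LAYOUT.unpack(raw)
    return dict(zip(FIELDS, values))


def read_icount(path="/dev/ttyAMA0"):
    """Query the driver's interrupt counters for the given serial device."""
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        raw = fcntl.ioctl(fd, TIOCGICOUNT, bytes(LAYOUT.size))
    finally:
        os.close(fd)
    return parse_icount(raw)
```

Calling `read_icount()["overrun"]` before and after a transfer and comparing the two values shows whether the hardware reported any overruns in between.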
