GPIO: polling a keyboard crashes the kernel after an hour or two... #1249

Closed
christophe-dupriez opened this issue Jan 4, 2016 · 34 comments

@christophe-dupriez

In Python, I poll rows and columns of a numerical keypad using GPIOs without added resistors
(I use the input pullups).
I followed this article:
http://www.instructables.com/id/Using-a-keypad-with-Raspberry-Pi/

_As the bug happens without key presses, one can test on any RPi even without any keypad plugged._

The GPIOs I use for rows and columns are:
ROW = [18,23,24,25]
COLUMN = [4,17,22]
After an hour or two, the kernel crashes badly using either "RPi.GPIO" or "pigpio" (pigpio may take a bit longer to crash). The crashes are never the same, and even the console frame buffer may be garbled.
I attach keypadClass.py (renamed .txt so that GitHub accepts it) and keytest.py
I also attach photos of the RPi console...

My question is: "What kind of workaround could I follow to protect the kernel from crashing when polling a keyboard?"

Thanks a lot for your help !

Christophe

a0131
a0144

keypadClass.py.txt

keytest.py.txt

@pelwell
Contributor

pelwell commented Jan 4, 2016

A few questions:

  1. How good is your power supply? Have you ever seen the under-voltage warning?
  2. Are you overclocking? If so, what are your settings?

@christophe-dupriez
Author

Thanks pelwell !

I am not overclocking.
Only a few, common configuration options on a standard "headless" Raspbian.
I2C is activated (and working well). No SPI. Serial is used.
Like you, I suspected the power supply, but I am now fairly confident in it:

  • 12 V, 3 A supply (with 7 Ah battery backup)
    • 12 V -> 5 V, 3 A DC/DC converter
  • I always measured less than 350 mA drawn by the RPi
  • The problem also arises with a simple wall-plug power supply (RPi certified, sold by Element14)

The whole application works perfectly when I do not poll the keypad.
A reduced application (only polling the keypad) is failing like the complete one.

I tested with two different RPi 2 model B.
I also reinstalled all the software on a new SD card (Sandisk).

Have a nice day!

@christophe-dupriez
Author

keypadClass2.py.txt
Looking at the Linux driver that reads a keypad through a GPIO matrix
http://lxr.free-electrons.com/source/drivers/input/keyboard/matrix_keypad.c
I modified my code to mimic it (see keypadClass2.py.txt, NOW attached).
The only difference from the Linux driver's algorithm is that I set internal pull-downs on the input pins.

It crashes the kernel even faster!

Could the pull-downs be the problem for the kernel?
(By the way, firmware and Raspbian are at their latest versions.)

@christophe-dupriez
Author

My issue seems similar to:
https://www.raspberrypi.org/forums/viewtopic.php?f=66&t=108461
Would somebody have a solution or workaround?

@joan2937

joan2937 commented Jan 4, 2016

How are you using pigpio? Could you link to a full listing? pigpio C does allow the use of the SYSFS interface but it is discouraged except in exceptional circumstances which do not apply in your application.

@christophe-dupriez
Author

Dear Joan2937,

In keypadClass.py.txt (1st message) and keypadClass2.py.txt (NOW 2 messages above) you will find the calls to pigpio Python interface.

I also tested RPi.GPIO, with the same problems (maybe a bit worse).

_As the bug happens without key presses, one can test on any RPi even without any keypad plugged._

@christophe-dupriez
Author

The answer to the question below may be the explanation + solution:
http://raspberrypi.stackexchange.com/questions/37397/rpi-gpio-mysteriously-crashing-when-setting-up-pin
It recommends avoiding RPi.GPIO and "pigpio", which access "/dev/mem/...", and using WiringPi plus its Python wrapper (which uses /sys/class/...) instead. I will try it tomorrow. If this is true, the creators of RPi.GPIO and "pigpio" should investigate how to harden their software...

@joan2937

joan2937 commented Jan 5, 2016

@christophe-dupriez As the author of pigpio I'll point out that is rubbish. pigpio makes more use of /dev/mem than wiringPi which makes more use of /dev/mem than RPi.GPIO.

@christophe-dupriez
Author

Please excuse me, I did not mean to be rude in any way.

_What I know:_ using RPi.GPIO or pigpio, I get a kernel crash.
I spent weeks thinking it was an I2C or a power problem and beefing up the hardware.
I could not imagine having problems with something as basic as GPIOs.
I may be wrong again.

_What I still have to test:_ using WiringPi, will it go better?
Before that, nothing objective, I agree.

Have a nice evening,

Christophe

@christophe-dupriez
Author

I just had a look at the WiringPi source code: it uses /dev/mem and recently began to use /dev/gpiomem too.
The advantages and availability of GPIOMEM are discussed here:
#1112

@christophe-dupriez
Author

The kernel crashes as fast with /dev/gpiomem (latest WiringPi2) as with /dev/mem ...
The problem is therefore very low level, as it is common to RPi.GPIO, pigpio and WiringPi2, and to both /dev/mem and /dev/gpiomem.

I stopped changing pin modes from output to input (please see the attached file keypadClass3.py.txt) and the kernel seems to be holding up for now (3 hours later): wait and see after some days...

keypadClass3.py.txt

@christophe-dupriez
Author

Joan, as written here:
https://www.raspberrypi.org/forums/viewtopic.php?f=107&t=130903
it now runs a lot better without the input/output mode switch.
I attach the code adapted for pigpio, which is working well for now (we will see after a day)
keypadClass4.py.txt
I will stick to pigpio (and pigpiod) as it is the most robust and flexible architecture.
I also attach a photo of the setting...
cxz0ui-weaaa0ff 1

@christophe-dupriez
Author

So after 36h, the Rpi is still up & running.
This means that the latest Kernel crashes after about an hour with the following combination:

  • accessing /dev/mem or /dev/gpiomem 20 times per second
  • alternating 3 GPIOs from output to input+pulldown.

My workaround is to toggle the column outputs between 1 and 0. The danger is short-circuiting two GPIOs when two buttons on the same row are pressed together, so I will have to add resistors to my PCB: 1 kΩ to limit the current to about 3 mA when two keys on the same row are pressed at the same time.
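
As a quick check of that resistor value, here is a minimal sketch of the arithmetic. It assumes the Pi's 3.3 V GPIO logic level and a single 1 kΩ resistor in the shorted path; the function name is hypothetical.

```python
GPIO_VOLTAGE = 3.3  # volts; Raspberry Pi GPIO logic level (assumption stated above)


def short_circuit_current_ma(resistance_ohms, voltage=GPIO_VOLTAGE):
    """Ohm's law: current in mA when a HIGH output drives a LOW output
    through `resistance_ohms` of series resistance."""
    return voltage / resistance_ohms * 1000.0


if __name__ == "__main__":
    # One 1 kΩ resistor in the path between the two shorted outputs.
    print(round(short_circuit_current_ma(1000), 2), "mA")
```

If every column line had its own 1 kΩ resistor, the shorted path would contain two of them in series and the current would be roughly halved.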

@lurch
Contributor

lurch commented Jan 18, 2016

I suspect a minimal test script (just a single .py file) that uses RPi.GPIO (which is the 'officially supported' GPIO python interface) that reliably causes the kernel to crash, will be a great help to the kernel developers in diagnosing this problem.

@christophe-dupriez
Author

It has now been running for days without a hitch (KeypadClass4.py.txt above). I suspect the mode change (input+pull-down / output) is the problem, so the failing Python file cannot really be simpler than KeypadClass3.py.txt (one to two hours before failure) provided above. I will adapt it for RPi.GPIO on Wednesday or Thursday, which is a fairly straightforward task. (The ".txt" suffix was added to the file name so that this forum accepts it.)

@lurch
Contributor

lurch commented Jan 18, 2016

I suspect if you can make your standalone-test-script crash the kernel in a shorter time-period (maybe by doing your mode-switches at a faster rate?) that would be useful too ;-)

@christophe-dupriez
Author

After "Fail Fast", "Crash Fast"? I'll do my best! More seriously, I think it is a question of timing between the different operations, like receiving an input interrupt when the pin has already been switched to output mode. Or something related to setting input mode first and only then the pull-down (would it be possible to switch both at the same time?)... Clearing up this issue (or documenting a good workaround) would benefit everyone, and especially me, as I would not be obliged to add resistors to protect output GPIOs from being short-circuited.

@christophe-dupriez
Author

Attached is a version of the program without sleep (except between write and read) and using the latest RPi.GPIO.
RPi.GPIO (and whole Raspbian) was upgraded today using apt-get update + upgrade

It crashed my kernel in 15 minutes on an RPi 2 Model B: I hope that is fast enough...
Please do not hesitate to ask if I can be of any help to the developer community.

keypadClass5.py.txt

@PeterPablo

@christophe-dupriez, did you make any progress?

Today I had a crash on the current rpi-update 4.4 kernel on a 2B with twelve Python scripts concurrently reading the same GPIO pin at 8 Hz. So this single GPIO pin was read on average 8*12 = 96 times per second. Unfortunately I only had a serial console attached and couldn't capture the kernel crash output. I'll investigate in more detail tomorrow, but hoped to gain something by posting my current situation. I am running without overclocking and with the official power supply, though I have max_usb_current=1 enabled, as there are 2 passive USB hubs attached to the Pi, which in turn are connected to a total of twelve RS232-USB adapters. I'll first try switching over to active USB hubs.

@christophe-dupriez
Author

@PeterPablo What really solved my problem was changing the GPIO polling for the keypad matrix so that the pin mode is NEVER changed (I always read the 4 rows and drive the 4 cols). I suspect the Linux GPIO driver has some kind of interrupt being raised too often if a pin mode change happens at a bad moment. But I have neither the time nor the background needed to dig into the Linux GPIO driver code...
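
A fixed-direction scan of this kind could be sketched as follows. This is an illustration only, not the attached keypadClass4.py. The pin lists come from the first post; the row pull-downs, the function names `init_keypad` / `scan_keypad`, and the `FakeGPIO` stand-in (which only mimics the handful of RPi.GPIO calls used here so the sketch runs off-device) are assumptions.

```python
import time

ROWS = [18, 23, 24, 25]  # always inputs (pins from the first post)
COLS = [4, 17, 22]       # always outputs (pins from the first post)


class FakeGPIO:
    """Hypothetical stand-in mimicking the few RPi.GPIO calls used below."""
    BCM, IN, OUT, LOW, HIGH, PUD_DOWN = "BCM", "IN", "OUT", 0, 1, "PUD_DOWN"

    def __init__(self):
        self.mode_changes = 0  # how many times setup() was called
        self.level = {}        # output pin -> current level
        self.pressed = set()   # (row_pin, col_pin) pairs currently held down

    def setmode(self, mode):
        pass

    def setup(self, pin, direction, initial=0, pull_up_down=None):
        self.mode_changes += 1
        if direction == self.OUT:
            self.level[pin] = initial

    def output(self, pin, value):
        self.level[pin] = value

    def input(self, pin):
        # A row reads HIGH when a pressed key connects it to a HIGH column.
        return int(any(r == pin and self.level.get(c) for r, c in self.pressed))


def init_keypad(gpio, rows=ROWS, cols=COLS):
    """Configure pin directions ONCE; they are never changed afterwards."""
    gpio.setmode(gpio.BCM)
    for c in cols:
        gpio.setup(c, gpio.OUT, initial=gpio.LOW)
    for r in rows:
        gpio.setup(r, gpio.IN, pull_up_down=gpio.PUD_DOWN)  # pull-down is an assumption


def scan_keypad(gpio, rows=ROWS, cols=COLS, settle_s=0.0):
    """Drive each column HIGH in turn and read every row; no setup() calls here."""
    pressed = []
    for c in cols:
        gpio.output(c, gpio.HIGH)
        if settle_s:
            time.sleep(settle_s)
        for r in rows:
            if gpio.input(r):
                pressed.append((r, c))
        gpio.output(c, gpio.LOW)
    return pressed
```

On a Pi, the RPi.GPIO module itself could be passed in place of the stand-in (`import RPi.GPIO as GPIO; init_keypad(GPIO)`), since the sketch uses only `setmode`, `setup`, `output` and `input`. Because the columns are driven as outputs, two keys pressed on the same row can connect a HIGH column to a LOW one, which is why series resistors were discussed earlier in the thread.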

@PeterPablo

Thank you for your instant response. I never change the pin mode, so my issue might be unrelated. I'll investigate whether my Python script could be changed to an interrupt-based approach. My goal is to observe the state of a rather slow TTL-like signal.

@pelwell
Contributor

pelwell commented May 11, 2016

I never got the kernel to crash when I tried - reproducibility has always been the problem.

@christophe-dupriez
Author

About 10 times per second, I change the value of 4 GPIO pins and read the value of 4 others. Removing the pin mode changes completely stabilized the system. @pelwell: on the 20th of January I attached above a version that crashed after a quarter of an hour: did you test it?

@pelwell
Contributor

pelwell commented May 11, 2016

I did, but it didn't work for me.

@christophe-dupriez
Author

Oh Oh! This means I should try it also in different hardware configurations...

@PeterPablo

Oh well, so I gave this another spin today and had reproducible lockups after less than one minute, even when the USB hub was active (own power supply).

@christophe-dupriez, your comment about NEVER changing the pin mode made me anxious, so I double-checked my Python script. I have two functions, prepareGPIO and readGPIO. By mistake I had the line GPIO.setup(pin, GPIO.IN) inside readGPIO and thus issued the pin mode instruction about 100 times per second! After the simple change of moving this line to prepareGPIO, my setup has now been running for six hours and will hopefully continue to do so overnight.
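
The change described above amounts to something like the following minimal sketch. The actual script was not posted, so the function bodies are assumptions, and `CountingGPIO` is a hypothetical stand-in used only to count `setup` calls off-device.

```python
class CountingGPIO:
    """Hypothetical stand-in for RPi.GPIO that counts setup() calls."""
    IN = "IN"

    def __init__(self):
        self.setup_calls = 0
        self.level = 0

    def setup(self, pin, direction):
        self.setup_calls += 1

    def input(self, pin):
        return self.level


def prepare_gpio(gpio, pin):
    """Set the pin mode exactly once, before any polling starts."""
    gpio.setup(pin, gpio.IN)


def read_gpio(gpio, pin):
    """Only read the pin; the pin mode is deliberately NOT touched here."""
    return gpio.input(pin)
```

With this split, polling `read_gpio` any number of times leaves the pin mode untouched after the single `prepare_gpio` call.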

@PeterPablo

As my problematic version of the script would
a) set the pin mode
b) read the pin
and the script was run in several independent Python instances from the console, this could have led to a situation where instance (1) was setting the pin mode (a) while instance (2) was trying to read the pin (b).
@pelwell, could this potentially cause a crash?

@christophe-dupriez
Author

I am happy that publishing my problems saved somebody else some time! As I am not using multiple processes in parallel, I suspect the issue could be around interrupt processing at a low level in the GPIO driver, where a mode change could somehow interfere with pin reading and/or writing. For instance, receiving an interrupt for an input pin change when the mode has just changed to "writing". It could even be hardware interrupts not returning for some reason and stacking up, as the crash is so violent, so rare, and corrupts memory so badly. I have no time before Monday to try to reproduce the problem on an RPi 3 with nothing connected to the GPIOs, instead of an RPi 2 with a keypad connected.

@PeterPablo

PeterPablo commented May 13, 2016

My test ran for 24 hours without problems! Great!

The message I once saw on the serial terminal was gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened. Another user reported this kernel error on a German forum, see here. This person seems to have observed a regression between 4.1.13 and 4.1.17.
edit: In his second post, his reasoning is that the problem seems to occur when an interrupt arrives during a read operation.

@Ruffio

Ruffio commented Aug 17, 2016

@christophe-dupriez has your issue been resolved? If so, please close this issue. Thanks.

@christophe-dupriez
Author

I have not heard anybody claim that the GPIO driver has been corrected. But I will test with a freshly updated machine...

@PeterPablo

On 4.4 I still see the message gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened. in syslog when setting up the GPIO interrupt from within my Python script. The script runs well for multiple days though, so maybe this message could/should be downgraded in severity or not be printed at all?

@PeterPablo

The message is removed in the current kernel (c376c9f).

On topic -- I reliably solved the kernel crashes by switching from "frequent polling/readout of the current GPIO state from several Python processes" to "setting up the same interrupt in several Python processes and only determining the GPIO state inside that function, which happens far less frequently".
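
A minimal sketch of that interrupt-style approach, assuming RPi.GPIO's `add_event_detect` with a callback. The actual script was not posted, so `watch_pin` and the `FakeGPIO` stand-in (including its `simulate_edge` helper, which does not exist in RPi.GPIO) are hypothetical and only let the sketch run off-device.

```python
class FakeGPIO:
    """Hypothetical stand-in mimicking the few RPi.GPIO calls used below."""
    IN, BOTH = "IN", "BOTH"

    def __init__(self):
        self.level = {}
        self.callbacks = {}

    def setup(self, pin, direction):
        self.level.setdefault(pin, 0)

    def input(self, pin):
        return self.level[pin]

    def add_event_detect(self, pin, edge, callback=None):
        self.callbacks[pin] = callback

    def simulate_edge(self, pin, value):
        # Test helper only: pretend the hardware saw an edge on `pin`.
        self.level[pin] = value
        self.callbacks[pin](pin)


def watch_pin(gpio, pin, on_change):
    """Set the pin mode once, then read the pin only when an edge is reported."""
    gpio.setup(pin, gpio.IN)

    def _callback(channel):
        # The state is read inside the callback, not in a polling loop.
        on_change(channel, gpio.input(channel))

    gpio.add_event_detect(pin, gpio.BOTH, callback=_callback)
```

On a Pi, the RPi.GPIO module could be passed as `gpio` after calling `GPIO.setmode(...)`; the pin number is whatever the signal is wired to.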

@JamesH65
Contributor

Closing this issue as questions answered/resolved.
