I am using a BeagleBone Black SBC as a data collector for a solar thermal installation and have been running into a hang issue.
I am using a Python program to collect and preprocess ADC measurements of temperature and motor currents, in conjunction with a high-level, web-enabled GUI front end. The Python program is spawned as a service by the GUI front end. It reads ADC values from six different channels every 2 seconds (5 conversions per channel every 2 seconds = 30 conversions).
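For reference, the sampling pattern described above might look roughly like the sketch below. This is not the actual collector: the pin names are hypothetical (the six channels are not listed here), and averaging the five conversions is only an assumed form of preprocessing. The read function is passed in so the loop can be exercised without hardware; `main()` wires it to `Adafruit_BBIO.ADC`.

```python
import time
from statistics import mean

# Hypothetical pin names: the post does not say which six inputs are used.
CHANNELS = ["P9_39", "P9_40", "P9_37", "P9_38", "P9_33", "P9_36"]
SAMPLES_PER_CHANNEL = 5   # 5 conversions per channel
PERIOD_S = 2.0            # one full cycle every 2 seconds


def sample_once(read_fn, channels=CHANNELS, samples=SAMPLES_PER_CHANNEL):
    """Take `samples` conversions per channel and return the mean of each.

    Averaging is an assumption; the real preprocessing is not described.
    """
    return {ch: mean(read_fn(ch) for _ in range(samples)) for ch in channels}


def main():
    # Imported here so the helper above can be tested without the board.
    import Adafruit_BBIO.ADC as ADC

    ADC.setup()
    while True:
        started = time.monotonic()
        print(sample_once(ADC.read))
        # Sleep for whatever remains of the 2 second period.
        time.sleep(max(0.0, PERIOD_S - (time.monotonic() - started)))
```

Calling `main()` on the board starts the loop; each cycle performs 6 x 5 = 30 conversions.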
Problem: After running correctly for anywhere from 3 to 6 weeks, it hangs and no longer sends data to the GUI app.
Once hung, the Python program cannot be killed, even with a "sudo kill -9" command. In this hung state, attempting to start the program manually in a terminal results in an immediate hang; ^C does not work and the terminal is dead. Now there are two copies of the program running that cannot be killed. The only way out of this condition is a reboot. Once rebooted, manually running this program works correctly, as does the GUI front end. Note that in normal operation two copies of the ADC collector program are never run at the same time.
The rest of the machine still operates normally but the ADC cannot be accessed again once hung.
Question: is there a fail-safe timer running in the ADC.read() function while waiting for the end of conversion?
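Whatever the answer for the library itself, a deadline can be imposed from outside. The sketch below is one possible approach, not something taken from BBIO: it runs a single read in a forked child process and waits for the result for a limited time. `read_with_timeout` and `ReadTimeout` are hypothetical names, and `read_fn` stands in for a call such as `ADC.read`. Note the limitation: if the child is stuck inside the kernel, SIGKILL will not remove it, so this only lets the parent notice the stall and report it instead of hanging as well.

```python
import os
import select
import signal
import struct


class ReadTimeout(Exception):
    """Raised when a conversion does not finish within the deadline."""


def read_with_timeout(read_fn, channel, timeout_s=1.0):
    """Run read_fn(channel) in a forked child and wait at most timeout_s."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the (possibly blocking) read and send the result back.
        os.close(rfd)
        try:
            os.write(wfd, struct.pack("d", float(read_fn(channel))))
        finally:
            os._exit(0)  # never fall through into the parent's code
    # Parent: wait for the result, but only until the deadline.
    os.close(wfd)
    try:
        ready, _, _ = select.select([rfd], [], [], timeout_s)
        data = os.read(rfd, 8) if ready else None
    finally:
        os.close(rfd)
    if data is None:
        # SIGKILL has no effect on a child in uninterruptible sleep, so do
        # not block waiting for it; only reap it if it has already exited.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, os.WNOHANG)
        raise ReadTimeout(f"no result from {channel} within {timeout_s} s")
    os.waitpid(pid, 0)
    if len(data) != 8:
        raise RuntimeError(f"read of {channel} failed in the child process")
    return struct.unpack("d", data)[0]
```

Forking for every conversion is expensive, so in practice it may make more sense to wrap a whole sampling cycle rather than each individual read.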
Since I cannot get any access to the Python program once hung, I am not sure how to attack debugging this issue.
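One thing that can still be inspected from another terminal is the kernel's view of the hung process. A process that ignores "kill -9" is often in uninterruptible sleep (state "D"), although that is only a guess for this case. The sketch below reads the state and the kernel wait channel from `/proc`; `parse_state` and `describe` are hypothetical helper names, and `wchan` may be empty or unavailable depending on the kernel configuration and permissions.

```python
from pathlib import Path


def parse_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tD (disk sleep)".
            return line.split()[1]
    return None


def describe(pid, proc_root="/proc"):
    """Collect the state and wait channel of a process, if readable."""
    base = Path(proc_root) / str(pid)
    state = parse_state((base / "status").read_text())
    try:
        wchan = (base / "wchan").read_text().strip()
    except OSError:
        wchan = "?"  # not exposed on this kernel or not permitted
    return {"pid": pid, "state": state, "wchan": wchan}
```

Running `describe(<pid of the hung collector>)` and capturing `dmesg` output at the same time may show which kernel call the process is blocked in.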
FYI, I have noticed that the GUI front end appears to kill and restart the data collection program once every 24 hours. When issuing "systemctl status MCJEDI", it shows a new process ID for the collector program for each day it has been running. It is possible that this restart is causing the issue if it interrupts the ADC procedure at the wrong time. However, a program should not hang endlessly for any reason.
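If the daily restart does arrive in the middle of a conversion, one way to rule that out is to have the collector finish its current cycle before exiting. The sketch below assumes the restart delivers SIGTERM, which is the systemd default for stopping a service; how the GUI actually stops the program is not known. `StopFlag` and `collect_until_stopped` are hypothetical names.

```python
import signal


class StopFlag:
    """Records that a stop was requested instead of exiting immediately."""

    def __init__(self):
        self.requested = False

    def install(self, signums=(signal.SIGTERM, signal.SIGINT)):
        # Must be called from the main thread.
        for signum in signums:
            signal.signal(signum, self._handle)

    def _handle(self, signum, frame):
        self.requested = True


def collect_until_stopped(cycle_fn, stop, max_cycles=None):
    """Call cycle_fn repeatedly; the flag is checked only between cycles."""
    done = 0
    while not stop.requested and (max_cycles is None or done < max_cycles):
        cycle_fn()  # one complete sampling cycle, never cut short
        done += 1
    return done
```

This only helps if the stop signal can be caught. With systemd defaults, a service that has not exited within the stop timeout is sent SIGKILL, which cannot be handled.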
Additional information:
These random hangs continue. I just upgraded from BBIO version 1.0.10 to 1.2.0 and time will tell as it may take months to hang again.
Observations from hung state:
"sudo reboot now" does not complete successfully (must power cycle to recover).
"sudo shutdown now" never completes (must power cycle to recover).
Directly writing to the ADC control register (disable and power-down bits) does not cause it to recover.
None of the kill commands can terminate the hung Python program.
Only a complete power removal resolves the issue.
See above original post. Python runs correctly for any program that does not use BBIO but hangs when attempting to use the ADC via BBIO.
git:/opt/scripts/:[1aa73453b2c980b75e31e83dab7dd8b6696f10c7]
eeprom:[A335BNLT00C03919BBBK01F8]
model:[TI_AM335x_BeagleBone_Black]
dogtag:[BeagleBoard.org Debian Image 2018-10-07]
bootloader:[eMMC-(default)]:[/dev/mmcblk1]:[U-Boot 2018.09-00002-g0b54a51eee]:[location: dd MBR]
kernel:[4.14.71-ti-r80]
nodejs:[v6.14.4]
uboot_overlay_options:[enable_uboot_overlays=1]
uboot_overlay_options:[uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-14-TI-00A0.dtbo]
uboot_overlay_options:[enable_uboot_cape_universal=1]
pkg check: to individually upgrade run: [sudo apt install --only-upgrade ]
pkg:[bb-cape-overlays]:[4.4.20180928.0-0rcnee0~stretch+20180928]
pkg:[bb-wl18xx-firmware]:[1.20180517-0rcnee0~stretch+20180517]
pkg:[kmod]:[23-2rcnee1~stretch+20171005]
pkg:[librobotcontrol]:[1.0.3-git20181005.0-0rcnee0~stretch+20181005]
pkg:[firmware-ti-connectivity]:[20170823-1rcnee1~stretch+20180328]
groups:[debian : debian adm kmem dialout cdrom floppy audio dip video plugdev users systemd-journal i2c bluetooth netdev cloud9ide gpio pwm eqep admin spi tisdk weston-launch xenomai]
cmdline:[console=ttyO0,115200n8 bone_capemgr.uboot_capemgr_enabled=1 root=/dev/mmcblk1p1 ro rootfstype=ext4 rootwait coherent_pool=1M net.ifnames=0 quiet]
dmesg | grep pinctrl-single
[ 1.119748] pinctrl-single 44e10800.pinmux: 142 pins at pa f9e10800 size 568
dmesg | grep gpio-of-helper
[ 1.131736] gpio-of-helper ocp:cape-universal: ready
END
This script should be present for any image downloaded from:
https://beagleboard.org/ or https://rcn-ee.com/