Ethernet, USB, reboot issues in kernel 3.18.14, 4.1.6 #1124

Closed
KJee85 opened this issue Sep 3, 2015 · 5 comments

Comments

@KJee85

KJee85 commented Sep 3, 2015

Been having some issues with the above. Let me describe the setup and then I'll get into the issues.

Setup:

Raspberry Pi 2
GrovePi+
2 USB Sound Cards
Mostly Ethernet, some WiFi (RT5370). Power supply is 2.5A.

Issues:

  1. Reboot

    • They were running fine for 5 or so days on 3.18.14. We had to push an update to our software that polls the sensors (nothing too crazy here, 30s poll rate). We rebooted them and about 3 Pis out of a few dozen did not come back up. They were not pingable and sent nothing to the simple heartbeat we run against our service; it seemed like they just did not boot.
    • We had to hard reset them, which is difficult given we have limited physical access. We decided to hold off on reboots unless absolutely necessary to avoid this issue. I looked through the logs of the 3 that went down and found nothing that would indicate what happened. I have a cron job that runs every minute, pings a site, logs whether the internet is down, and if it is down does an ifdown/ifup (I had this because the ones on WiFi would flake out for whatever reason). I saw this running fine until reboot time, then nothing in the logs until the hard reset, where the cron log simply starts again with no boot logs from the hard reset. After this I added the bcm_2708 watchdog to potentially mitigate cases where they get stuck or panic.
  2. Ethernet / USB

    • Again they were running fine for about a week, then we started to see some of them drop off the network. We even connected some of them to a separate router/network to see if it was the switch they were on, and they still would not show up. We again hard reset them. This time we actually saw something in the logs:
    Aug 23 23:50:08 modpi013 kernel: [710771.408676] ERROR::handle_hc_chhltd_intr_dma:2202:   handle_hc_chhltd_intr_dma: Channel 2, DMA Mode -- ChHltd set, but reason for halting is unknown, hcint 0x00000002, intsts 0x06200009
    Aug 23 23:50:08 modpi013 kernel: [710771.408676]
    Aug 23 23:50:11 modpi013 kernel: [710774.407057] smsc95xx 1-1.1:1.0 eth0: Failed to read reg index 0x00000114: -71
    Aug 23 23:50:11 modpi013 kernel: [710774.407532] smsc95xx 1-1.1:1.0 eth0: Error reading MII_ACCESS
    Aug 23 23:50:11 modpi013 kernel: [710774.407668] smsc95xx 1-1.1:1.0 eth0: Timed out reading MII reg 01
    
  3. Resolutions

    • Obviously a lot of people have had issues around this and other USB things. I thought this was supposed to be solved by this kernel, but maybe not for USB Video/Audio. Given our current options, we decided to update them to 4.1.6. I did an apt-get upgrade and an rpi-update. I also added dwc_otg.fiq_fix_enable=1 to /boot/cmdline.txt. I couldn't find anything clear on whether this was the default in 3.18.14, but I figured I would make sure it was enabled, as people with similar issues had seen improvements.
    • We started rolling out the update, and so far the updated ones seem to be fine, though it's only been a couple of days. Again, two so far have not come back up after the update, probably due to the reboot issue above; we have not reset them yet to check their logs.
    • The reason we think this is something in the OS is that it is not consistent, and so far we have been unable to reproduce it on test rigs. We have some other Pis running extreme versions of our setup: 4 USB sound cards, sensor polling every 5s, and rebooting every 5min. So far these two have been stable for over 2 days, so there must be some rare case that we are hitting on the 5 or so units where we have seen the various issues.
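
For reference, the per-minute connectivity check described under the reboot issue could look roughly like the sketch below. This is not the actual script from the deployment. The ping target, the interface name, the return values, and the function names are all assumptions made for illustration. The command runner is passed in as a parameter so the logic can be exercised without touching the network.

```python
import subprocess
import time

PING_TARGET = "8.8.8.8"  # assumption: any reliably reachable host
INTERFACE = "eth0"       # assumption: wlan0 on the WiFi units


def run(cmd):
    """Run a command quietly and return True if it exited with status 0."""
    return subprocess.call(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ) == 0


def check_and_recover(runner=run, target=PING_TARGET, iface=INTERFACE, log=print):
    """Ping once; if that fails, bounce the interface and ping again.

    Returns "up" if the first ping succeeded, "recovered" if the link came
    back after ifdown/ifup, and "down" if the host is still unreachable.
    """
    ping = ["ping", "-c", "1", "-W", "5", target]
    stamp = time.strftime("%b %d %H:%M:%S")
    if runner(ping):
        log("%s internet up" % stamp)
        return "up"
    log("%s internet down, restarting %s" % (stamp, iface))
    runner(["ifdown", iface])
    runner(["ifup", iface])
    if runner(ping):
        log("%s internet recovered" % stamp)
        return "recovered"
    log("%s internet still down" % stamp)
    return "down"
```

A small wrapper script that calls `check_and_recover()` and is started from a `* * * * *` crontab entry would give the once-a-minute behaviour. Logging a line on every run, including successful ones, is what makes a gap in the log (as seen around the failed reboots) meaningful.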

Has anyone had this happen? Even if it's not an exactly similar setup, my guess is that if you're running USB Audio / Video you may run into this issue. Are these known issues that are being worked on for the next release? I'm open to any ideas; let me know if I can pull more data that would help diagnose.
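
One way to pull comparable data from every unit is to count the USB and Ethernet error signatures from the log excerpt above. The sketch below is only an illustration: the three patterns are taken from the lines quoted in this issue, while the category names and the `scan` function are made up here, and the set of patterns is certainly not exhaustive.

```python
import re
from collections import Counter

# Patterns copied from the kernel log lines quoted in this issue.
# The category names on the left are hypothetical labels.
PATTERNS = {
    "dwc_otg_channel_halt": re.compile(r"handle_hc_chhltd_intr_dma"),
    "smsc95xx_reg_read": re.compile(r"smsc95xx .*Failed to read reg index"),
    "smsc95xx_mii": re.compile(
        r"smsc95xx .*(Error reading MII_ACCESS|Timed out reading MII reg)"
    ),
}


def scan(lines):
    """Count matching lines per category and record when each first appeared.

    Returns (counts, first_seen). first_seen maps a category to the first
    15 characters of the first matching line, which is the timestamp in the
    classic syslog format ("Aug 23 23:50:08").
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                first_seen.setdefault(name, line[:15])
    return counts, first_seen
```

Passing an open log file to `scan` (for example `/var/log/kern.log`, assuming that is where kernel messages end up on these units) gives a per-unit summary that can be compared between the Pis that dropped off the network and the ones that stayed up.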

@popcornmix
Collaborator

What are the sensors? There was a recent fix for I2C causing crashes:
raspberrypi/firmware#192

@KJee85
Author

KJee85 commented Sep 3, 2015

Grove PIR Motion
Grove Temperature (DHT11)
Grove Light Sensor

These are all analog/digital, and communication with them happens via the GrovePi+. The GrovePi+ uses I2C to communicate back to the Pi, though, so I guess this could be related. Would this explain the reboot issue as well? We have seen that occur when we rebooted them after doing an rpi-update yesterday.

@popcornmix
Collaborator

Recent firmware could well fix a crash related to I2C. I don't know about the reboot issue.
dwc_otg.fiq_fix_enable=1 is not required (fiq is enabled by default).
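
To see which dwc_otg options a given unit was actually booted with, the kernel command line can be split into key/value pairs. The sketch below is a minimal illustration with hypothetical function names; it assumes options are separated by whitespace and does not handle quoted values that contain spaces.

```python
def parse_cmdline(text):
    """Split a kernel command line into a dict of option -> value.

    Bare flags such as "rootwait" map to None.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def dwc_otg_params(text):
    """Return only the options that belong to the dwc_otg USB driver."""
    return {
        key: value
        for key, value in parse_cmdline(text).items()
        if key.startswith("dwc_otg.")
    }
```

Feeding it the contents of `/proc/cmdline` shows what the running kernel received, while `/boot/cmdline.txt` shows what is configured for the next boot. An empty result simply means no dwc_otg options were set explicitly, so the driver defaults apply.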

@Ruffio

Ruffio commented Aug 17, 2016

@KJee85 has your issue been resolved? If so, please close this issue. Thanks.

@P33M
Contributor

P33M commented May 4, 2017

Closing USB issues as OP has not updated the thread. Further comments by the OP stating that the issue is still present on latest rpi-update kernel will lead to a review.
