Kernel update warning for RPi3B #1265

Closed
E3V3A opened this issue Apr 18, 2018 · 8 comments

Comments

@E3V3A
Contributor

E3V3A commented Apr 18, 2018

I just updated the kernel on my RPi3B (not the plus) from 4.9.80 to 4.14.30-v7+ or above.
Unfortunately, this has some very unpleasant side-effects.
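
To see which side of this upgrade a device is on, the running kernel version can be compared against 4.9.80. A minimal sketch, assuming GNU `sort -V` is available; `kernel_base` and `version_gt` are hypothetical helper names, not part of any RPi tooling:

```shell
# Strip the local suffix from a `uname -r` string, e.g. "4.14.30-v7+" -> "4.14.30".
kernel_base() {
  printf '%s\n' "${1%%-*}" | sed 's/+$//'
}

# Succeeds when version $1 is strictly newer than version $2.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Usage on the device:
# if version_gt "$(kernel_base "$(uname -r)")" 4.9.80; then
#   echo "running a kernel newer than 4.9.80"
# fi
```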

  • For many power supplies that previously worked fine, the kernel log is now spammed with low-voltage warnings, logged as critical messages in /var/log/kern.log. This will undoubtedly cause both excessive SD card wear and read/write failures on high-load devices. (Sure, if your PSU is rated 2.5A @ 5V+ you should be OK and see fewer messages.) Unfortunately, if you're a frequent user of dmesg -e -x, you're out of luck until there is a patch that stops the spamming of /dev/kmsg and /proc/kmsg.

  • One processor core is constantly running at 100+ % CPU usage (yeah, it often shows more than 100 %, LOL!) because of the updated lxpanel --profile LXDE-pi process. (Check with htop.)

  • Someone seems to have enabled apt-get upgrade in a cron script, so you can now expect your OS and firmware to be upgraded automagically without your interaction. See:

less /var/log/syslog |grep apt
cat /etc/cron.daily/apt-compat
cat /usr/lib/apt/apt.systemd.daily
  • Unfortunately, there are also loads of other bloated services now running after update. You need to go through all of them and see if it's something you will ever use.

  • Screen on/off/blanking using xset -display :0 dpms force [on/off] is no longer working, reason unknown...

  • Bluetooth seems to have been re-enabled even though I had it turned off previously. (Needs investigation.)

  • It also tries to outsmart you by overwriting your /home/pi/.asoundrc ALSA config file!
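
For the automatic-upgrade point in the list above, one way to check whether periodic unattended upgrades are switched on is to look at the `APT::Periodic` settings printed by `apt-config dump`. A rough sketch: `unattended_enabled` is a hypothetical helper, it assumes the value is a plain number, and whether this particular setting is the one responsible on a given image is also an assumption:

```shell
# Reads `apt-config dump` output on stdin; succeeds if
# APT::Periodic::Unattended-Upgrade is set to a number greater than 0.
unattended_enabled() {
  awk -F'"' '
    /^APT::Periodic::Unattended-Upgrade / { v = $2 }
    END { exit !(v + 0 > 0) }
  '
}

# Usage on the device:
# apt-config dump | unattended_enabled && echo "unattended upgrades are enabled"
```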

So unless you have the new RPi3 B+ (plus), I strongly recommend against upgrading your device until further investigation and tests have been done. This is especially true if you have an MM running with a lot of custom modules and other cron or HW automation hacks.

@E3V3A
Contributor Author

E3V3A commented Apr 24, 2018

Just a few updates:

  • Kernel/Firmware updates to anything above 4.9.80 will break loads of different functionality in your RPi3B/+, even if they fix a few other things. Although the update is claimed to have been tested for months, nobody tests this as well as the MM community, as we use such a wide variety of RPi functions. Please check the link above before deciding to update.

We are talking about important things like:

  • External USB/PXE/netboot boot functionality
  • System performance and lifetime reduction by excessive kernel log spam.
  • Various weird WiFi problems. ( here and here )
  • Screen blanking using xset is broken
  • SD card writing problems. here (and there will surely be more of those...)
  • SAMBA automount broken here.
  • Video playing issues: here
  • Possible Camera Module issues: here
  • Random C-media USB sound card issues: here and here

That said, if you have already updated and want to get rid of all the undervoltage warnings (and the annoying on-screen flashes) in /var/log/kern.log and dmesg, then add the following to your /boot/config.txt: avoid_warnings=2.
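
A minimal sketch of applying that setting so that repeated runs do not append duplicate lines. `set_avoid_warnings` is a hypothetical helper; it assumes GNU `sed`, and writing the real /boot/config.txt needs root and a reboot before the change takes effect:

```shell
# Make sure avoid_warnings is set to 2 in the given config file.
set_avoid_warnings() {
  cfg="${1:-/boot/config.txt}"
  if grep -q '^avoid_warnings=' "$cfg"; then
    # A setting already exists: rewrite it in place.
    sed -i 's/^avoid_warnings=.*/avoid_warnings=2/' "$cfg"
  else
    # No setting yet: append one.
    printf 'avoid_warnings=2\n' >> "$cfg"
  fi
}

# Usage (try it on a copy of the file first):
# cp /boot/config.txt /tmp/config.txt && set_avoid_warnings /tmp/config.txt
```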

If you don't want to do this, then you should at least make sure that these logs are not written to the SD card in /var/log/.log*, by editing the custom syslog file /etc/rsyslog.d/ignore-underpowering.conf so that it contains the line:

:msg, contains, "oltage" stop
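
The rule matches on the substring `oltage`, and rsyslog's `contains` comparison is case-sensitive, so this spelling catches both "voltage" and "Voltage". A sketch for previewing how many existing log lines the rule would have matched, and for writing the rule file; both function names are hypothetical, and writing under /etc/rsyslog.d needs root:

```shell
# Count the lines in a log file that contain "oltage" (what the rule would drop).
count_voltage_lines() {
  grep -c 'oltage' "$1"
}

# Write the filter rule into an rsyslog drop-in directory.
write_voltage_filter() {
  dir="${1:-/etc/rsyslog.d}"
  printf '%s\n' ':msg, contains, "oltage" stop' > "$dir/ignore-underpowering.conf"
}

# Usage on the device (rsyslog has to be restarted to pick up the rule):
# count_voltage_lines /var/log/kern.log
# sudo systemctl restart rsyslog
```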

@nhubbard
Contributor

@E3V3A This looks to be fairly serious! @MichMich Should we make an announcement/recommendation on the forums?

@6by9

6by9 commented Apr 25, 2018

(Responding as you've tagged so many issues which therefore get referenced).
Wow, way to overegg things!

Yes there are a few issues showing up, but:

So how many are really relevant to your use case?

@E3V3A
Contributor Author

E3V3A commented Apr 29, 2018

@nhubbard
It's probably not big news that larger steps in updating kernel and firmware tend to have some side effects. The problem is that most users only find out after the fact and remain clueless about how to fix it, unable to find any solutions. For some reason the RPi foundation, together with the Debian package maintainers, has decided that writing changelogs is beneath them, so there are no proper changelogs to be found anywhere. This also means that bugs in development branches are not shown either, and most go unseen until release.

That said I want to make clear that I don't only mean Kernel and Firmware. Because usually these come with a wide range of other package updates that are depending on them second hand.

@6by9

Thanks for clarifying some issues. However, saying that they are only present on older Kernels can't be right, because if they are no longer present and have been fixed in new releases, those issues should have been closed.

Regarding what's relevant. If you want specific usage scenarios you may want to ask some of the at least 6000+ users of MM, and see what their specific setups are. Most of them never update. Some of them try to update often. People are trying to optimize and run MM and all sorts of modules on all sorts of HW configurations, even including RPiZ's. So any issue at all may be serious to them.

Regarding 2512, the rate limiting has the very nasty side effect of not reporting anything useful at all in dmesg, since it limits not only the silly under-voltage warning, but also all other kmsg messages! The only way to see the other messages is by using journalctl -b, but that is not clear either, since the patch was never widely tested. That is the way the RPi org does it: "Hey, let's just use the community as guinea pigs and if there aren't any immediate complaints then it's probably ok, and we don't need to test ourselves. And if there is a reported problem we'll solve it later with a new update." This leaves the average MM user (who is often not very skilled in any form of OS hacking) completely out in the cold and on his own.
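
To read the kernel messages for the current boot without the voltage lines, the journal output can be filtered. A small sketch; `drop_voltage` is a hypothetical helper name, and the substring match mirrors the `oltage` rsyslog filter mentioned earlier in this thread:

```shell
# Pass through every line on stdin except those containing "oltage".
drop_voltage() {
  grep -v 'oltage'
}

# Usage: kernel messages (-k) from the current boot (-b), minus voltage lines.
# journalctl -k -b | drop_voltage
```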

Regarding 2517, the xset screen blanking issue is very funny and not easily debugged. There are dozens of issues that sound very similar and related, but without more understanding they can't even be linked to that one. Also, I doubt that anyone "updates random packages", but it seems that some RPi devs do, since they never tell us what exactly is getting updated. It may very well be that this is not the fault of the RPi developers, but that is impossible to know, since the community is not informed through any standard channel about what has changed. This makes debugging anything beyond the trivial a real PITA.

For example, another issue I've looked at is the overwriting of the user's .asoundrc ALSA configuration after each reboot. Apparently that issue is due to the lxpanel volume applet... This probably has nothing to do with the RPi linux guys, but nevertheless the issue only appeared after the OS upgrade. How is a user supposed to know that? Or even where to report such a problem?

So yes, the warning is well warranted and the links too, since there is no other sensible place to post them.

@6by9

6by9 commented Apr 29, 2018

> Thanks for clarifying some issues. However, saying that they are only present on older Kernels can't be right, because if they are no longer present and have been fixed in new releases, those issues should have been closed.

I didn't say that. You were blaming the upgrade from 4.9.80 to 4.14.30-v7+ as being the cause of all your issues. I was stating that those issues were also present on 4.9.80, therefore the upgrade to 4.14 is not the cause of the issue. People will see it on 4.9.80 too.

> Regarding 2512, the rate limiting has the very nasty side effect of not reporting anything useful at all in dmesg, since it limits not only the silly under-voltage warning, but also all other kmsg messages!

Nope, deliberately done to rate limit independently of any other rate limited messaging.

@E3V3A
Contributor Author

E3V3A commented May 4, 2018

I managed to restore the xset screen blanking, but it relies on using:

xsetroot -display :0 -solid Black

This is most likely not permanent and needs to be put in some file. The problem is that there are about 20 different X-related configuration files that all have background colors set... all spread around the FS, each overriding the other! 🥂

It's a total mess. Nobody knows what is controlling what! 🥇
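
One way to keep the two commands together is a small wrapper that sets the root window to solid black before forcing DPMS. A sketch only: `screen_power` is a hypothetical name, and it assumes, as described above, that the `xsetroot` call is what makes `xset` work again:

```shell
# Usage: screen_power on|off [display]
screen_power() {
  state="$1"
  display="${2:-:0}"
  case "$state" in
    on|off) ;;
    *) echo "usage: screen_power on|off [display]" >&2; return 2 ;;
  esac
  # Paint the root window black first, then force the DPMS state.
  xsetroot -display "$display" -solid Black &&
    xset -display "$display" dpms force "$state"
}
```

Calling `screen_power off` from a cron job or a module would then run both commands against display :0.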

@northernmoose

northernmoose commented May 17, 2018

Had to create an account to chime in on this one. Working on multimedia applications using the rasp. We were hoping to base our product on the RPi3B+ and in the process had to do an upgrade in order to boot it. Noticed several things along the way.

Starting with kernel/fw version 4.14.24-v7+ 1097 (rpi-update bdb826a8db75ba36d754bd71fb64d3905d3bd026) our application begins to behave strangely. It is a network streaming application with packet resends, and starting from this fw/kernel version the application does not get packets in time. Stepping back to the previous version (2659c9e87b574b3b05eacef80961c404ed0f0ce3) makes the problem go away. Back to bdb826a8db75ba36d754bd71fb64d3905d3bd026 and it is there again. Nothing else changed except doing an rpi-update between the tests (done several times just to be sure).
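
The back-and-forth test described here amounts to running `rpi-update` with a specific commit hash and rebooting between runs. A hedged sketch of a wrapper that refuses anything that is not a full 40-character lowercase hex hash; `fw_switch` is a hypothetical helper, not part of `rpi-update`:

```shell
# Usage: fw_switch <40-character commit hash>
fw_switch() {
  hash="$1"
  case "$hash" in
    ''|*[!0-9a-f]*)
      echo "not a lowercase hex commit hash: $hash" >&2
      return 1 ;;
  esac
  if [ "${#hash}" -ne 40 ]; then
    echo "expected 40 hex characters, got ${#hash}" >&2
    return 1
  fi
  # Install the kernel/firmware from that commit; reboot afterwards to run it.
  sudo rpi-update "$hash"
}

# Usage, alternating between the two versions named above:
# fw_switch 2659c9e87b574b3b05eacef80961c404ed0f0ce3   # previous version
# fw_switch bdb826a8db75ba36d754bd71fb64d3905d3bd026   # version with the problem
```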

I didn't manage to get the RPi3B+ to boot anything pre 4.14.26 without custom modifications. Noticed that the commit that gives us the problem is a set of firmware files ("Rework the frequency/voltage scaling logic") so I rolled back kernel/fw to a 4.14.24-version that does not have the issue, and tried getting the B+ to boot. I did manage to nail the problems down to GPU firmware, start.elf. I used a functioning system and replaced the start.elf file alone and then, at the same time, I could boot the B+ AND the problems are there. Switch to older start.elf, reboot (this is on an RPi3B), it works. Switch to start.elf from this version, reboot (still on RPi3B) - it is there.

Last version I tried was 4.14-39, 793a5f5466cea7fb8aee649e5a3782dba9e0db2d. It still has the same problems.

I continued using a 4.14 kernel on a regular RPi3B and have noticed some other strange new behavior. For some reason the application starts behaving flakily, and the problem cannot be resolved by restarting the app. If I reboot the entire system it works again, though. Nothing strange in the logs and I haven't noticed anything unusual, but it is the same every time. Back on 4.9-79 now, as before, and I haven't seen the problem. Yet...

We would have preferred basing the product on the B+, but with things in their current state it is simply not possible.

@MichMich
Collaborator

Since this isn't an issue with the code of the framework, I'm closing this issue. Please use the forum for further discussion.
