Skip to content

Conversation

@michaellass
Copy link
Contributor

@michaellass michaellass commented Jan 18, 2020

Overview
This changeset is mainly meant as a fix for adafruit/Adafruit_Blinka#210. However, it now includes a couple of different changes:

  1. Add LICENSE file and header to clarify license situation.
  2. Unfortunately I had to reformat the source to work with it. My editor was not at all happy with the mixture of tabs and spaces. If this is a problem, I will try to get rid of this change.
  3. Fix a possible stack corruption when reading from the message queue.
  4. Due to reformatting, the order of includes changed, exposing some missing includes.
  5. Fix validation of integers coming in via the message queue.
  6. Fix 100% CPU use of libgpiod_pulsein on Raspberry Pi Adafruit_Blinka#210 using the approach described in the following.

Avoiding unneccessary busywaits

This PR extracts the sensor polling into its own thread. This allows us to reduce CPU use by 100% when paused, since both threads can be implemented blocking:

  • The main thread waits for input via message queue as before. However, since this is now its only job, we can make msgrcv blocking, so no busy waiting is required anymore.
  • The (new) polling thread performs the polling of the gpio line, time measurements etc. This still uses 100% CPU. However, when paused, we can block this thread on a barrier controlled by the main thread.

As a result, there is now 0% CPU use when paused.

Warning

I have only tested this with a simple DHT22 readout from Python using adafruit_dht. I have not tested anything else, in particular I did not test any of the debugging functionality (SIGTSTP, SIGTCONT, FOLLOW_PULSE, ...). Also the code should be reviewed for any possible race conditions.

@ladyada
Copy link
Member

ladyada commented Jan 18, 2020

what license do you want?

@michaellass
Copy link
Contributor Author

I would be fine with any well-established open source license, permissive or copyleft. I just want to avoid putting code onto other peoples machines without knowing what conditions apply (e.g., regarding liability etc.). Since Adafruit_Blinka is licensed unter MIT, that may be a natural choice.

I guess, @brennen would also need to agree to any license.

@ladyada
Copy link
Member

ladyada commented Jan 18, 2020

@brennen performed the work for hire - MIT license it is :) if you'd be so kind to add the MIT license header and LICENCE it would ultra appreciated

@michaellass
Copy link
Contributor Author

Great! 🙂 I'll do.

As discussed in adafruit#1, this project shall be licensed under MIT.
Auto-format code in order to
* only use spaces (don't mix spaces and tabs)
* remove the leading whitespace in all lines
* remove any trailing whitespaces

Unfortunately, the auto-format also applied some other
formatting-changes. There is however no functional change.

Best reviewed using `git show --ignore-all-space`.
msgrcv currently returns a msglen of up to VMSG_MAXSIZE. Writing to
vmbuf.message[VMSG_MAXSIZE] is an out-of-bounds access which may cause
stack corruption. Fix by accepting only messages of size VMSG_MAXSIZE-1.
Auto-formatting reordered the includes, exposing some missing includes.
We need to compare the interpreted argument and not offset to INT_MAX.
Extract the sensor polling into its own thread. This allows us to reduce
CPU use by 100% when paused, since both threads can be implemented
blocking:

* main thread: Waits for input via message queue. msgrcv is now
  blocking, so no busy waiting is required anymore.
* polling thread: Performs polling of gpio line. When paused, it waits
  at a barrier controlled by the main thread.
michaellass added a commit to michaellass/libgpiod_pulsein that referenced this pull request Jan 18, 2020
As discussed in adafruit#1, this project shall be licensed under MIT.
@michaellass michaellass changed the title [DO NOT MERGE] Reduce CPU use when paused and fix a couple of other issues Reduce CPU use when paused and fix a couple of other issues Jan 18, 2020
@michaellass
Copy link
Contributor Author

This PR currently seems to introduce a higher checksum error rate. Doing 1800 sensor readings (one every 2s), I get the following error rates:

  • old code: 89/1800 (aroung 5%)
  • new code: 495/1800 (around 28%)

So there is probably a regression somewhere which I need to locate before we can merge this.

@michaellass
Copy link
Contributor Author

Turns out that the higher error rates are caused by the fact that the CPU clocks down. Therefore the polling loop runs slower and the time resolution of our measurements is pretty bad. If one keeps the CPU busy, e.g. by running while true; do true ; done in another terminal, readout quality is back to normal.

I will further experiment with this. Maybe we can use the interrupts from the GPIOs to give us more accurate measurements. For now I suggest keeping this PR open as work-in-progress.

To get a decent time resolution on our measurements we need the CPU to
be clocked fast. Using an ondemand governor, this means that we need to
load the CPU before trying to do any measurements. So while unpaused, we
should do busy waiting whenever possible and when triggering, we should
start loading the CPU _before_ doing the actual trigger.
@michaellass
Copy link
Contributor Author

michaellass commented Jan 22, 2020

With the latest commit I'm back to below 4% error rate on a Raspberry Pi 4 B. The fix follows the following idea:

  • when paused we don't want to do any busy waiting to save CPU cycles and energy
  • when unpaused we want to keep the CPU busy all the time, in order to keep it at a high frequency so we get a sufficiently high time resolution of our measurements.

Since the ondemand governor is a bit sluggish, we start loading the CPU already 80ms before putting out the trigger signal.

The overall idea and the implementation of busy_wait_milliseconds is taken from https://github.com/adafruit/Adafruit_Python_DHT/blob/master/source/common_dht_read.c#L29.

A couple of questions remain:

  • What hardware does this code target? libgpiod is pretty generic. We should probably at least test this with Raspberry Pi 1 / Zero.
  • Using slow mode (!fast_linux) still seems to work in general. However, I sometimes get temperature readings of 0.0 instead of an Exception when things go wrong. Has this been the case before? What hardware is the slow mode targeting?

A side note for people trying to optimize this code in future: Relying on interrupts works in general and makes the code much more elegant. However, that approach gets rid of any busy-waiting and therefore makes the problem at hand even worse: the CPU will idle even more and signal transition events will be lost.

@ladyada
Copy link
Member

ladyada commented Jan 22, 2020

hi, it needs to work with pi 1 / zero for sure!

@michaellass
Copy link
Contributor Author

hi, it needs to work with pi 1 / zero for sure!

Makes sense. However, I don't have an rpi 1 / zero at hand to test. So I need to rely on others to test.

Do you know if the slow mode was designed for the rpi 1 or if it targets some other specific hardware?

@ladyada
Copy link
Member

ladyada commented Jan 22, 2020

just pi 1 - everything else is much faster!

@michaellass
Copy link
Contributor Author

I now also tested this on a Raspberry Pi 2 B. Measured error rates getting data from a DHT22:

  • version currently shipped with adafruit_blinka: 205/600 (34%)
  • latest version of this PR: 76/600 (13%)

So I guess the only thing left to do is to test with a Pi 1 / Zero. However, as far as I can see blinka actually does not use the slow mode, even on those devices: https://github.com/adafruit/Adafruit_Blinka/blob/master/src/adafruit_blinka/microcontroller/bcm283x/pulseio/PulseIn.py#L48

* remove dead code, e.g. sig_handler is only registered for SIGINT
* properly initialize mutexes
* add missing line break after --help output
* declare variables as volatile only if needed
* fix/add/cleanup comments about locking strategies
We should not acquire locks within a signal handler as this may cause a
deadlock. Make print_pulses lock-free and let callers acquire a lock for
the ringbuffer if required.
@kilianbr
Copy link

I'm sorry, if my question is a bit offtopic. I've got the exact same problems described here. Do I get those latest fixes of CPU usage just by installing the newest version of the Adadruit_DHT library?
Feel free to delete my comment, if it's the wrong place..

@michaellass
Copy link
Contributor Author

@kilianbr When this or some similar solution is merged, it will ultimately land in Adafruit_Blinka which contains a compiled version of libgpiod_pulsein (https://github.com/adafruit/Adafruit_Blinka/tree/master/src/adafruit_blinka/microcontroller/bcm283x/pulseio). Afterwards updating Adafruit_Blinka should fix the high CPU load. However, for now you would have to compile libgpiod_pulsein yourself from this branch and replace the binary included in Adafruit_Blinka manually.

@subhead
Copy link

subhead commented Feb 12, 2020

@michaellass are there any instructions available how to compile this PR?

@MaximKeegan
Copy link

MaximKeegan commented Feb 13, 2020

cd ~
git clone https://github.com/michaellass/libgpiod_pulsein.git
cd libgpiod_pulsein
git checkout -b cpu_fix
cd src
make
cd /usr/local/lib/python3.7/dist-packages/adafruit_blinka/microcontroller/bcm283x/pulseio/
cp libgpiod_pulsei libgpiod_pulsei.bak
cp ~/libgpiod_pulsein/src/libgpiod_pulsein ./

Something like this

@jannikac
Copy link

Just tested your PR on Raspberrypi Zero W, using the Method @MaximKeegan described. However putting the compiled binary in /home/pi/.local/lib/site-packages/adafruit_blinka/microcontroller/bcm283x/pulseio/libgpiod_pulsein since that is what htop tells me is using ~95% CPU. Doesnt seem to work. Compiled it, put it in place but still uses ~95% CPU.
Steps to reproduce:
Start python interpreter

import adafruit_dht
import board
dht = adafruit_dht.DHT22(board,D4)

Let me know if you need any logs.

@michaellass
Copy link
Contributor Author

There is a small but important error in the instructions above:

git checkout -b cpu_fix has to be git checkout cpu-fix. Otherwise you create a new branch from master which does not include the changes here. So correct should be:

cd ~
git clone https://github.com/michaellass/libgpiod_pulsein.git
cd libgpiod_pulsein
git checkout cpu-fix
cd src
make
cd ~/.local/lib/site-packages/adafruit_blinka/microcontroller/bcm283x/pulseio/
cp libgpiod_pulsein libgpiod_pulsein.bak
cp ~/libgpiod_pulsein/src/libgpiod_pulsein ./

@jannikac
Copy link

That fixed it, the process now only uses CPU when the sensor is being read. Great work!

Working on Raspberrypi Zero W

@michaellass
Copy link
Contributor Author

@jannikac Great! As far as I know you are the first one testing it on this class of device. Could you please keep an eye on how well this works for you over a couple of days (e.g., look for increased error rates) and report back?

@jannikac
Copy link

Let it run pretty much 24/7 with a script that regularly records temperature and sends it to a server. I occasionally logged in, the cpu usage seems nominal at all times.

@mfeif
Copy link

mfeif commented Feb 28, 2020

Any tips for compiling this beyond the illustration above? I don't think I have all the dependencies and whatnot on my pi; gpiod.h for example; not sure where to get that. (The github "home" page for this library doesn't have any instructions.)

Is it possible to offer instructions for cross-compiling on a faster computer, or for a package to be published?

Thanks for tracking down this problem for us!

@michaellass
Copy link
Contributor Author

Any tips for compiling this beyond the illustration above? I don't think I have all the dependencies and whatnot on my pi; gpiod.h for example; not sure where to get that. (The github "home" page for this library doesn't have any instructions.)

I think you are already pretty close. libgpiod is the only special requirement. To build something against that library, you need to install the -dev package:

apt install libgpiod-dev

Apart from that only gcc and make should be required.

@mfeif
Copy link

mfeif commented Feb 28, 2020

I was able to compile and replace the library. Thank you!

Unfortunately, it still pegs a CPU at 100%; I'm using the example code from the IR sensor library. It used to sit idle 'till there was some activity on the pin, and then it pegged and stayed there. Now, as soon as one runs the script it hits 100% and stays there.

Anything I can do to help gather data?

@michaellass
Copy link
Contributor Author

Unfortunately, it still pegs a CPU at 100%; I'm using the example code from the IR sensor library.

Can you provide a link to that library? The fix here only works if the sensor is explicitly paused. So as long as the IR library is waiting to receive something from the sensor, you will still see 100% CPU.

@mfeif
Copy link

mfeif commented Feb 29, 2020

The fix here only works if the sensor is explicitly paused.

Aha. That's very important to know! Thanks.

Here's the link: https://github.com/adafruit/Adafruit_CircuitPython_IRRemote

Any tips/strategies for getting CPU use down? Should I put a short sleep in my loop and pause the pulse while that's happening? I tried that, but it doesn't seem to help (even with a long wait of 1s); perhaps I don't understand what is meant by pausing.

@Saeven
Copy link

Saeven commented Apr 27, 2020

Anxious to get this module update, thanks for doing the work.

Copy link
Contributor

@makermelissa makermelissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on Pi 4 and Pi Zero W and saw reduced CPU usage in both. Great idea with threading.

@makermelissa makermelissa merged commit 19cff93 into adafruit:master May 7, 2020
@ladyada
Copy link
Member

ladyada commented May 7, 2020

thanks everyone, its great to see this fix go in :)

@jlegrand62
Copy link

Fantastic job! I was using Adafruit-Blinka-4.4.0 on a RasPi 1 and had issues with this 100% CPU usage!
Using pip, I updated to Adafruit-Blinka-4.8.1 and now CPU usage is around 15% to 20%!

Great job and many thanks for the work!

FYI, dependencies:
Adafruit-PlatformDetect-2.11.1 Adafruit-PureIO-1.1.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

100% CPU use of libgpiod_pulsein on Raspberry Pi

10 participants