Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No data on one temperature sensor prevents reading any other temp sensor #665

Closed
1 of 5 tasks
marcv81 opened this issue Apr 9, 2019 · 9 comments
Closed
1 of 5 tasks

Comments

@marcv81
Copy link
Contributor

marcv81 commented Apr 9, 2019

Describe the bug

host_linux.go / SensorsTemperature() returns an error if any of the temperature sensors does not have any data available. It is however an acceptable state for a sensor and should not be treated as an error. Furthermore a single sensor with no data available should not prevent reading the temperatures from any of the other sensors.

Example of a sensor which functions correctly but does not provide any data under certain conditions: Intel Wireless LAN hardware and the iwlwifi module.

WiFi on:

$ cat /sys/class/hwmon/hwmon4/name 
iwlwifi
$ cat /sys/class/hwmon/hwmon4/temp1_input 
41000

WiFi off:

$ cat /sys/class/hwmon/hwmon4/temp1_input 
cat: /sys/class/hwmon/hwmon4/temp1_input: No data available

To Reproduce

Please see downstream bug: influxdata/telegraf#5692.

Expected behavior

When any of the temperature sensors does not have any data available, host_linux.go / SensorsTemperature() should still read and return the temperature values for the remaining sensors.

Environment (please complete the following information):

  • Windows: [paste the result of ver]
  • Linux: [paste contents of /etc/os-release and the result of uname -a]
  • Mac OS: [paste the result of sw_vers and uname -a
  • FreeBSD: [paste the result of freebsd-version -k -r -u and uname -a]
  • OpenBSD: [paste the result of uname -a]

/etc/os-release

NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

uname -a

Linux 5.0.7-050007-generic #201904052141 SMP Fri Apr 5 21:43:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 16.04 + Kernel 5.0.7.

@Lomanic
Copy link
Collaborator

Lomanic commented Apr 9, 2019

Quite similar problem as #572, but this time we may check a given explicit error and only bail out on it only on some cases.

@marcv81
Copy link
Contributor Author

marcv81 commented Apr 10, 2019

This case is indeed very similar to Windows PartitionsWithContext(). Unfortunately the other issue appears to have stalled.

Proposal

  • Add an Err field to TemperatureStat. This makes it trivial for downstream to iterate over []TemperatureStat and decide what to do for each sensor, possibly going as far as checking the exact error type and message.
  • Return nil as the second return value only if there was no error with any of the sensors. In case there are sensors errors we should return an error instructing downstream to check TemperatureStat.Err. If we wanted to be extremely conservative with respect to backward compatibility we could return the first sensor error encountered.

The proposal does not break anything downstream because errors are still returned under the same conditions. There only is a difference if downstream read TemperatureStat without checking for errors, but that was a downstream bug in the first place.

The proposal also allows downstream to easily take advantage of this bugfix whenever they are ready.

Thoughts?

@marcv81
Copy link
Contributor Author

marcv81 commented Apr 11, 2019

Proposal: #668

This is slightly different than my previous proposal and has an even lower impact. There is no change to the existing functions and data types. Two new functions are available for the downstream projects who care about the case when only some of the temperature sensors have data.

@marcv81
Copy link
Contributor Author

marcv81 commented Apr 25, 2019

New proposed fix: #675

@shirou
Copy link
Owner

shirou commented Apr 27, 2019

So your PR suggests introduce new Warning type which has error interface. It does not break compatibility and provide more information if a user wants.

I think it is very good solution for this problem. How do you think ? > @Lomanic

And I have another proposal (sorry not exactly related to this issue): the PR defines it in hosts but perhaps internal/common/common.go is better place because other platform can use it. However it is under internal so a user can not cast to Warning. It is better to introduce common package on top and move internal to there, IMO.

@marcv81
Copy link
Contributor Author

marcv81 commented Apr 27, 2019

Thanks for your comment and kind words @shirou.

I agree with you that Warnings would be better with a wider scope than host. However it is very important that a user can do type assertion on Warnings. So I would prefer to introduce a common package please.

Here is what I am trying to do downstream: https://github.com/marcv81/telegraf/commit/058a83d18eb025a50490b291c27953c4a9712d9d. For now I just ignore the warnings, later I might log them appropriately.

@marcv81
Copy link
Contributor Author

marcv81 commented Apr 29, 2019

Hi @Lomanic and @shirou! @Lomanic you have a good point that we should not export anything more unless absolutely required, otherwise maintenance will be harder.

I see a few ways forward without merging #678.

  1. Rename Warnings to TemperatureWarnings to make it explicit that this error type is not generic and should only be used in host. The is a duplication risk if we find we want to use Warnings elsewhere, but the code is short and unlikely to grow much.
  2. Drop the idea of a new error type. Return a new generic error with the message "Could not read N temperature sensors" instead. If len(temperatures) > 0 the user knows that all the readable sensors were read. We do lose information about the errors, but it is the only way to avoid adding anything to the API.
  3. Create a new common package for Warnings. I don't think it is a good idea because it is hard to find a good name: common is not specific enough and we already have internal/common so it might be confusing. A more specific name would require knowledge of what we may add in the future.

These idea are listed in order of preference. Do you have any comment? Or any better idea? Thanks for your help!

shirou added a commit that referenced this issue Jun 1, 2019
@shirou
Copy link
Owner

shirou commented Jun 1, 2019

#675 is merged. The correct place to Warning will be discuss another issue. Thank you for the issue!

@shirou shirou closed this as completed Jun 1, 2019
@marcv81
Copy link
Contributor Author

marcv81 commented Jun 1, 2019

Thanks for the pleasant discussion in this issue and the associated PR @shirou and @Lomanic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants