Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved logging of hwmon sensor errors in AsynchronousMetrics #27031

Merged
merged 2 commits into from
Aug 10, 2021

Conversation

excitoon
Copy link
Contributor

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Before, each second:

2021.07.29 20:48:41.006906 [ 156 ] {} <Error> void DB::AsynchronousMetrics::update(std::chrono::system_clock::time_point): Code: 74. DB::ErrnoException: Cannot read from file /sys/class/hwmon/hwmon3/temp1_input, errno: 22, strerror: Invalid argument. (CANNOT_READ_FROM_FILE_DESCRIPTOR), Stack trace (when copying this message, always include the lines below):

0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x12663a6c in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x926c0fa in /usr/bin/clickhouse
2. DB::throwFromErrnoWithPath(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, int) @ 0x926cd50 in /usr/bin/clickhouse
3. DB::ReadBufferFromFileDescriptor::nextImpl() @ 0x92ae6cd in /usr/bin/clickhouse
4. void DB::readIntTextImpl<long, void, (DB::ReadIntTextCheckOverflow)0>(long&, DB::ReadBuffer&) @ 0x93c9072 in /usr/bin/clickhouse
5. DB::AsynchronousMetrics::update(std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000l> > >) @ 0xf739d48 in /usr/bin/clickhouse
6. DB::AsynchronousMetrics::run() @ 0xf73edaa in /usr/bin/clickhouse
7. void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPool::ThreadFromGlobalPool<DB::AsynchronousMetrics::start()::$_0>(DB::AsynchronousMetrics::start()::$_0&&)::'lambda'(), void ()> >(std::__1::__function::__policy_storage const*) @ 0xf740774 in /usr/bin/clickhouse
8. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x929ff0b in /usr/bin/clickhouse
9. void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()> >(void*) @ 0x92a2033 in /usr/bin/clickhouse
10. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
11. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
 (version 21.9.1.1)

After:

2021.07.30 17:38:14.047710 [ 7 ] {} <Debug> AsynchronousMetrics: Hardware monitor 'nouveau', sensor '' exists but could not be read, error 22.

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jul 30, 2021
@kitaisreal kitaisreal self-assigned this Jul 30, 2021
}
catch (const ErrnoException & e)
{
LOG_DEBUG(&Poco::Logger::get("AsynchronousMetrics"), "Hardware monitor '{}', sensor '{}' exists but could not be read, error {}.", hwmon_name, sensor_name, e.getErrno());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@excitoon should not we rethrow exception ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess no. Why would we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that before if readText throws exception we did not catch that and we pass it to high level code.
Should not we do that now ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-not-for-changelog This PR should not be mentioned in the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants