Skip to content

Commit

Permalink
logger: Use a function-level static var for the logger object
Browse files Browse the repository at this point in the history
  • Loading branch information
rgacogne committed Jun 1, 2018
1 parent ce633f7 commit 976adf0
Show file tree
Hide file tree
Showing 2 changed files with 17 additions and 2 deletions.
15 changes: 14 additions & 1 deletion pdns/logger.cc
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,20 @@ extern StatBag S;
pthread_once_t Logger::s_once;
pthread_key_t Logger::g_loggerKey;

Logger g_log("", LOG_DAEMON);
Logger& getLogger()
{
/* Since the Logger can be called very early, we need to make sure
that the relevant parts are initialized no matter what, which is tricky
because we can't easily control the initialization order, especially with
built-in backends.
t_perThread is thread_local, so it will be initialized when first accessed,
but we need to make sure that the object itself is initialized, and making
it a function-level static variable achieves that, because it will be
initialized the first time we enter this function at the very last.
*/
static Logger log("", LOG_DAEMON);
return log;
}

void Logger::log(const string &msg, Urgency u)
{
Expand Down
4 changes: 3 additions & 1 deletion pdns/logger.hh
Original file line number Diff line number Diff line change
Expand Up @@ -119,7 +119,9 @@ private:
static pthread_key_t g_loggerKey;
};

extern Logger g_log;
Logger& getLogger();

#define g_log getLogger()

#ifdef VERBOSELOG
#define DLOG(x) x
Expand Down

6 comments on commit 976adf0

@tih
Copy link

@tih tih commented on 976adf0 Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the crash that was observed on NetBSD/amd64, when a new thread would SEGV when it first tried to log something. It also fixes the crash, in the same situation, that was observed on NetBSD/arm32, even with the previous attempts that kept the /amd64 version from crashing.

However, now that the recursor is able to log without crashing on /arm32, it crashes with a SEGV later:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  MemRecursorCache::replace (this=0x1, now=8043423180755413477, qname=..., 
    qt=..., content=..., signatures=..., authorityRecs=..., 
    auth=auth@entry=true, ednsmask=..., state=state@entry=Insecure)
    at recursor_cache.cc:241
241       d_cachecachevalid = false;
[Current thread is 1 (process 2)]
(gdb) bt
#0  MemRecursorCache::replace (this=0x1, now=8043423180755413477, qname=..., 
    qt=..., content=..., signatures=..., authorityRecs=..., 
    auth=auth@entry=true, ednsmask=..., state=state@entry=Insecure)
    at recursor_cache.cc:241
#1  0x04ba83ec in primeHints () at reczones.cc:61
#2  0x04b71c9c in recursorThread (n=<optimized out>, worker=<optimized out>)
    at pdns_recursor.cc:3429
#3  0x6ff79088 in ?? () from /usr/lib/libstdc++.so.8
#4  0x7009b8fc in ?? () from /usr/lib/libpthread.so.1
Register 25 is not available
(gdb) print d_cachecachevalid
Cannot access memory at address 0x85

@tih
Copy link

@tih tih commented on 976adf0 Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The auth server seems to work fine with this patch. With a gpgsql backend, I can create and modify zones, secure them, and query them as expected. Nothing fails, logging works.

@tih
Copy link

@tih tih commented on 976adf0 Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test program on http://en.cppreference.com/w/cpp/language/storage_duration works just fine on the Pi, though, which should indicate that thread local storage basically works...?

@tih
Copy link

@tih tih commented on 976adf0 Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure that the "Cannot access memory at address 0x85" means that thread local storage is corrupt. I think it's gdb that doesn't know what it's doing, and 0x85 might be an offset within the (thread local) data area representing the cache object. Viz:

(gdb) up
#1  0x092683ec in primeHints () at reczones.cc:61
61            t_RC->replace(time(0), DNSName(templ), QType(QType::A), aset, vector<std::shared_ptr<RRSIGRecordContent>>(), vector<std::shared_ptr<DNSRecord>>(), true, boost::none, validationState); // auth, nuke it all
(gdb) print t_RC
Cannot find thread-local variables on this target
(gdb) print c
$1 = 97 'a'

Obviously, gdb simply doesn't handle thread local on arm, so it ought to have said the same above.

@tih
Copy link

@tih tih commented on 976adf0 Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another interesting thing is that t_RC is declared as an uninitialized thread local pointer to a MemRecursorCache instance, and the first thing that's done with it is to check if it's false (by implication, null), in which case a new instance is created. Well, on NetBSD/arm32, compiled by GCC 6.4, it's not false right off the bat, so the 'new' doesn't happen. (Forcing that makes it run a bit further, but the program crashes with another segmentation violation shortly thereafter.) Whether the error is the non-zeroing of the variable, or the assumption that zeroing will happen, I do not know.

@tih
Copy link

@tih tih commented on 976adf0 Jun 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting back to where we started, though: this is clearly a good, simple fix for the original problem, which was crashes caused by attempting to log before the logger was properly initialized. :)

Please sign in to comment.