Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-24686 [LOG] Log improvement in Connection#close #2219

Merged
merged 1 commit into from
Aug 21, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,10 @@ public void close() {
if (!closed.compareAndSet(false, true)) {
return;
}
LOG.info("Connection has been closed by {}.", Thread.currentThread().getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also be at debug level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is helpful to identify who closed the connection improperly in production env.
Level debug is inconvenient since we have to re-run the user application to reproduce it. Meanwhile, it should not be called frequently.
Maybe I did not get your concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to keep this at INFO. Connection objects are supposed to be long lived so we should see this message rarely. Ideally once per cluster per JVM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean to correlate with no of occurrences of this log, I meant to say user who is interested to know which thread closed the connection would also likely want to check out call stacktrace. So if call stack is at debug level, let's also keep thread name at debug level.

But yeah, not everyone enables debug log for production use-case and hence, maybe we should keep both at info level? Stacktrace is also equally interesting right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for lack of clarification in previous comment. Also, changes look good, just that once we have both at info level, we should be good to go ahead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, how about first we gather stack trace and then log both threadname and stacktrace together in same log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the stacktrace is an option if the thread name is helpless. It's almost enough with meaningful name. I'm also worried log stack at info level is confusing since the format is like an exception.

Copy link
Contributor

@virajjasani virajjasani Aug 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also worried log stack at info level is confusing since the format is like an exception.

Even though it is Exception, but we do want to log which Exception caused Connection closure (once in a while) right? If so, then keeping threadname at info level and stacktrace at debug somehow doesn't fit well IMHO.
If we know for certain that threadname is useful to get sufficient info, then we should not log stacktrace at all, and if stacktrace is required for debugging, then it should be at the same level as threadname.

Level debug is inconvenient since we have to re-run the user application to reproduce it.

This is exactly applicable to stacktrace also. If application is running with info level and if threadname is not sufficient to know the cause, then we can't get stacktrace either because our application was running with info level at that time and stacktrace is present at debug level in code. ¯\_(ツ)_/¯

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For background, I left a longer comment awhile ago on HBASE-24686 where I asked for some indication of the close at info and the stack trace at debug.

we already have other things that get logged at INFO when a connection is closing, but it's incidental and could happen in other ways. so an INFO message that clearly demarcates connection close is a thing I really want from this change. Enough so that I am sure I'll be back requesting it again the very next time I run into a production problem where I don't see the message, because almost no one I support runs DEBUG in production as a matter of course. Most of the times where this log message would have helped me in the last ~6 months the thread name and an unambiguous "we closed the connection" would suffice.

the stack trace would help with a specific problem we hit in a dev set up with an apparent bug in the hmaster. It did not happen often but it did happen regularly, so changing the log level to DEBUG to see a stack trace would have worked fine for us.

If you feel strongly about not having the messages at different log levels, @virajjasani, I'd rather see both at INFO then both at DEBUG, even though I generally am against non-error related stack traces at log levels above DEBUG.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the times where this log message would have helped me in the last ~6 months the thread name and an unambiguous "we closed the connection" would suffice.

That's perfectly fine, if thread name is enough for prod env and if we have a way to repro in dev env, we can let stacktrace be at debug. The only thing I was concerned about was whether not finding stacktrace in prod env would still keep things unclear for user. However, here I believe once we get the thread name, we can find recent logs of that thread to find more.

I generally am against non-error related stack traces at log levels above DEBUG.

+1 for this unless the non-error trace is useful for detailed debugging, in which case maybe we can write something specifically to indicate to not treat this as erroneous condition.
However, for our current PR, it seems thread-name is most likely going to provide sufficient info.

if(LOG.isDebugEnabled()){
logCallStack(Thread.currentThread().getStackTrace());
}
IOUtils.closeQuietly(clusterStatusListener);
IOUtils.closeQuietly(rpcClient);
IOUtils.closeQuietly(registry);
Expand All @@ -212,6 +216,16 @@ public void close() {
}
}

private void logCallStack(StackTraceElement[] stackTraceElements) {
StringBuilder stackBuilder = new StringBuilder("Call stack:");
for (StackTraceElement element : stackTraceElements) {
stackBuilder.append("\n at ");
stackBuilder.append(element);
}
stackBuilder.append("\n");
LOG.debug(stackBuilder.toString());
}

@Override
public AsyncTableRegionLocator getRegionLocator(TableName tableName) {
return new AsyncTableRegionLocatorImpl(tableName, this);
Expand Down