-
Notifications
You must be signed in to change notification settings - Fork 594
Make metrics mgr fail fast when unexpected errors happen. #1473
Conversation
String sinkId = threadName; | ||
Integer thisSinkRetryAttempts = sinksRetryAttempts.remove(sinkId); | ||
if (exception instanceof Error || thread.getId() == mainThreadId) { | ||
LOG.severe("Would not recover from error. Metrics Manager halts immediately"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of logging the exception separately above, can we have an if/then with a different log message in each. That way it's clear what's causing what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but I don't understand. This implementation logs exception at the start once; and then followings are handling logic with a different if/then ... Those are not causes but handling logics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion was instead of doing this:
- log all the details of the exception
if it's the main thread
- log that we have to die (due to the previously logged exception)
to the log reader it will be more clear if we consolidate those two:
if it's the main thread
- log all the details of the exception and that we have to die
else
- log all the details of the exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Will do that.
Runtime.getRuntime().halt(1); | ||
} | ||
|
||
String sinkId = thread.getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sinceThreadName? it's not the thread id, it's the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current implementation, we enforce the thread-name the same as sink-id. Do you think we need to add an extra map to record it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't suggesting an extra map, just consistency in variable naming. Above we assign getId()
to a variable named mainThreadId
:
mainThreadId = Thread.currentThread().getId();
bit here we assign getName()
to a variable named sinkId
(instead of sinkName
):
sinkId = thread.getName();
this is a minor nit though and can be disregarded if you think sinkId
is more representative bases on it's usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments in the source code to avoid confusion. sinkId is more representative, since it is defined inside the metrics_sink.yaml
@maosongfu - the title says that it is the stream manager instead of metrics manager? Can you please correct? |
@maosongfu - is this completed? |
@kramasamy Updated. |
👍 |
* Make stream mgr fail fast when unexpected errors happen.
Make metrics mgr fail fast when unexpected errors happen to avoid dangling process.
This is a bug reported by Twitter internally.