-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(server): reduce the screaming heartbeat logs #836
Conversation
@unihorn @xiangli-cmu |
@@ -97,6 +109,7 @@ func NewPeerServer(psConfig PeerServerConfig, client *Client, registry *Registry | |||
serverStats: serverStats, | |||
|
|||
timeoutThresholdChan: make(chan interface{}, 1), | |||
logBackoffs: make(map[string]*logBackoff), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go fmt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
Gah, 23cc856. Need a cup of tea before doing PRs. |
Good fix. |
@unihorn Fixed the unknown thing. |
} | ||
log.Infof("%s: warning: heartbeat timed out: '%v'", s.Config.Name, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philips Can we check the lastActivity here. If the lastActivity is smaller than the next backoff and greater than the previous backoff, we should reset the backoff. So we probably do not need to monitor routine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. That is a better idea.
On Jun 7, 2014 10:37 AM, "Xiang Li" notifications@github.com wrote:
In server/peer_server.go:
}
log.Infof("%s: warning: heartbeat timed out: '%v'", s.Config.Name, name)
@philips https://github.com/philips Can we check the lastActivity here.
If the lastActivity is smaller than the next backoff and greater than the
previous backoff, we should reset the backoff. So we probably do not need
to monitor routine?—
Reply to this email directly or view it on GitHub
https://github.com/coreos/etcd/pull/836/files#r13520506.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiangli-cmu fixed in 56a0e6a can you review?
@philips lgtm if it works. |
Currently the only way we know that a peer isn't getting a heartbeat is an edge triggered event from go raft on every missed heartbeat. This means that we need to do some book keeping in order to do exponential backoff. The upside is that instead of screaming thousands of log lines before a machine hits the default removal of 30 minutes it is only ~100.
fix(server): reduce the screaming heartbeat logs
Currently the only way we know that a peer isn't getting a heartbeat is an
edge triggered event from go raft on every missed heartbeat. This means that
we need to do some book keeping in order to do exponential backoff.
The upside is that instead of screaming thousands of log lines before a
machine hits the default removal of 30 minutes it is only ~100.