Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't return cached heartbeat read when query service is down to avoi… #4689

Merged
merged 1 commit into from
Mar 12, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions go/vt/vttablet/heartbeat/reader.go
Original file line number Diff line number Diff line change
Expand Up @@ -227,3 +227,7 @@ func (r *Reader) recordError(err error) {
r.errorLog.Errorf("%v", err)
readErrors.Add(1)
}

func (r *Reader) IsOpen() bool {
return r.isOpen
}
22 changes: 15 additions & 7 deletions go/vt/vttablet/tabletserver/tabletserver.go
Original file line number Diff line number Diff line change
Expand Up @@ -152,12 +152,12 @@ type TabletServer struct {
// for health checks. This does not affect how queries are served.
// target specifies the primary target type, and also allow specifies
// secondary types that should be additionally allowed.
mu sync.Mutex
state int64
lameduck sync2.AtomicInt32
target querypb.Target
alsoAllow []topodatapb.TabletType
requests sync.WaitGroup
mu sync.Mutex
state int64
lameduck sync2.AtomicInt32
target querypb.Target
alsoAllow []topodatapb.TabletType
requests sync.WaitGroup

// The following variables should be initialized only once
// before starting the tabletserver.
Expand Down Expand Up @@ -229,7 +229,7 @@ type TxPoolController interface {
AcceptReadOnly() error

// InitDBConfig must be called before Init.
InitDBConfig(dbcfgs *dbconfigs.DBConfigs)
InitDBConfig(dbcfgs *dbconfigs.DBConfigs)

// Init must be called once when vttablet starts for setting
// up the metadata tables.
Expand Down Expand Up @@ -1867,6 +1867,14 @@ func (tsv *TabletServer) BroadcastHealth(terTimestamp int64, stats *querypb.Real
// HeartbeatLag returns the current lag as calculated by the heartbeat
// package, if heartbeat is enabled. Otherwise returns 0.
func (tsv *TabletServer) HeartbeatLag() (time.Duration, error) {
// If the reader is closed and we are not serving, then the
// query service is shutdown and this value is not being updated.
// We return healthy from this as a signal to the healtcheck to attempt
// to start the query service again. If the query service fails to start
// with an error, then that error is be reported by the healthcheck.
if !tsv.hr.IsOpen() && !tsv.IsServing() {
return 0, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to return something to indicate the reader is closed?

i.e. -1 or max int

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could return -1, if we return max int, then we will get an unhealthy due to replication lag issue. We could also return a custom error that specifies the reader is closed. We would need to add code to handle -1 or the custom error inside the healthcheck.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is an improvement already, I'll merge this. We can fix forward as needed.

}
return tsv.hr.GetLatest()
}

Expand Down