kvserver: improve error message for rare lease errors #64080

Merged: 1 commit, Apr 30, 2021

andreimatei

In some rare cases, the status of a lease in relation to a
request/timestamp can't be determined. For the request's client this
results in a NotLeaseHolderError. This patch improves the message of
that error.

In particular, this test failure[1] seems to show that a node couldn't
verify that an existing lease had expired because its gossiped liveness
info was stale. This sounds interesting, and if the test fails again the
improved message should help.

[1] #57932 (comment)

Release note: None

@andreimatei andreimatei requested review from knz and tbg April 22, 2021 16:43

@knz knz left a comment

:lgtm:

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @tbg)

  return nil, false, roachpb.NewError(
-     newNotLeaseHolderError(roachpb.Lease{}, r.store.StoreID(), r.mu.state.Desc,
-         "lease state couldn't be determined"))
+     newNotLeaseHolderError(roachpb.Lease{}, r.store.StoreID(), r.mu.state.Desc, msg))
tbg commented:

nit: probably no need to improve this since @knz thought what you have was fine, but if instead of a string you attached an EncodedError, wouldn't you retain proper redactability across the RPC and wouldn't it also feel "right" to attach an error to a lease with state ERROR?

knz commented:

I agree with Tobias - I approved this because I knew that Andrei's main thrust was to remove the redaction markers and the PR achieves that, but moving to a structured error would be a nice improvement on top of that.

tbg commented:

👍🏽 In the interest of keeping things snappy I would merge as-is and we'll keep this in mind for higher-value use cases

@andreimatei andreimatei left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @tbg)


pkg/kv/kvserver/replica_range_lease.go, line 1111 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

👍🏽 In the interest of keeping things snappy I would merge as-is and we'll keep this in mind for higher-value use cases

Well, I would like to learn what the right thing is; it's not like this PR is particularly urgent. But I'm not entirely sure what the suggestion is. You're saying turn NotLeaseHolderError.CustomMsg into an EncodedError (say, a cause field)? And then what would I do so that pErr.String() reaches into pErr.EncodedErr.(*NotLeaseHolderError).Cause? Would I make NotLeaseHolderError implement SafeFormatter and rely on

s.Print(err)
?
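The delegation being asked about can be illustrated with a toy model. The sketch below is not the real redaction library API: `safeFormatter`, `printer`, `leaseCause`, and `redactable` are hypothetical stand-ins written against the standard library only. It assumes a printer that emits constant strings as-is and wraps anything it cannot vouch for in ‹…› markers, and it shows the outer error handing its cause back to the printer so the cause's own formatting hook decides what is safe.

```go
package main

import (
	"fmt"
	"strings"
)

// safeFormatter is a toy analogue of a redaction-aware formatting hook.
type safeFormatter interface {
	safeFormat(p *printer)
}

// printer is a toy stand-in for a redaction-aware printer.
type printer struct{ sb strings.Builder }

// safe writes a constant string that needs no redaction.
func (p *printer) safe(s string) { p.sb.WriteString(s) }

// print delegates to safeFormat when the value implements it;
// otherwise it wraps the value in redaction markers.
func (p *printer) print(v interface{}) {
	if f, ok := v.(safeFormatter); ok {
		f.safeFormat(p)
		return
	}
	fmt.Fprintf(&p.sb, "‹%v›", v)
}

// leaseCause is a hypothetical cause whose detail may be sensitive.
type leaseCause struct{ detail string }

func (c *leaseCause) Error() string {
	return "lease state could not be determined: " + c.detail
}

func (c *leaseCause) safeFormat(p *printer) {
	p.safe("lease state could not be determined: ")
	p.print(c.detail) // a plain string, so it gets marked as unsafe
}

// notLeaseHolderError is a simplified stand-in, not the real roachpb type.
type notLeaseHolderError struct{ Cause error }

func (e *notLeaseHolderError) Error() string {
	return "not leaseholder: " + e.Cause.Error()
}

func (e *notLeaseHolderError) safeFormat(p *printer) {
	p.safe("not leaseholder: ")
	p.print(e.Cause) // the printer reaches into the cause's own safeFormat
}

// redactable renders v with redaction markers around unsafe parts.
func redactable(v interface{}) string {
	var p printer
	p.print(v)
	return p.sb.String()
}

func main() {
	err := &notLeaseHolderError{Cause: &leaseCause{detail: "n2 liveness is stale"}}
	fmt.Println(redactable(err))
}
```

In this model the outer error never builds a string from the cause itself, so whatever the cause marks as safe stays unmarked and everything else stays inside the markers.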

@knz knz left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)


pkg/kv/kvserver/replica_range_lease.go, line 1111 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Well, I would like to learn what the right thing is; it's not like this PR is particularly urgent. But I'm not entirely sure what the suggestion is. You're saying turn NotLeaseHolderError.CustomMsg into an EncodedError (say, a cause field)? And then what would I do so that pErr.String() reaches into pErr.EncodedErr.(*NotLeaseHolderError).Cause? Would I make NotLeaseHolderError implement SafeFormatter and rely on

s.Print(err)
?

yes you could do that, to follow the current pattern.

I wouldn't call it "the right thing" though (see separate discussion on slack)

@andreimatei andreimatei left a comment

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @tbg)


pkg/kv/kvserver/replica_range_lease.go, line 1111 at r1 (raw file):

Previously, knz (kena) wrote…

yes you could do that, to follow the current pattern.

I wouldn't call it "the right thing" though (see separate discussion on slack)

yeah... My enthusiasm diminished a little bit; I'm merging as is.

craig bot commented Apr 30, 2021

Build succeeded:

@craig craig bot merged commit eaeaa7f into cockroachdb:master Apr 30, 2021
@andreimatei andreimatei deleted the small.lease-error-msg branch January 20, 2022 16:26