-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachpb: use EncodedError in Error #56603
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re the several initial commits:
if
Error
hasEqual()
codegen'ed for it, it needs to be present
onEncodedError
as well, which it isn't.
It should! there's no obstacle against it. I would rather go that route and trim these commits. reflect.DeepEqual is fraught with peril.
Regarding the rest: I see you are keeping the roachpb.Error
a protobuf message. This blocks you from embedding a Go error
field inside. This also prevents you from implementing the Cause()
/ Unwrap()
interface. That makes me sad somewhat. Is this really what we want?
I could see a world where the roachpb.Error
message is renamed roachpb.ErrorProto
, and roachpb.Error
becomes a regular go struct, which embeds its details as an error
, and only translate to/from roachpb.ErrorProto
at RPC boundaries. What do you think?
Reviewed 7 of 7 files at r1, 3 of 3 files at r2, 2 of 2 files at r3, 2 of 2 files at r4, 14 of 14 files at r5, 6 of 6 files at r6, 9 of 9 files at r7, 1 of 1 files at r8, 16 of 17 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/ccl/backupccl/backup_processor.go, line 227 at r9 (raw file):
rawRes, pErr := kv.SendWrappedWith(ctx, flowCtx.Cfg.DB.NonTransactionalSender(), header, req) if pErr != nil { if _, ok := pErr.GetDetail().(*roachpb.WriteIntentError); ok {
errors.As
?
pkg/kv/kvclient/kvcoord/txn_coord_sender.go, line 611 at r9 (raw file):
ctx, abortedErr, roachpb.NormalUserPriority, tc.clock) return roachpb.NewError(roachpb.NewTransactionRetryWithProtoRefreshError( abortedErr.String(), tc.mu.txn.ID, newTxn))
are we comfortable embedding the string, and not the full error object?
pkg/roachpb/errors.go, line 334 at r9 (raw file):
if e.EncodedError != (errorspb.EncodedError{}) { err := errors.DecodeError(context.Background(), e.EncodedError) if iface, ok := err.(transactionRestartError); ok {
errors.As
?
pkg/sql/execinfra/processorsbase.go, line 670 at r9 (raw file):
// TransactionRetryWithProtoRefreshErrors) don't have any uncertainty. if ure := (*roachpb.UnhandledRetryableError)(nil); errors.As(err, &ure) { if _, uncertain := ure.PErr.GetDetail().(*roachpb.ReadWithinUncertaintyIntervalError); uncertain {
errors.As
?
cb09a7b
to
ad2196b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should! there's no obstacle against it. I would rather go that route and trim these commits. reflect.DeepEqual is fraught with peril.
I don't mind adding gogoproto.equal
to errors
, but I still want to remove all of this useless codegen. It actually makes me uneasy for quite some of the structs that someone should want to compare them. If a struct legit needs to be compared - sure, no problem, but I haven't found that to be true for any of the ones removed in that PR.
Reviewed 6 of 7 files at r1, 1 of 3 files at r2, 2 of 2 files at r4, 10 of 14 files at r5, 4 of 6 files at r6, 9 of 9 files at r7, 1 of 1 files at r8, 17 of 17 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/ccl/backupccl/backup_processor.go, line 227 at r9 (raw file):
Previously, knz (kena) wrote…
errors.As
?
I want to do a full pass against GetDetail()
to make it return an error
(instead of an ErrorDetail
). Then your error linter will kick in and I'll fix the 5000 other instances of this that aren't captured in this diff. Not in this PR though.
pkg/kv/kvclient/kvcoord/txn_coord_sender.go, line 611 at r9 (raw file):
Previously, knz (kena) wrote…
are we comfortable embedding the string, and not the full error object?
What do you mean? The message is identical to String()
at the time of writing. I don't like that we're stuffing it in that way, but it's the old behavior and I don't want to solve all of the paper cuts in this PR. Note that I left a TODO on NewTransactionRetryWithProtoRefreshError about this, curious what you think of it.
pkg/roachpb/errors.go, line 334 at r9 (raw file):
Previously, knz (kena) wrote…
errors.As
?
Done.
pkg/sql/execinfra/processorsbase.go, line 670 at r9 (raw file):
Previously, knz (kena) wrote…
errors.As
?
See other comment, keeping it mechanical for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)
pkg/kv/kvserver/store_send.go, line 247 at r10 (raw file):
} t.AppendRangeInfo(ctx, desc, l) // We have to write `t` back to `pErr` so that it picks up the changes.
This was made obvious to me through a test failure. I will need to run a full audit (yikes) before merging.
Essentially, we were mutating something returned from GetDetail()
and expecting that to affect the backing pErr
. But now with EncodedError
GetDetail()
basically makes a fresh copy for you. Which - maybe that is bad and we'll want to look into it for perf reasons, but I still think that pattern remains bad and we need to be explicit about mutating the original pErr
through it and not through some handle into its innards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't mind adding gogoproto.equal to errors, but I still want to remove all of this useless codegen
Discussed offline - yes, agreed. However I think we should then have some rule that reflect.DeepEqual should not be used throughout production code, just in tests.
LGTM otherwise - modulo completion of your investigation about interior mutability of the error detail.
Reviewed 5 of 5 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvclient/kvcoord/txn_coord_sender.go, line 611 at r9 (raw file):
Previously, tbg (Tobias Grieger) wrote…
What do you mean? The message is identical to
String()
at the time of writing. I don't like that we're stuffing it in that way, but it's the old behavior and I don't want to solve all of the paper cuts in this PR. Note that I left a TODO on NewTransactionRetryWithProtoRefreshError about this, curious what you think of it.
Understood thanks.
pkg/kv/kvserver/store_send.go, line 247 at r10 (raw file):
I still think that pattern remains bad and we need to be explicit about mutating the original pErr through it and not through some handle into its innards.
Agreed. Thanks for figuring it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline - yes, agreed. However I think we should then have some rule that reflect.DeepEqual should not be used throughout production code, just in tests.
I am not opposed to that, though it's somewhat orthogonal here. Even if .Equal
is defined, someone might use DeepEqual
in production.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/kv/kvserver/store_send.go, line 247 at r10 (raw file):
Previously, knz (kena) wrote…
I still think that pattern remains bad and we need to be explicit about mutating the original pErr through it and not through some handle into its innards.
Agreed. Thanks for figuring it out.
I did do the audit (at least for the ~40 or so prod callers) and this was the one bad apple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not opposed to that, though it's somewhat orthogonal here
ok works for me!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the rest: I see you are keeping the roachpb.Error a protobuf message. This blocks you from embedding a Go error field inside. This also prevents you from implementing the Cause() / Unwrap() interface. That makes me sad somewhat. Is this really what we want?
I haven't thought this far ahead, but generally speaking I would like that. A separation at the RPC boundaries makes sense and by having a bare struct everywhere else we can avoid the repeated calls to DecodeError
that the current implementation incurs. I'll save it for another PR, though. Also, I might be missing something, but I don't see why we can't implement Cause
and Unwrap
as is.
I've also started thinking about pulling the protos apart a little bit. Ideally, you would have a kvpb
package that has all the possible structured errors that kv
can emit. I think there is a big migration snag here though, in that the current errors would need to be kept in two locations for the time being, which I don't love. Given that and the limited mandate for making something happen here, I'll probably stop short of that, especially since a lot of actual code cleanup remains to be done. But I wanted to dream at least for a second!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
ad2196b
to
23f3e27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we can't implement Cause and Unwrap as is.
The only way to implement them in your current framework would be to re-instantiate an error
every time they are called, via Decode
. This is rather expensive - these functions are called all over the place in error predicates.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)
067254f
to
740cd22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 14 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pbardea and @tbg)
740cd22
to
aec0059
Compare
`roachpb.Error` has long been poor to work with. It could only transport structured errors for a hard-coded enum message (`ErrorDetail`) and had a complex and circuituous interaction with the standard library's `error` type: `pErr.GoError()` would return an `error` that was sometimes just `pErr` itself (hidden behind an identical type); other times it was an `UnhandledRetryableError` wrapping `pErr`, and then of course sometimes it would be the structured error from which `pErr` was created in the first place. Thanks to `cockroachdb/errors`, there's a much better option around, namely `errorspb.EncodedError` which has powerful capabilities to essentially carry any error across RPC boundaries (and notably does so "out of the box" for protobuf-backed errors) while retaining its original structure, as long as the error type is known on both sides of the RPC. The simplifications that result from this primitive would be enough to motivate this change in itself, but additionally after this change we will be able to reliably detect context cancellations, etc, without relying on error-prone and insecure string matching. Going into more detail, this commit makes the following changes: - adds an `EncodedError` to `Error` - deprecates the following fields on `Error`, which are henceforth deduced from the `EncodedError` where present: - `transaction_restart` - `message` - `detail` These fields are populated for the time being so that 20.2 nodes understand the errors emitted by 21.1 nodes. In 21.2, we can drop them all. - deprecates `(*Error).SetDetail`, which has only one real caller anyway (and that caller will be able to replace EncodedError instead once 21.2 comes around). - makes `(*Error).SafeFormat` redact properly based on `EncodedError` (yay!), though there's a larger TODO that's left for a follow-up change (around the fact that error details can depend on the surrounding Error for rendering, particularly relevant when SetTxn is called) We can capitalize on EncodedError before the 21.2 cycle, but this is left for future PRs. Fixes cockroachdb#54939. Release note: None
aec0059
to
93ce5d0
Compare
bors r=knz |
Build failed (retrying...): |
Build succeeded: |
roachpb: use EncodedError in Error
roachpb.Error
has long been poor to work with. It could only transportstructured errors for a hard-coded enum message (
ErrorDetail
) and hada complex and circuituous interaction with the standard library's
error
type:pErr.GoError()
would return anerror
that wassometimes just
pErr
itself (hidden behind an identical type); othertimes it was an
UnhandledRetryableError
wrappingpErr
, and thenof course sometimes it would be the structured error from which
pErr
was created in the first place.
Thanks to
cockroachdb/errors
, there's a much better option around,namely
errorspb.EncodedError
which has powerful capabilities toessentially carry any error across RPC boundaries (and notably does so
"out of the box" for protobuf-backed errors) while retaining its
original structure, as long as the error type is known on both sides of
the RPC.
The simplifications that result from this primitive would be enough to
motivate this change in itself, but additionally after this change we
will be able to reliably detect context cancellations, etc, without
relying on error-prone and insecure string matching.
Going into more detail, this commit makes the following changes:
EncodedError
toError
Error
, which are henceforthdeduced from the
EncodedError
where present:transaction_restart
message
detail
These fields are populated for the time being so that 20.2 nodes
understand the errors emitted by 21.1 nodes. In 21.2, we can drop
them all.
(*Error).SetDetail
, which has only one real calleranyway (and that caller will be able to replace EncodedError instead
once 21.2 comes around).
(*Error).SafeFormat
redact properly based onEncodedError
(yay!), though there's a larger TODO that's left for a follow-up
change (around the fact that error details can depend on the
surrounding Error for rendering, particularly relevant when SetTxn
is called)
We can capitalize on EncodedError before the 21.2 cycle, but this
is left for future PRs.
Fixes #54939.
Release note: None