For discussion: rework error plumbing when proto.Call is involved #1403

tamird · 2015-06-17T18:59:24Z

We currently have an inconsistent error handling story where proto.Response is concerned (usually via proto.Call). Some grep of all the functions which handle these arguments (excluding tests) is below. Note that many of these have an explicit error return.

$ git grep -E 'proto\.(Call|Response)' | grep func | grep -v _test
client/batch.go:func (b *Batch) InternalAddCall(call proto.Call) {
client/db.go:func (db *DB) send(calls ...proto.Call) (err error) {
client/http_sender.go:func (s *httpSender) Send(_ context.Context, call proto.Call) {
client/http_sender.go:func (s *httpSender) post(call proto.Call) error {
client/rpc/rpc_sender.go:func (s *Sender) Send(_ context.Context, call proto.Call) {
client/sender.go:type SenderFunc func(context.Context, proto.Call)
client/sender.go:func (f SenderFunc) Send(ctx context.Context, c proto.Call) {
client/txn.go:func (ts *txnSender) Send(ctx context.Context, call proto.Call) {
client/txn.go:func (txn *Txn) send(calls ...proto.Call) error {
client/txn.go:func (txn *Txn) updateState(calls []proto.Call) {
kv/db.go:func createArgsAndReply(method string) (proto.Request, proto.Response) {
kv/db.go:func (s *rpcDBServer) executeCmd(args proto.Request, reply proto.Response) error {
kv/dist_sender.go:func (ds *DistSender) sendAttempt(desc *proto.RangeDescriptor, call proto.Call) (retry.Status, error) {
kv/dist_sender.go:func (ds *DistSender) getDescriptors(call proto.Call) (*proto.RangeDescriptor, *proto.RangeDescriptor, error) {
kv/dist_sender.go:func (ds *DistSender) Send(_ context.Context, call proto.Call) {
kv/local_sender.go:func (ls *LocalSender) Send(ctx context.Context, call proto.Call) {
kv/txn_coord_sender.go:func (tc *TxnCoordSender) Send(_ context.Context, call proto.Call) {
kv/txn_coord_sender.go:func (tc *TxnCoordSender) sendOne(call proto.Call) {
kv/txn_coord_sender.go:func (tc *TxnCoordSender) updateResponseTxn(argsHeader *proto.RequestHeader, replyHeader *proto.ResponseHeader) {
server/node.go:func (n *nodeServer) executeCmd(args proto.Request, reply proto.Response) error {
server/status/feed.go:func (nef NodeEventFeed) CallComplete(args proto.Request, reply proto.Response) {
storage/range.go:var TestingCommandFilter func(proto.Request, proto.Response) bool
storage/range.go:func (r *Range) AddCmd(ctx context.Context, call proto.Call, wait bool) error {
storage/range.go:func (r *Range) addAdminCmd(ctx context.Context, args proto.Request, reply proto.Response) error {
storage/range.go:func (r *Range) addReadOnlyCmd(ctx context.Context, args proto.Request, reply proto.Response) error {
storage/range.go:func (r *Range) addWriteCmd(ctx context.Context, args proto.Request, reply proto.Response, wait bool) error {
storage/range.go:func (r *Range) proposeRaftCommand(ctx context.Context, args proto.Request, reply proto.Response) (<-chan error, *pendingCmd) {
storage/range.go:func (r *Range) applyRaftCommand(ctx context.Context, index uint64, originNodeID proto.RaftNodeID, args proto.Request, reply proto.Response) error {
storage/range_command.go:func (r *Range) executeCmd(batch engine.Engine, ms *proto.MVCCStats, args proto.Request, reply proto.Response) ([]proto.Intent, error) {
storage/response_cache.go:func (rc *ResponseCache) GetResponse(cmdID proto.ClientCmdID, reply proto.Response) (bool, error) {
storage/response_cache.go:func (rc *ResponseCache) PutResponse(cmdID proto.ClientCmdID, reply proto.Response) error {
storage/response_cache.go:func (rc *ResponseCache) shouldCacheResponse(reply proto.Response) bool {
storage/store.go:func (s *Store) ExecuteCmd(ctx context.Context, call proto.Call) error {

Some methods return the error outright, some set it on the proto.Response directly, and some do a mix of the two. We should resolve this inconsistency because it leads to confusing code and a very high probability of subtle bugs. There are two approaches we could take:

standardize on `ResponseHeader.SetGoError()`

This requires that we remove error returns from all methods that operate on proto.Responses. In addition, it will mean that we will need to remember to check .GoError() each time we call such a method. This is fragile, and we will have no tool support.

standardize on error returns

This requires that we create a new structure to be obtained by unwrapping the proto.Response which will not be able to carry an error. All methods which currently operate on proto.Response (with the exception of those that live at process boundaries) will need to change to return explicit errors rather than attaching errors directly to their proto.Response argument.

This option is my preference. It provides for tooling support via errcheck and is more idiomatic. That being said, the floor is open. Please 🚲 🏡 !

The text was updated successfully, but these errors were encountered:

tbg · 2015-06-17T19:15:55Z

I don't think the first option is really there because it means that any error must be a proto error, on top of its other drawbacks.

In the second option, proto.Response will still carry an error (after all, the client needs to get an error sometimes) but we can make it a point to only set it in one (or a few) consistent places just before we're done with the request and I think it's a good idea to try to streamline error handling that way.

bdarnell · 2015-06-17T19:24:24Z

What do you mean by "unwrapping the proto.Response"? Currently a Response contains the response header (and error).

One other consideration is batches: in a batch, each response carries its own error (and the batch has a whole has an error which is defined to be the first non-nil error in the batch). This is more difficult to express with returned errors than with errors in the header (although maybe batches can be treated as something like a transport boundary)

tamird · 2015-06-17T19:28:20Z

What do you mean by "unwrapping the proto.Response"? Currently a Response contains the response header (and error).

I meant that the proto.Response is the wrapper, so we unwrap it to recover the headerless message, operate on that within the process, and then write the error to the header at the process boundary.

One other consideration is batches: in a batch, each response carries its own error (and the batch has a whole has an error which is defined to be the first non-nil error in the batch). This is more difficult to express with returned errors than with errors in the header (although maybe batches can be treated as something like a transport boundary)

I think this is orthogonal. We would treat each request/response in the batch as described, and do the batch special magic in a central place (process boundary anyway).

bdarnell · 2015-06-17T19:30:57Z

But proto.Response is not currently a wrapper; it's an interface implemented by all the concrete response type. We don't currently have a type corresponding to the headerless message. Are you proposing a refactoring where proto.Response is no longer an interface but a concrete message containing a ResponseHeader and a byte array (representing an encoded headerless response of an unspecified type)?

tamird · 2015-06-17T19:33:50Z

The end result should be that we can decompose the equivalent of today's proto.Response to the header and the "rest". I'm not ready to suggest a specific implementation yet.

tbg · 2015-06-17T19:46:15Z

Could simply have split Response into interfaces ResponseWithoutHeader and Header (depending on what's needed, and named more aptly) and pass that down to calls. Sure, one could type-assert back but who would.

bdarnell · 2015-06-17T19:50:03Z

ResponseWithoutHeader is just a gogoproto.Message which is pretty useless on its own; type-asserting it to a concrete type (PutResponse, etc) is the common case.

tamird · 2015-06-24T11:41:34Z

@bdarnell I think the point of ResponseWithoutHeader is to be a "marker" interface, so that the list of implementers is suitably short. We can probably just do this by requiring a Method method.

bdarnell · 2015-06-24T13:52:12Z

Yes, but how does that marker interface help? Any method that takes a ResponseWithoutHeader is going to convert it to a concrete type like PutResponse which has no protections against setting the error in the header.

tamird · 2015-06-24T13:53:23Z

I was thinking that Response would become a concrete struct containing ResponseWithoutHeader and Header, where each of {Get,Put,...}Response only implement ResponseWithoutHeader.

bdarnell · 2015-06-24T13:58:24Z

So you would change the protos themselves so that PutResponse and friends no longer contain a header field?

tamird · 2015-06-24T14:01:03Z

Yeah, I think that's the only way to get the properties described here

vivekmenezes · 2015-06-24T14:29:10Z

I think the proposal is to have Response contain a Header, and a oneof with all the different types, where all the different types themselves have a KVHeader that contains the keys, etc and not the error. That's a pretty big change.

Another idea is to just enforce that error is not set at the boundary in our code. Not enforced by the language but an easier change to make.

tamird · 2015-07-30T14:07:57Z

Update on this: the only remaining non-conformances are Sender.Send and most of the Batch API. On the Sender side there is an open question: what do we want the return values to be? We may inherently have both transmission and application errors arising from these methods; do we want to distinguish the two? If an error occurs, do we still want the reply returned?

$ git grep -F 'proto.Call' | grep -F func | grep -vF _test
client/batch.go:func (b *Batch) InternalAddCall(call proto.Call) {
client/db.go:func (db *DB) send(calls ...proto.Call) (err error) {
client/http_sender.go:func (s *httpSender) Send(_ context.Context, call proto.Call) {
client/rpc/rpc_sender.go:func (s *Sender) Send(_ context.Context, call proto.Call) {
client/sender.go:type SenderFunc func(context.Context, proto.Call)
client/sender.go:func (f SenderFunc) Send(ctx context.Context, c proto.Call) {
client/txn.go:func (ts *txnSender) Send(ctx context.Context, call proto.Call) {
client/txn.go:func endTxnCall(commit bool) proto.Call {
client/txn.go:func (txn *Txn) send(calls ...proto.Call) error {
client/txn.go:func (txn *Txn) updateState(calls []proto.Call) {
kv/dist_sender.go:func (ds *DistSender) getDescriptors(call proto.Call) (*proto.RangeDescriptor, *proto.RangeDescriptor, error) {
kv/dist_sender.go:func (ds *DistSender) Send(ctx context.Context, call proto.Call) {
kv/local_sender.go:func (ls *LocalSender) Send(ctx context.Context, call proto.Call) {
kv/txn_coord_sender.go:func (tc *TxnCoordSender) Send(ctx context.Context, call proto.Call) {
kv/txn_coord_sender.go:func (tc *TxnCoordSender) sendOne(ctx context.Context, call proto.Call) {

tamird · 2015-09-24T18:12:52Z

@tschottdorf we might want to extend this issue to the removal of errors from non-batch responses; what do you think?

tbg · 2015-09-24T20:06:55Z

Yes, I think that's a good idea.
On Sep 24, 2015 20:13, "Tamir Duberstein" notifications@github.com wrote:

@tschottdorf https://github.com/tschottdorf we might want to extend
this issue to the removal of errors from non-batch responses; what do you
think?

—
Reply to this email directly or view it on GitHub
#1403 (comment)
.

This was referenced Jun 17, 2015

Remove error return value from ExecuteCmd #1400

Closed

Add a SQLServer that returns http.StatusNotImplemented. #1476

Merged

tamird mentioned this issue Jul 10, 2015

Kill with fire: SetGoError #1661

Merged

tamird mentioned this issue Jul 21, 2015

Return (response, error) #5 #1750

Merged

tamird self-assigned this Jul 22, 2015

tamird added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Jul 22, 2015

tamird removed their assignment Sep 2, 2015

tamird assigned tbg Sep 24, 2015

tbg mentioned this issue Sep 29, 2015

own headers for BatchRe{quest,sponse} #2709

Merged

tbg closed this as completed in 2833a90 Sep 29, 2015

mrtracy mentioned this issue May 16, 2017

ui: Bring Logs page up to date. #13182

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

For discussion: rework error plumbing when proto.Call is involved #1403

For discussion: rework error plumbing when proto.Call is involved #1403

tamird commented Jun 17, 2015

tbg commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 17, 2015

tbg commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 24, 2015

bdarnell commented Jun 24, 2015

tamird commented Jun 24, 2015

bdarnell commented Jun 24, 2015

tamird commented Jun 24, 2015

vivekmenezes commented Jun 24, 2015

tamird commented Jul 30, 2015

tamird commented Sep 24, 2015

tbg commented Sep 24, 2015

For discussion: rework error plumbing when proto.Call is involved #1403

For discussion: rework error plumbing when proto.Call is involved #1403

Comments

tamird commented Jun 17, 2015

standardize on ResponseHeader.SetGoError()

standardize on error returns

tbg commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 17, 2015

tbg commented Jun 17, 2015

bdarnell commented Jun 17, 2015

tamird commented Jun 24, 2015

bdarnell commented Jun 24, 2015

tamird commented Jun 24, 2015

bdarnell commented Jun 24, 2015

tamird commented Jun 24, 2015

vivekmenezes commented Jun 24, 2015

tamird commented Jul 30, 2015

tamird commented Sep 24, 2015

tbg commented Sep 24, 2015

standardize on `ResponseHeader.SetGoError()`