consistent lookup vindex #4861

Merged 12 commits on May 10, 2019

@sougou sougou commented May 9, 2019

This implements #4855.

Design decisions

Beyond the high-level approach described in the RFC, there are a few additional design decisions I had to make to get this feature working:

  • planbuilder and engine should be mostly unaffected: Since the planbuilder and engine are going to be growing in complexity, I didn't want them to take on the burden of coordinating commit order and row checks. This required me to implement the entire logic within the consistent lookup vindex itself. To achieve this, I had to extend the VCursor API to let the Vindex do things that were previously not possible, specifically: issuing Pre and Post transaction statements, and targeting a shard by its keyspace id (ExecuteKeyspaceID).
  • A Vindex can optionally implement the WantOwnerInfo interface. If it does, the vschema builder will give it the info it needs to verify whether the main row is present.
  • Parallel commits: With the new lookup, there is actually no need to follow a commit order, which we previously did, in order to prefer committing the lookup rows first. However, some people still rely on this order, at least while they're using the older lookups. So, commits have been parallelized for pre and post transactions, but the main transaction retains its old behavior of committing in DML order. Once we get everyone migrated to the consistent lookup, we can change this behavior to be all parallel.
  • Removed the deprecated legacy mode support for transactions. No one uses this any more, and it was complicating the number of states to consider for this feature. Also simplified other flows within session to allow it to accommodate the commit order feature.
  • Fixed some bugs in partial DML detection. Previously, any statement within a transaction was marked as modifying data. This made us forcibly roll back transactions as partially executed even if they had only performed reads. This was not interacting well with this feature. So, I fixed the behavior to fail only if actual partial DMLs were executed.
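
The commit ordering described in the "Parallel commits" item can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `shardSession`, `session`, `recorder`, `commitParallel`, `commitInOrder`, and `commitAll` are hypothetical names, and stopping at the first failed phase is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// shardSession is a hypothetical stand-in for one open shard transaction.
type shardSession struct{ shard string }

// session keeps three independent lists, one per commit order.
type session struct {
	pre, normal, post []shardSession
}

// recorder is a fake committer that remembers the order of commits.
type recorder struct {
	mu    sync.Mutex
	order []string
}

func (r *recorder) commit(ss shardSession) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.order = append(r.order, ss.shard)
	return nil
}

// commitParallel commits every entry concurrently and returns the
// first error found when scanning the results by index.
func commitParallel(list []shardSession, commit func(shardSession) error) error {
	errs := make([]error, len(list))
	var wg sync.WaitGroup
	for i, ss := range list {
		wg.Add(1)
		go func(i int, ss shardSession) {
			defer wg.Done()
			errs[i] = commit(ss)
		}(i, ss)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// commitInOrder commits one entry at a time, in slice (DML) order.
func commitInOrder(list []shardSession, commit func(shardSession) error) error {
	for _, ss := range list {
		if err := commit(ss); err != nil {
			return err
		}
	}
	return nil
}

// commitAll runs the three phases: pre in parallel, then the normal
// sessions sequentially, then post in parallel.
func commitAll(s *session, commit func(shardSession) error) error {
	if err := commitParallel(s.pre, commit); err != nil {
		return err
	}
	if err := commitInOrder(s.normal, commit); err != nil {
		return err
	}
	return commitParallel(s.post, commit)
}

func main() {
	s := &session{
		pre:    []shardSession{{"lookup/-80"}},
		normal: []shardSession{{"user/-80"}, {"user/80-"}},
		post:   []shardSession{{"lookup/80-"}},
	}
	r := &recorder{}
	if err := commitAll(s, r.commit); err != nil {
		fmt.Println("commit failed:", err)
		return
	}
	fmt.Println(r.order)
}
```

Within the pre and post phases the relative order of commits is not defined; only the normal phase preserves the order in which the sessions were added.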

Implementation

Based on the above decisions, there are changes in the following areas:

  • Session: Clean up and extend to support commit ordering. Change the transaction commit function to honor this order.
  • VSchema: Define the WantOwnerInfo interface and have the builder set the owner info if a vindex implements it.
  • VCursor: Implement Pre, Post and ExecuteKeyspaceID functions. Also accurately detect partial DML execution.
  • Consistent Lookup Vindexes: Implementation of the actual vindex.
  • New endtoend test for VTGate using VTCombo: This will also act as a framework for eventually porting the vtgatev3 python tests.
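
The optional-interface pattern behind WantOwnerInfo can be illustrated with a small sketch. Only the interface name comes from this PR description; the `SetOwnerInfo` method and its signature, the simplified `Vindex` interface, and the `setOwner` helper are assumptions made for illustration.

```go
package main

import "fmt"

// Vindex is a minimal stand-in for the real vindex interface.
type Vindex interface {
	String() string
}

// WantOwnerInfo is the optional interface. The method name and its
// signature are assumed here, not taken from the PR.
type WantOwnerInfo interface {
	SetOwnerInfo(keyspace, table string, cols []string) error
}

// hashVindex does not care about its owner table.
type hashVindex struct{ name string }

func (h *hashVindex) String() string { return h.name }

// consistentLookup needs owner info to check for the main row later.
type consistentLookup struct {
	name       string
	keyspace   string
	ownerTable string
	ownerCols  []string
}

func (c *consistentLookup) String() string { return c.name }

func (c *consistentLookup) SetOwnerInfo(keyspace, table string, cols []string) error {
	if len(cols) == 0 {
		return fmt.Errorf("vindex %s: owner table %s has no columns", c.name, table)
	}
	c.keyspace, c.ownerTable, c.ownerCols = keyspace, table, cols
	return nil
}

// setOwner is what a builder could do: hand over owner info only to
// vindexes that opt in through the optional interface.
func setOwner(v Vindex, keyspace, table string, cols []string) error {
	if w, ok := v.(WantOwnerInfo); ok {
		return w.SetOwnerInfo(keyspace, table, cols)
	}
	return nil
}

func main() {
	vindexes := []Vindex{
		&hashVindex{name: "hash"},
		&consistentLookup{name: "name_idx"},
	}
	for _, v := range vindexes {
		if err := setOwner(v, "user_ks", "user", []string{"name", "id"}); err != nil {
			fmt.Println(err)
		}
	}
	fmt.Printf("%+v\n", vindexes[1])
}
```

The type assertion keeps existing vindexes untouched: a vindex that does not implement the interface is simply skipped.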

sougou added 10 commits May 5, 2019 12:15
SafeSession and underlying session always non-nil.
Append makes session not autocommitable.
Remove legacy mode.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Added two new shard sessions for commit ordering:
pre and post.

Added API to set the commit order and changed
tx conn to honor it.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Add the ability to request the owner column names, which
will be used to check for presence of owner row by the vindex.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Some people may still rely on the sequential commit order
of the normal transactions. So, it's better not to parallelize
that part.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou sougou requested review from demmer, deepthi and rafael May 9, 2019 06:41
newSession.Autocommit = true
newSession.Warnings = nil
Member

Why are the warnings cleared here?

Contributor Author

This is used as a private session within the context of an outer session, specifically for vindex autocommit. The warnings of the outer one are untouched. After this execution finishes, we could consider appending any new warnings generated back to the outer session. But that may or may not be right. We should look at some real use cases before making this decision.

Maybe we should rename this to NewInnerSession?

Member

@demmer demmer left a comment

Overall this looks sound to me.

A couple of high-level comments:

It would have been good to have separated out the legacyAutocommit deprecation and scattered bugfixes into a separate PR to ease the review and improve clarity.

Overall I am a bit concerned about over-promising magic by using the name "consistent_lookup". At the end of the day there's a tradeoff between potentially leaving dangling rows in the lookup table (like this one does) or potentially missing rows (like the other one does), but there's no perfect way to ensure consistency when we have best effort distributed transactions as the underlying primitive (ignoring 2PC of course).

So... maybe we should name this variant something like covering_lookup or super_lookup or something like that? Something to be explicit about how it might fail.

@@ -740,7 +740,9 @@ func (stc *ScatterConn) multiGo(
oneShard := func(rs *srvtopo.ResolvedShard, i int) {
var err error
startTime, statsKey := stc.startAction(name, rs.Target)
defer stc.endAction(startTime, allErrors, statsKey, &err, nil)
// Send a dummy session.
// TODO(sougou): plumb a real session through this call.
Member

What are the implications here? I'm a bit confused about where the multiGo parts actually come into play?

Contributor Author

I previously made a dumb decision that allowed a SafeSession to be nil, which required me to check for it on every method. It turns out that this is the only use case where the nil value is used. So, I changed it to send an empty session which means the same thing.

The comment is a reminder to look up the stack to see if the real session could be plumbed through. However, I suspect that we'll just remove this code when we get rid of V2.
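
The "always non-nil" idea can be sketched like this. The types below are simplified, hypothetical stand-ins (the real session is a proto message), so treat it as an illustration of the pattern rather than the actual change:

```go
package main

import "fmt"

// Session is a hypothetical stand-in for the underlying session message.
type Session struct {
	InTransaction bool
}

// SafeSession wraps a Session. The constructor guarantees that the
// wrapped pointer is never nil, so methods need no nil checks.
type SafeSession struct {
	session *Session
}

// NewSafeSession substitutes an empty Session when given nil, which
// means the same thing as "no session".
func NewSafeSession(s *Session) *SafeSession {
	if s == nil {
		s = &Session{}
	}
	return &SafeSession{session: s}
}

// InTransaction can dereference freely because of the constructor.
func (ss *SafeSession) InTransaction() bool {
	return ss.session.InTransaction
}

func main() {
	dummy := NewSafeSession(nil) // the "dummy session" case
	fmt.Println(dummy.InTransaction())

	active := NewSafeSession(&Session{InTransaction: true})
	fmt.Println(active.InTransaction())
}
```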

return err
}

// Retain backward compatibility on commit order for the normal session.
Member

It seems to me we should be able to mark the session as to whether or not we can commit in parallel -- if it's only the legacy vindexes that care about the ordering, then maybe we should flag the session when adding a transaction id whether or not the commit order matters?

Contributor Author

Yeah. I want to do this as a separate feature.

return vc.executeByOrder(method, query, bindVars, isDML, commitOrderPost)
}

func (vc *vcursorImpl) executeByOrder(method string, query string, bindVars map[string]*querypb.BindVariable, isDML bool, co commitOrder) (*sqltypes.Result, error) {
Member

Instead of widening the VCursor interface into Execute, ExecutePre, and ExecutePost, it seems cleaner to make this the signature for a combined VindexExecute function.

This solves a couple problems IMO -- one is that callers already need to know Execute vs ExecutePre vs ExecutePost, and so exposing them all in one function instead of three seems reasonable.

The second is that I have always found it confusing that the VCursor interface defines a recursive Execute method that is called via the Executor only through the vindex functions. Calling it VindexExecute or ExecuteForVindex or some such would be clearer.

Member

Moreover we could consider folding ExecuteAutocommit in here as well.

Contributor Author

I went this route because we otherwise introduce a circular dependency on commitOrder: vtgate calls engine, which calls into vindexes, and the vindex calls back into VCursor (as interface).

We could move a commitOrder to vindexes and export it. Do you think that will still remain readable?

Or, I just thought of this: define it in the proto. I think proto looks like the best place.

@@ -91,9 +93,9 @@ func (lkp *lookupInternal) Verify(vcursor VCursor, ids, values []sqltypes.Value)
var err error
var result *sqltypes.Result
if lkp.Autocommit {
- result, err = vcursor.ExecuteAutocommit("VindexVerify", lkp.ver, bindVars, true /* isDML */)
+ result, err = vcursor.ExecuteAutocommit("VindexVerify", lkp.ver, bindVars, false /* isDML */)
Member

This looks like it was just a bug found during the course of doing this cleanup?

Contributor Author

Yeah. But it became significant for the feature: these failures would mark the transaction as partially complete and abort the transaction.
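
To show why the isDML flag matters, here is a rough sketch of partial DML tracking. It assumes the flag is recorded when a statement marked as DML succeeds and that a later error forces a rollback only if the flag is set; `cursor`, `execute`, `mustRollback`, `ok`, and `fail` are hypothetical names, not the PR's code.

```go
package main

import (
	"errors"
	"fmt"
)

// cursor tracks whether any statement flagged as DML has succeeded.
type cursor struct {
	hasPartialDML bool
}

// execute runs one statement and records a successful DML.
func (c *cursor) execute(isDML bool, run func() error) error {
	err := run()
	if err == nil && isDML {
		c.hasPartialDML = true
	}
	return err
}

// mustRollback reports whether an error leaves partially applied
// writes behind, in which case the transaction has to be aborted.
func (c *cursor) mustRollback(err error) bool {
	return err != nil && c.hasPartialDML
}

func ok() error   { return nil }
func fail() error { return errors.New("duplicate entry") }

func main() {
	// A verify (read) followed by a failing statement: nothing was
	// written, so the transaction can stay open.
	reads := &cursor{}
	_ = reads.execute(false, ok)
	fmt.Println(reads.mustRollback(reads.execute(true, fail)))

	// The same verify wrongly flagged as DML: the later failure now
	// looks like a partial write and forces a rollback.
	flagged := &cursor{}
	_ = flagged.execute(true, ok)
	fmt.Println(flagged.mustRollback(flagged.execute(true, fail)))
}
```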


func (vc *vcursorImpl) executeByOrder(method string, query string, bindVars map[string]*querypb.BindVariable, isDML bool, co commitOrder) (*sqltypes.Result, error) {
vc.safeSession.SetCommitOrder(co)
defer vc.safeSession.SetCommitOrder(commitOrderNormal)
Member

Although I'm not too worried about this, the way this is implemented seems like a potentially dangerous pattern since we need to match these two. It's not that different from Lock/Unlock, but an alternative might be to use a facade:

The underlying implementation would still have the three arrays of ShardSession of course, but instead of a state variable indicating which one should be used, the facade would override and put txns in the right list based on the order.

Then the Executor would take a Session interface instead of directly taking a safeSession object, and we'd call into it with something like

qr, err := vc.executor.Execute(vc.ctx, method, NewSessionForCommitOrder(vc.safeSession, co), vc.marginComments.Leading+query+vc.marginComments.Trailing, bindVars)

On balance, it's probably not worth it but figured I'd just share.
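
For readers trying to picture the facade alternative, a minimal sketch follows. It is not code from the PR: the `Session` interface, the `Append` method, and the string-based shard lists are simplified assumptions; only `NewSessionForCommitOrder` and the commit order names echo the comment above.

```go
package main

import "fmt"

type commitOrder int

const (
	commitOrderNormal commitOrder = iota
	commitOrderPre
	commitOrderPost
)

// Session is the narrow interface the executor would accept.
type Session interface {
	Append(shard string)
}

// safeSession owns the three lists; shards are plain strings here.
type safeSession struct {
	pre, normal, post []string
}

// Append on the bare session records into the normal list.
func (s *safeSession) Append(shard string) {
	s.normal = append(s.normal, shard)
}

// orderedSession is the facade: it carries the commit order itself,
// so no mutable "current order" field has to be set and reset.
type orderedSession struct {
	inner *safeSession
	co    commitOrder
}

// NewSessionForCommitOrder wraps a session for one execution.
func NewSessionForCommitOrder(s *safeSession, co commitOrder) Session {
	return &orderedSession{inner: s, co: co}
}

// Append routes the shard to the list matching the facade's order.
func (o *orderedSession) Append(shard string) {
	switch o.co {
	case commitOrderPre:
		o.inner.pre = append(o.inner.pre, shard)
	case commitOrderPost:
		o.inner.post = append(o.inner.post, shard)
	default:
		o.inner.normal = append(o.inner.normal, shard)
	}
}

func main() {
	s := &safeSession{}
	NewSessionForCommitOrder(s, commitOrderPre).Append("lookup/-80")
	s.Append("user/-80")
	NewSessionForCommitOrder(s, commitOrderPost).Append("lookup/80-")
	fmt.Println(s.pre, s.normal, s.post)
}
```

The trade-off is the one raised in the thread: the facade removes the paired set/reset calls, at the cost of an extra interface and wrapper type.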

Contributor Author

This approach would have totally made sense if ShardSessions was a struct. Being a slice requires wrapping it, which makes the whole thing awkward.

case commitOrderPre:
session.PreSessions = append(session.PreSessions, shardSession)
case commitOrderPost:
session.PostSessions = append(session.PostSessions, shardSession)
Member

why isn't this:
session.PostSessions = append(shardSession, session.PostSessions)

Contributor Author

The three session variables are completely independent of each other. You could potentially have two transactions going to the same shard, one from each pool. If so, the ones in the Pre session are committed first, and the ones from Post are committed last.

}

- if session.autocommitState == autocommittable && len(session.ShardSessions) == 0 {
+ if session.autocommitState == autocommittable {
Member

autocommittable is sometimes spelt with one t like in the func SetAutoCommitable. Do we want to change all instances to be consistent?

Contributor Author

I'll rename the function.

Contributor Author

@sougou sougou left a comment

I'll soon push fixes to address review comments.

As for naming, I would argue that it's consistent for the use case it's meant to fulfill. Our guidelines have always recommended that you treat it as a hidden index. If so, it fulfills all the criteria.

Ideally, calling it consistent is bad naming. We should eventually deprecate the old lookups and call this one lookup, because the consistency should be taken for granted.

The VCursor API is impressively simpler now.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>

sougou commented May 10, 2019

The VCursor API is impressively simpler now.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou sougou merged commit 948c251 into vitessio:master May 10, 2019
@sougou sougou deleted the ss-consistent-lookup branch May 11, 2019 16:41
3 participants