Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v3: instant-commit for autocommit #3559

Merged
merged 1 commit into from
Jan 24, 2018
Merged

Conversation

sougou
Copy link
Contributor

@sougou sougou commented Jan 17, 2018

If we are in autocommit mode and vtgate does not break a DML into smaller parts, then it has the opportunity to send that statement through to a vttablet as autocommit in a single round-trip.

Reviewer instructions:
@demmer: I'm not too sure about the vtexplain fixes, or if additional tests are required there. Extra scrutiny may be required there.

Implementation notes:

  • SafeSession has a state machine for tracking autocommit state.
  • The autocommit state is initialized by executor as needed.
  • VCursor API has been changed for ExecMultiShard. It now accept an extra canCommit flag that should be set to true if the engine is executing its final DML. This, combined with the autocommit state will decide if an instant autocommit is possible.

@@ -333,7 +333,8 @@ func (route *Route) execute(vcursor VCursor, bindVars, joinVars map[string]*quer
}

shardQueries := route.getShardQueries(route.Query, params)
result, err := vcursor.ExecuteMultiShard(params.ks, shardQueries, isDML)
// isFinal should be true only if it's a DML.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessarily obvious -- I think the point is that the only way to get here is for a single-shard (or unsharded) query and therefore isFinal is true in those cases.

Could we just set it to be unconditionally true here and then handle the case where isDML is false but isFinal is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the isFinal name is misleading. It should actually not be true if we're performing reads. This flag eventually reaches scatter-conn, and if it's set, it goes through the instant-commit code path, which is wasteful for reads.

I'll rename it to canCommit, which implies that it's relevant only for DMLs.

@@ -202,6 +212,8 @@ func (e *Executor) execute(ctx context.Context, safeSession *SafeSession, sql st
func (e *Executor) handleExec(ctx context.Context, safeSession *SafeSession, sql string, bindVars map[string]*querypb.BindVariable, target querypb.Target, logStats *LogStats) (*sqltypes.Result, error) {
if target.Shard != "" {
// V1 mode or V3 mode with a forced shard target
// TODO(sougou): change this flow to go through V3 functions
// which will allow us to benefit from the autocommitable flag.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even without routing through v3 it seems like we could make this change, no?

I agree that having something like PlanShardPassthrough would make sense and then we can reduce this duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver goes through V2 code, and it should not contain any reference to autocommit. Introducing such awareness there is risk-prone.

Yeah. The better approach will be to have an opcode for this in V3. We'll still need to distinguish reads from writes because of the canCommit flag (to be renamed from isFinal)

type autocommitState int

const (
notAutocommitable = autocommitState(iota)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit -- these should all be notAutocommittable, autocommittable, and autocommitted to match the actual english spelling of "committed" / "committable".

// AutocommitToken returns true if we can perform a single round-trip
// autocommit. If so, the caller is responsible for commiting their
// transaction.
func (session *SafeSession) AutocommitToken() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it clearer if this was simply named Autocommit() or AutocommitIfPossible(). The use of "Token" suggests that the session will allocate some kind of nonce or other token, not return a boolean and do a state transition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of CanAutocommit, or Autocommittable. The problem with these names is that they imply that it's a read-only function. In this case, the call causes the session to change state disallowing further transactional operations, and it essentially hands over the commit responsibility to the caller.

Can't think of a better name yet.

@@ -124,6 +189,8 @@ func (session *SafeSession) Reset() {
}
session.mu.Lock()
defer session.mu.Unlock()
session.mustRollback = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this mustRollback change a regression or somehow necessary only due to the autocommit changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added it for future-proofing (resetting all members to zero).

Sql: sql,
BindVariables: bindVariables,
}}
qrs, err := stc.gateway.ExecuteBatch(ctx, target, queries, true /* asTransaction */, 0, options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth capturing in a comment that our use of ExecuteBatch is a stopgap because there is not yet a good way to do autocommit with a regular Execute.

@@ -55,8 +54,6 @@ var masterSession = &vtgatepb.Session{
}

func init() {
flag.CommandLine.Parse([]string{}) // prevents glog "ERROR: logging before flag.Parse"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should go in as a separate PR.

@@ -342,7 +342,7 @@ func (t *fakeVcursor) Execute(method string, query string, bindvars map[string]*
panic("unimplemented")
}

func (t *fakeVcursor) ExecuteMultiShard(keyspace string, shardQueries map[string]*querypb.BoundQuery, isDML bool) (*sqltypes.Result, error) {
func (t *fakeVcursor) ExecuteMultiShard(keyspace string, shardQueries map[string]*querypb.BoundQuery, isDML, isFinal bool) (*sqltypes.Result, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like isFinal is non-obvious -- it seems better to use something more explicit like autocommitSafe which would capture things more clearly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was held in the GH Review but I mistakenly sent some as single comments. Nevertheless it seems like you agree and I think canCommit is a sensible name.

var err error

switch {
case autocommit:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tight interplay between the autocommit state and the various layers of notInTransaction and shouldBegin causes me a bit of concern here.

Would it maybe make more sense to move the autocommit check inside multiGoTransaction and then change the callback to handle the three possibilities more explicitly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiGoTransaction is shared with V2 code. So, we can't touch it. Which reminds me: I should add a comment that this is a V3 only function.

Fortunately, notInTransaction is a V2-only thing. We can assume that it's always true for V3.

So, the only interaction is with shouldBegin. In my initial iteration, I had the autocommit check inside the shouldBegin block, implying that autocommit can be relevant only if shouldBegin is true.

The downside of that approach: we obtain the 'token' to commit from the session. But then, what if shouldBegin is somehow false? In that case we don't commit anything. So, I thought it's better to make it obvious that autocommit will always happen if the flag is set.

I'm wondering if we should make a V3 version of multiGoTransaction instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think it would be great if the v3 code paths were a bit more isolated from v2 since we will occasionally continue to run into these issues where it's unclear what support is needed for which mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this some more. The multiGoTransaction function was a little too big to duplicate. So, I think it's worth reusing for now. Longer term, we'll get to clean this up when we deprecate v2.

Copy link
Member

@demmer demmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good -- I had some suggestions about style / naming but other than that I think the overall approach looks good.

Copy link
Contributor Author

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responded to some. Will address the rest.

// AutocommitToken returns true if we can perform a single round-trip
// autocommit. If so, the caller is responsible for commiting their
// transaction.
func (session *SafeSession) AutocommitToken() bool {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of CanAutocommit, or Autocommittable. The problem with these names is that they imply that it's a read-only function. In this case, the call causes the session to change state disallowing further transactional operations, and it essentially hands over the commit responsibility to the caller.

Can't think of a better name yet.

@@ -124,6 +189,8 @@ func (session *SafeSession) Reset() {
}
session.mu.Lock()
defer session.mu.Unlock()
session.mustRollback = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added it for future-proofing (resetting all members to zero).

var err error

switch {
case autocommit:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiGoTransaction is shared with V2 code. So, we can't touch it. Which reminds me: I should add a comment that this is a V3 only function.

Fortunately, notInTransaction is a V2-only thing. We can assume that it's always true for V3.

So, the only interaction is with shouldBegin. In my initial iteration, I had the autocommit check inside the shouldBegin block, implying that autocommit can be relevant only if shouldBegin is true.

The downside of that approach: we obtain the 'token' to commit from the session. But then, what if shouldBegin is somehow false? In that case we don't commit anything. So, I thought it's better to make it obvious that autocommit will always happen if the flag is set.

I'm wondering if we should make a V3 version of multiGoTransaction instead.

@sougou sougou force-pushed the autocommit branch 2 times, most recently from 9d69d71 to 8cf4e68 Compare January 18, 2018 17:49
@sougou sougou changed the title WIP: instant-commit for autocommit v3: instant-commit for autocommit Jan 18, 2018
@sougou sougou requested review from demmer and rafael January 18, 2018 17:51
@sougou
Copy link
Contributor Author

sougou commented Jan 18, 2018

This is now ready for final review.

Copy link
Member

@demmer demmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking real close -- just a couple of minor changes and I think this is good to go.

@@ -82,11 +81,11 @@
},
{
"Time": 1,
"SQL": "insert into t1(id, intval, floatval) values (1, 2, 3.14)"
"SQL": "commit"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This output change isn't actually a correct reflection of the execution order, since the misguided attempt to improve test stability by sorting queries in the same time cycle had the downside of making this no longer reflect the logical order of statements.

This, among other things, is fixed in #3532 so I think it would be best to merge that first, then re-run the vtexplain tests to generate what should hopefully be a more logical output diff in which all I would expect here is the time to change from 2 to 1.

// to make sure that single round-trip commits are executed
// correctly whenever possible.

// TestAutocommitUpdateSharded: insant-commit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "insant-commit" => "instant-commit" throughout this file

testBatchQuery(t, "sbc2", sbc2, nil)

// This is the only test where we verify transaction count
if got, want := sbc1.AsTransactionCount.Get(), int64(1); got != want {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally like to see a call to testCommitCount(t, "sbc1", sbc1, 0) in all these cases where we expect instant-commit to hold.

It would make sure that any future changes in vtgate don't send a commit anyway even if AsTransaction was true in these cases.

@@ -176,6 +177,8 @@ func (stc *ScatterConn) ExecuteMultiShard(
shards = append(shards, shard)
}

autocommit := len(shards) == 1 && canCommit && session.AutocommitApproval()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is hard and annoying, but IMO canAutocommit would be clearer than canCommit for the flag that we pass down through the various layers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually not referring to the local variable autocommit here, but more the canCommit parameter that is plumbed down throughout various call sites and other places.

That (IMO) would be clearer if it was canAutocommit but it's a minor point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. There is a good reason why that is named 'canCommit'. It's futuristic: if an autocommit yields multiple statements within the same shard, then the final statement can be execute-committed immediately. Another round-trip saved.

If you look at the logic in engine, this meaning is upheld throughout the code. Will we ever use it that way? Only time will tell.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're debating semantics at this point but I would argue even in that case you are still autocommitting...

That saidl I now have actual code to write so this is the last I will speak of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point now. I'll rename it :).

var err error

switch {
case autocommit:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think it would be great if the v3 code paths were a bit more isolated from v2 since we will occasionally continue to run into these issues where it's unclear what support is needed for which mode.

Copy link
Contributor Author

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • rebased and fixed vtexplain tests
  • one response
  • rest of the comments addressed

var err error

switch {
case autocommit:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this some more. The multiGoTransaction function was a little too big to duplicate. So, I think it's worth reusing for now. Longer term, we'll get to clean this up when we deprecate v2.

If we are in autocommit mode and vtgate does not break a DML into
smaller parts, then it has the opportunity to send that statement
through to a vttablet as autocommit in a single round-trip.

Reviewer instructions:
@demmer: I'm not too sure about the vtexplain fixes, or if additional
tests are required there. Extra scrutiny may be required there.

Implementation notes:
* SafeSession has a state machine for tracking autocommit state.
* The autocommit state is initialized by executor as needed.
* VCursor API has been changed for ExecMultiShard. It now accepts
  an extra canCommit flag that should be set to true if the engine
  is executing its final DML. This, combined with the autocommit
  state will decide if an instant autocommit is possible.
@sougou sougou merged commit 3e39b05 into vitessio:master Jan 24, 2018
@sougou sougou deleted the autocommit branch January 25, 2018 13:29
systay pushed a commit that referenced this pull request Jul 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants