Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: don't include RHS data in merge trigger #28661

Merged
merged 1 commit into from
Aug 16, 2018

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Aug 15, 2018

Now that we require ranges to be collocated during a merge and the RHS
replicas to be up-to-date before the merge commits, we no longer need to
include a snapshot of the RHS in the merge trigger. We know that the
copy of the data that already exists in the local store is perfectly
up-to-date.

So, stop sending the data in the merge trigger.

Release note: None

@benesch benesch requested review from bdarnell, tbg and a team August 15, 2018 20:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Now that we require ranges to be collocated during a merge and the RHS
replicas to be up-to-date before the merge commits, we no longer need to
include a snapshot of the RHS in the merge trigger. We know that the
copy of the data that already exists in the local store is perfectly
up-to-date.

So, stop sending the data in the merge trigger.

Release note: None
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/batcheval/cmd_end_transaction.go, line 1010 at r1 (raw file):

	// copy from the RHS are the keys in the abort span, and we've already
	// accounted for those stats above.
	ms.Add(merge.RightMVCCStats)

If you fudge up something in these stats computations, which tests start to fail?

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 10 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Copy link
Contributor Author

@benesch benesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r=bdarnell,tschottdorf

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/batcheval/cmd_end_transaction.go, line 1010 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

If you fudge up something in these stats computations, which tests start to fail?

TestStoreRangeMergeStats, primarily. I haven't concocted a scenario yet which it fails to catch.

craig bot pushed a commit that referenced this pull request Aug 16, 2018
23885: kv: evict leaseholder on RPC error r=solongoron a=tschottdorf

This addresses a situation in which we would not evict a stale
leaseholder for a long time. Consider the replicas [s1,s2,s3] and
s1 is down but is the cached leaseholder, while s2 is the actual lease
holder. The RPC layer will try s1, get an RPC error, try s2 and succeed.
Since there is no NotLeaseHolderError involved, the cache would not get
updated, and so every request pays the overhead of trying s1 first.

WIP because needs testing.

Touches #23601.

Release note (bug fix): Improve request routing during node outages.

28609: opt: Make additional perf improvements r=andy-kimball a=andy-kimball

Make several more fixes:

1. Do not qualify column names in metadata, since that
requires expensive string formatting up-front (also cleanup
the factoring of this code, which had gotten messy).
2. Inline Metadata into Memo.
3. Inline logicalPropsBuilder into the Memo.

Together, these changes improve KV perf from:
```
Phases/kv-read/OptBuild  18.4µs ± 1%
```
to:
```
Phases/kv-read/OptBuild  17.8µs ± 1%
```

28661: storage: don't include RHS data in merge trigger r=bdarnell,tschottdorf a=benesch

Now that we require ranges to be collocated during a merge and the RHS
replicas to be up-to-date before the merge commits, we no longer need to
include a snapshot of the RHS in the merge trigger. We know that the
copy of the data that already exists in the local store is perfectly
up-to-date.

So, stop sending the data in the merge trigger.

Release note: None

28689: sqlbase: avoid using SERIAL in system tables r=knz a=knz

Needed for  #28575.

We'll soon want special behavior for SERIAL. We can't afford the
definition of system tables to be subject to a discussion about what
SERIAL means. So this patch ensures system tables don't use SERIAL.

Release note: None

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Aug 16, 2018

Build succeeded

@craig craig bot merged commit bb7426a into cockroachdb:master Aug 16, 2018
craig bot pushed a commit that referenced this pull request Aug 16, 2018
28684: storage: make subsumption atomic with merge commit application r=bdarnell,tschottdorf a=benesch

~Please review the first commit separately in #28661.~

During a merge, the subsumed range logically ceases to exist at the
moment that the merge transaction commit applies. It's important that we
remove the subsumed range from disk in the same batch as the merge
commit to protect against an ill-timed crash. Tease apart
Replica.destroyRaftMuLocked to make this possible.

Release note: None

28707: server: don't drain when decommissioning r=nvanbenschoten a=tschottdorf

Prior to this commit, a server which enters decommissioning would
automatically drain. This wasn't a great idea since it meant that
pointing the decommissioning command at a set of nodes required
for the cluster to work would brick that cluster, and would be
difficult to recover from (since the draining state is picked
up by the nodes when they start).

Instead, decouple draining from decommissioning. Decommissioning simply
tells the allocator to move data off that node; when the node gets shut
down cleanly (the operator's responsiblity), it will drain.

The resulting behavior when trying to decommission a too-large set of
nodes is now that the decommission command will simply not finish.
For example, starting a three node cluster and decommissioning all
nodes will simply hang (though the nodes will be marked as
decommissioning). Add three more nodes to the cluster and replicas
will move over to the newly added nodes, and the decommissioning
command will finish.

As a bonus, recommissioning a node now doesn't require the target
node to restart.

As a second bonus, the decommissioning acceptance test now takes
around 60% of the previous time (~45s down from 70+).

One remaining caveat is that users may forget that they attempted
to decommission nodes. We need to check that we prominently alert
in the UI when nodes are decommissioning. This isn't a new problem.

Fixes #27444.
Fixes #27025.

Release note (bug fix): decommissioning multiple nodes is now possible
without posing a risk to cluster health. Recommissioning a node does
no longer require a restart of the target nodes to take effect.

Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@benesch benesch deleted the merge-efficiency branch August 17, 2018 01:44
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Aug 29, 2019
Prior to cockroachdb#28661, we used to include a snapshot of the RHS in the merge
trigger. Due to this, we declared a read only span over the RHS data
range. Now that we require ranges to be collocated during merges, we
didn't need to include the snapshot nor declare the span.

Orthogonal to this are Subsume requests that block out concurrent
commands to the RHS, with the appropriate span declarations contained
therein.

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Sep 3, 2019
Prior to cockroachdb#28661, we used to include a snapshot of the RHS in the merge
trigger. Due to this, we declared a read only span over the RHS data
range. Now that we require ranges to be collocated during merges, we
didn't need to include the snapshot nor declare the span.

Orthogonal to this are Subsume requests that block out concurrent
commands to the RHS, with the appropriate span declarations contained
therein.

Release note: None
craig bot pushed a commit that referenced this pull request Sep 3, 2019
40261: storage: remove span declaration for merges r=irfansharif a=irfansharif

Prior to #28661, we used to include a snapshot of the RHS in the merge
trigger. Due to this, we declared a read only span over the RHS data
range. Now that we require ranges to be collocated during merges, we
didn't need to include the snapshot nor declare the span. 
Also informs #32583.

Orthogonal to this are Subsume requests that block out concurrent
commands to the RHS, with the appropriate span declarations contained
therein.

Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Nov 11, 2019
Prior to cockroachdb#28661, we used to include a snapshot of the RHS in the merge
trigger. Due to this, we declared a read only span over the RHS data
range. After we required ranges to be collocated during merges, we
didn't need to include the snapshot, so now now longer need to declare
the span.

Orthogonal to this are Subsume requests that block out concurrent
commands to the RHS, with the appropriate span declarations contained
therein.

Release note: None
craig bot pushed a commit that referenced this pull request Nov 11, 2019
41036: storage: remove span declaration for merges r=ajwerner a=irfansharif

I'm just re-sending #40261 (reverted in #40446). These changes are 
intended for 20.1 and will not be merged until after branching.

---

Prior to #28661, we used to include a snapshot of the RHS in the merge
trigger. Due to this, we declared a read only span over the RHS data
range. Now that we require ranges to be collocated during merges, we
didn't need to include the snapshot nor declare the span.

Orthogonal to this are Subsume requests that block out concurrent
commands to the RHS, with the appropriate span declarations contained
therein.

Release justification: N/A
Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants