Skip to content

ttljob: wire up range_concurrency and cleanup query construction#76487

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
otan-cockroach:concurrency
Feb 17, 2022
Merged

ttljob: wire up range_concurrency and cleanup query construction#76487
craig[bot] merged 3 commits intocockroachdb:masterfrom
otan-cockroach:concurrency

Conversation

@otan
Copy link
Contributor

@otan otan commented Feb 14, 2022

See individual commits for details.

@otan otan requested a review from rafiss February 14, 2022 03:59
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@otan otan mentioned this pull request Feb 14, 2022
30 tasks
@otan otan force-pushed the concurrency branch 2 times, most recently from a6aa49a to 842db73 Compare February 14, 2022 07:19
numNonExpiredRows: 5,
},
{
desc: "one column pk, concurrent",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file was a move from ttljob_integration_test.go; only the concurrent tests are new

@otan otan marked this pull request as ready for review February 14, 2022 19:54
@otan otan requested a review from a team February 14, 2022 19:54
@otan otan force-pushed the concurrency branch 4 times, most recently from 825c261 to 22eb8de Compare February 15, 2022 20:34
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor nits

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @rafiss)


pkg/sql/paramparse/paramobserver.go, line 323 at r6 (raw file):

		onReset: func(po *TableStorageParamObserver, evalCtx *tree.EvalContext, key string) error {
			if po.tableDesc.RowLevelTTL != nil {
				po.tableDesc.RowLevelTTL.RangeConcurrency = 0

should it reset to 1? (and if so, needs a test)


pkg/sql/ttl/ttljob/ttljob.go, line 110 at r6 (raw file):

		// early as the delete will not work.
		if desc.GetModificationTime().GoTime().After(aost.Time) {
			return errors.Newf("found a recent schema change on the table, aborting")

is there any use in including the timestamps in the error message?


pkg/sql/ttl/ttljob/ttljob.go, line 142 at r6 (raw file):

		startPK, endPK tree.Datums
	}
	ch := make(chan rangeToProcess)

should this channel be size-capped at the level of concurrency?


pkg/sql/ttl/ttljob/ttljob.go, line 178 at r6 (raw file):

	if err := func() error {
		defer close(ch)

i feel like it might be nicer to not have to call g.Wait in two separate places. what about

func() (retErr error) {
  defer func() { retErr = errors.CombineErrors(retErr, g.Wait()) }()
  defer close(ch)
....

pkg/sql/ttl/ttljob/ttljob.go, line 183 at r6 (raw file):

				return err
			}
			var r rangeToProcess

is r shadowed here on purpose? if not, would be good to use some descriptive names


pkg/sql/ttl/ttljob/ttljob_query_builder.go, line 36 at r6 (raw file):

	aost            tree.DTimestampTZ

	// isFirst is true if we have no invoked a query using the builder yet.

nit: no->not


pkg/sql/ttl/ttljob/ttljob_query_builder.go, line 40 at r6 (raw file):

	// queryCache is the cached query, which stays the same from the second
	// iteration onwards.
	queryCache string

nit: maybe rename to cachedQuery and cachedArgs? the phrasing now makes it seem like they are caches, but they actually are cached data


pkg/sql/ttl/ttljob/ttljob_query_builder_test.go, line 271 at r6 (raw file):

	}

	for _, tc := range testCases {

nice tests

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @rafiss)


pkg/sql/exec_util.go, line 1479 at r6 (raw file):

	// MockDescriptorVersionDuringDelete is a version to mock the delete descriptor
	// as during delete.
	MockDescriptorVersionDuringDelete *descpb.DescriptorVersion

do these two knobs need to be pointers? (or is that a convention?)

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/exec_util.go, line 1479 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

do these two knobs need to be pointers? (or is that a convention?)

it's to differentiate do not set vs run and set. i've seen it in a few knobs.


pkg/sql/paramparse/paramobserver.go, line 323 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

should it reset to 1? (and if so, needs a test)

nah it's 0. if it is 0, we use the cluster setting default.


pkg/sql/ttl/ttljob/ttljob.go, line 110 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

is there any use in including the timestamps in the error message?

Done.


pkg/sql/ttl/ttljob/ttljob.go, line 142 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

should this channel be size-capped at the level of concurrency?

Done.


pkg/sql/ttl/ttljob/ttljob.go, line 178 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i feel like it might be nicer to not have to call g.Wait in two separate places. what about

func() (retErr error) {
  defer func() { retErr = errors.CombineErrors(retErr, g.Wait()) }()
  defer close(ch)
....

that's more prone to bugs imo. if retErr is nil, CombinedErrors will be nil even if g.Wait() is not nil!


pkg/sql/ttl/ttljob/ttljob.go, line 183 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

is r shadowed here on purpose? if not, would be good to use some descriptive names

Done. (Yes but descriptive names are good)


pkg/sql/ttl/ttljob/ttljob_query_builder_test.go, line 271 at r6 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nice tests

danke!

This commit adds the ability to configure how many ranges to scan at
once for each TTL job.

Release note: None
@otan otan force-pushed the concurrency branch 2 times, most recently from acc5fdd to 64a8ce6 Compare February 16, 2022 19:33
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! still an optional nit on CombineErrors usage

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @rafiss)


pkg/sql/ttl/ttljob/ttljob.go, line 178 at r6 (raw file):

Previously, otan (Oliver Tan) wrote…

that's more prone to bugs imo. if retErr is nil, CombinedErrors will be nil even if g.Wait() is not nil!

that's how Wrap works. CombineErrors does what we want

// CombineErrors returns err, or, if err is nil, otherErr.
// if err is non-nil, otherErr is attached as secondary error.

we rely on this pattern a lot. e.g.

defer func(it sqlutil.InternalRows) { retErr = errors.CombineErrors(retErr, it.Close()) }(it)

otan added 2 commits February 17, 2022 17:12
This commit changes the TTL job to run one range at a time, with the
option of running with concurrency.

We also split the query generation into builders, which caches query
strings which may be continually re-used as well as re-using the same
buffer space for arg allocations. This saves some CPU time.

Release note: None
@otan
Copy link
Contributor Author

otan commented Feb 17, 2022

done

bors r=rafiss

@craig
Copy link
Contributor

craig bot commented Feb 17, 2022

Canceled.

@otan
Copy link
Contributor Author

otan commented Feb 17, 2022

bors r=rafiss

@craig
Copy link
Contributor

craig bot commented Feb 17, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 17, 2022

Build succeeded:

@craig craig bot merged commit ce9a343 into cockroachdb:master Feb 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments