Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: don't apply disallow_full_table_scans to internal executors #137961

Merged
merged 1 commit into from
Jan 3, 2025

Conversation

michae2
Copy link
Collaborator

@michae2 michae2 commented Dec 24, 2024

We have various checks for disallow_full_table_scans:

  1. in connExecutor we fail the statement with an error if it uses a full scan
  2. in execbuild we fail the statement with an error if there is an index hint that causes a full scan
  3. in the coster we add hugeCost to full scans, to try and avoid them

We don't want disallow_full_table_scans to apply to anything done by internal executors. (1) was already checking whether this connExecutor was internal. (2) and (3) were not, which led to #137404. #137681 fixed this by setting disallow_full_table_scans in NewInternalSessionData, but we only use NewInternalSessionData for some uses of the internal executor.

This commit explicitly checks SessionData().Internal in (2) and (3) to match (1), so that we don't get any of the disallow_full_table_scans behavior for any use of the internal executor, including populating virtual tables.

This commit also adds SessionData().Internal to the memo staleness check.

This behavior can be observed with the following diff, which adds an index hint to the crdb_internal.system_jobs virtual table SQL:

diff --git a/pkg/sql/crdb_internal.go b/pkg/sql/crdb_internal.go
index 80a5342b56a..99ccb8f1611 100644
--- a/pkg/sql/crdb_internal.go
+++ b/pkg/sql/crdb_internal.go
@@ -975,7 +975,7 @@ SELECT
 DISTINCT(id), status, created, payload.value AS payload, progress.value AS progress,
 created_by_type, created_by_id, claim_session_id, claim_instance_id, num_runs, last_run, job_type
 FROM
-system.jobs AS j
+system.jobs@jobs_status_created_idx AS j
 LEFT JOIN system.job_info AS progress ON j.id = progress.job_id AND progress.info_key = 'legacy_progress'
 INNER JOIN system.job_info AS payload ON j.id = payload.job_id AND payload.info_key = 'legacy_payload'
 `

With this diff, the following SQL hits (2) and fails with an error, even though it uses the internal executor and thus should not error:

SET disallow_full_table_scans = on;
SELECT * FROM crdb_internal.jobs;

With this commit we no longer fail, matching the behavior without the index hint.

Informs: #137404
Informs: #123783

Release note: None

@michae2 michae2 requested review from spilchen, rytaft, yuzefovich and a team December 24, 2024 01:32
@michae2 michae2 requested a review from a team as a code owner December 24, 2024 01:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2, @rytaft, and @spilchen)


-- commits line 15 at r1:
Right, nice find! Should we revert #137681 then?


pkg/sql/opt/memo/memo.go line 202 at r1 (raw file):

	unsafeAllowTriggersModifyingCascades       bool
	legacyVarcharTyping                        bool
	internal                                   bool

nit: even though it seems like we use the corresponding proto names for variables here, I wonder whether for SessionData().Internal we should deviate from that and use something more descriptive, like internalExecutor.

Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also added a test.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rytaft, @spilchen, and @yuzefovich)


-- commits line 15 at r1:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Right, nice find! Should we revert #137681 then?

I'm ok with keeping #137681. I think it communicates the intent better than this PR, and I wasn't planning on backporting this one. I just want the various disallow_full_table_scans checks to match each other.


pkg/sql/opt/memo/memo.go line 202 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: even though it seems like we use the corresponding proto names for variables here, I wonder whether for SessionData().Internal we should deviate from that and use something more descriptive, like internalExecutor.

I tried, but this created a surprising amount of code churn, so it didn't seem worth it.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test! :lgtm:

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2, @rytaft, and @spilchen)


-- commits line 15 at r1:

Previously, michae2 (Michael Erickson) wrote…

I'm ok with keeping #137681. I think it communicates the intent better than this PR, and I wasn't planning on backporting this one. I just want the various disallow_full_table_scans checks to match each other.

Ack, makes sense.

@michae2 michae2 force-pushed the more_dfts_followup branch from 912b025 to 1919e7b Compare January 2, 2025 21:41
We have various checks for disallow_full_table_scans:
1. in connExecutor we fail the statement with an error if it uses a full
   scan
2. in execbuild we fail the statement with an error if there is an index
   hint that causes a full scan
3. in the coster we add hugeCost to full scans, to try and avoid them

We don't want disallow_full_table_scans to apply to anything done by
internal executors. (1) was already checking whether this connExecutor
was internal. (2) and (3) were not, which led to cockroachdb#137404. cockroachdb#137681 fixed
this by setting disallow_full_table_scans in `NewInternalSessionData`,
but we only use `NewInternalSessionData` for some uses of the internal
executor.

This commit explicitly checks `SessionData().Internal` in (2) and (3) to
match (1), so that we don't get any of the disallow_full_table_scans
behavior for any use of the internal executor, including populating
virtual tables.

This commit also adds `SessionData().Internal` to the memo staleness
check.

This behavior can be observed with the following diff, which adds an
index hint to the `crdb_internal.system_jobs` virtual table SQL:

```
diff --git a/pkg/sql/crdb_internal.go b/pkg/sql/crdb_internal.go
index 80a5342b56a..99ccb8f1611 100644
--- a/pkg/sql/crdb_internal.go
+++ b/pkg/sql/crdb_internal.go
@@ -975,7 +975,7 @@ SELECT
 DISTINCT(id), status, created, payload.value AS payload, progress.value AS progress,
 created_by_type, created_by_id, claim_session_id, claim_instance_id, num_runs, last_run, job_type
 FROM
-system.jobs AS j
+system.jobs@jobs_status_created_idx AS j
 LEFT JOIN system.job_info AS progress ON j.id = progress.job_id AND progress.info_key = 'legacy_progress'
 INNER JOIN system.job_info AS payload ON j.id = payload.job_id AND payload.info_key = 'legacy_payload'
 `
```

With this diff, the following SQL hits (2) and fails with an error, even
though it uses the internal executor and thus should not error:

```
SET disallow_full_table_scans = on;
SELECT * FROM crdb_internal.jobs;
```

With this commit we no longer fail, matching the behavior without the
index hint.

Informs: cockroachdb#137404
Informs: cockroachdb#123783

Release note: None
@michae2 michae2 force-pushed the more_dfts_followup branch from 1919e7b to 458460d Compare January 2, 2025 23:16
@michae2
Copy link
Collaborator Author

michae2 commented Jan 3, 2025

TFTR!

bors r=yuzefovich

@craig craig bot merged commit a7fb9cc into cockroachdb:master Jan 3, 2025
22 checks passed
@michae2 michae2 deleted the more_dfts_followup branch January 3, 2025 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants