sql: session-scoped temporary objects do not work with tenant clusters #67401
Alright, so this should proceed in two phases. Both should only affect secondary tenants, so it's okay to work on this at this point.
cockroach/pkg/sql/temporary_schema.go Lines 477 to 479 in dbd810e
Then we need to orchestrate which node is going to do the work. There are a few options here. One is to not coordinate at all and just run the loop rarely and with a lot of jitter. I think that would be totally fine, and it's my preferred approach. Any contention should sort itself out. If we did want to coordinate, we could use a singleton job. See
The second one went further and created a scheduled job to do the cleanup. That could be cool. I don't think it's important.
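The uncoordinated option above (run the loop rarely, with a lot of jitter) could look roughly like the following. This is only a sketch, not the code in `temporary_schema.go`: `jitteredInterval`, `runCleanupLoop`, and the `cleanup` callback are hypothetical names, and the base interval and jitter fraction are left to the caller.

```go
package main

import (
	"context"
	"math/rand"
	"time"
)

// jitteredInterval scales base by a factor in [1-frac, 1+frac].
// u is a uniform sample in [0, 1); it is passed in so the function
// stays deterministic and easy to test. (Hypothetical helper.)
func jitteredInterval(base time.Duration, frac, u float64) time.Duration {
	factor := 1 - frac + 2*frac*u
	return time.Duration(float64(base) * factor)
}

// runCleanupLoop calls cleanup repeatedly until ctx is cancelled,
// sleeping for a freshly jittered interval before every run. With a
// long base interval and a large frac, pods that start at the same
// time drift apart, so they rarely attempt the cleanup simultaneously.
// (Hypothetical helper; cleanup stands in for the real cleaner.)
func runCleanupLoop(
	ctx context.Context,
	base time.Duration,
	frac float64,
	cleanup func(context.Context) error,
) {
	for {
		timer := time.NewTimer(jitteredInterval(base, frac, rand.Float64()))
		select {
		case <-ctx.Done():
			timer.Stop()
			return
		case <-timer.C:
		}
		// Errors are dropped in this sketch; the next iteration retries.
		_ = cleanup(ctx)
	}
}
```

Nothing here prevents two pods from running the cleanup at once; the sketch relies on the assumption stated above that any such contention sorts itself out.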
So, in conclusion, the work item for 1. is to hook up the temporary cleaner for secondary tenants and make sure it works under the assumption that it's the only tenant. Then, for 2., we hook up the RPCs to ensure that we don't trash data that a live session is using. When hooking up those RPCs, we need to be rather careful that we have a complete view of the set of nodes. We may need to add a new uncached accessor for the set of instances. That set of instances is exposed by cockroach/pkg/sql/sqlinstance/sqlinstance.go Lines 34 to 39 in dbd810e
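To make the phase 2 concern concrete, here is a rough sketch of the safety check: a temporary schema is only treated as orphaned once every known instance has reported its live sessions. All names (`SessionID`, `SessionLister`, `orphanedSchemas`) are hypothetical, the lister stands in for whatever RPC ends up being used, and none of this reflects the actual `sqlinstance` API.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// SessionID identifies a SQL session. (Hypothetical type.)
type SessionID string

// SessionLister returns the live sessions on one SQL instance.
// It stands in for an RPC to that instance. (Hypothetical type.)
type SessionLister func(instanceID int) ([]SessionID, error)

// orphanedSchemas returns, in sorted order, the temporary schemas whose
// owning session is not live on any instance. schemaOwners maps a
// schema name to the session that created it.
//
// If any instance cannot be queried, the view of live sessions is
// incomplete, so the function returns an error instead of a list:
// deleting on partial information could trash a live session's data.
func orphanedSchemas(
	instances []int,
	list SessionLister,
	schemaOwners map[string]SessionID,
) ([]string, error) {
	// Assumption: the caller is itself an instance, so an empty
	// instance set means the view is stale or wrong.
	if len(instances) == 0 {
		return nil, errors.New("no instances known; refusing to clean up")
	}
	live := make(map[SessionID]struct{})
	for _, id := range instances {
		sessions, err := list(id)
		if err != nil {
			return nil, fmt.Errorf("listing sessions on instance %d: %w", id, err)
		}
		for _, s := range sessions {
			live[s] = struct{}{}
		}
	}
	var orphaned []string
	for schema, owner := range schemaOwners {
		if _, ok := live[owner]; !ok {
			orphaned = append(orphaned, schema)
		}
	}
	sort.Strings(orphaned)
	return orphaned, nil
}
```

The check is only as good as the instance list passed in, which is why an uncached accessor matters: an instance missing from a stale list would make its sessions look dead.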
Fixes: cockroachdb#67401 Previously, the temporary table cleanup did not execute for tenants. This was inadequate because temporary tables would outlive the user sessions that created them. To address this, this patch adds support for cleaning up on a single tenant pod, specifically by removing the checks for the meta1 lease when running under a tenant and by adding support for listing sessions. Release justification: low risk and fixes a tenant-related bug Release note (bug fix): Temporary tables were not properly cleaned up for tenants.
69486: sql: support for temporary table clean up for tenants r=fqazi a=fqazi

Fixes: #67401

Previously, the temporary table cleanup did not execute for tenants. This was inadequate because temporary tables would outlive the user sessions that created them. To address this, this patch adds support for cleaning up on a single tenant pod, specifically by removing the checks for the meta1 lease when running under a tenant and by adding support for listing sessions.

Release justification: low risk and fixes a tenant-related bug

Release note (bug fix): Temporary tables were not properly cleaned up for tenants.

69789: sql: drop database cascade can fail resolving schemas r=fqazi a=fqazi

Fixes: #69713

Previously, drop database cascade would drop the schemas first and then drop any objects under those schemas. This was inadequate because, as we look up the objects under the schemas, we may need to resolve types, which may lead to a lookup on the schema. If the schema has already been dropped, resolving those types fails. To address this, this patch drops the objects under the schema first followed by the database.

Release justification: Low risk bug fix for drop database cascade

Release note (bug fix): Drop database cascade can fail while resolving a schema in certain scenarios with the following error: "ERROR: error resolving referenced table ID <ID>: descriptor is being dropped"

69975: sql: fix interaction between stmt bundles and tracing r=yuzefovich a=yuzefovich

Previously, we wouldn't generate the bundle if verbose tracing was already enabled on the cluster, because we wouldn't call `instrumentationHelper.Finish`, where we actually generate the bundle. This would result in empty responses for `EXPLAIN ANALYZE (DEBUG)`, as well as requests for stmt diagnostics being stuck in the "waiting" state.

Fixes: #69398.

Release note (bug fix): Previously, if tracing (the `sql.trace.txn.enable_threshold` cluster setting) was enabled on the cluster, statement diagnostics collection (`EXPLAIN ANALYZE (DEBUG)`) wouldn't work. This is now fixed.

Release justification: low-risk fix to a long-standing bug.

Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Reopening for backport.
Repro:
Notice that there's already a warning in the log, despite not having created any temporary objects:
Now create a temporary table in the SQL CLI:
While still in the CLI, kill the SQL tenant process with ctrl-C. Then start it up again. Back in the SQL CLI:
EXPECTED: The temp table should be gone, since it was tied to the previous session, which got terminated.
ACTUAL: The temp table is still showing:
Even if we decide not to fix this right now, we need to at least disable temporary object creation in tenant clusters and stop spamming the logs (the logs fill up with the warning message even when temp objects are never used). That change needs to be back-ported to 21.1.x.
Also, here's @ajwerner's commentary from Slack: