-
Notifications
You must be signed in to change notification settings - Fork 668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes visibility problems with dependency propagation #7028
Conversation
Previously even if we choose BEGIN;
CREATE SCHEMA s1; -- switches to sequential mode
SET citus.create_object_propagation TO 'deferred';
CREATE SCHEMA s2; -- deferred not respected
CREATE TABLE s2.tbl(id serial);
SELECT create_distributed_table('s2.tbl','id');
ERROR: schema "s2" does not exist
CONTEXT: while executing command on localhost:9701 Solution: I think this type of usage should not be common. I mean if user really needs the GUC, they put it just after the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new insights from #6614 this change makes sense.
Before the only (identified) downside of propagating dependencies was that the user loses performance benefits of citus' parallel execution. Now with the metadatasync always being enabled we also found a correctness issue, for which we need deferred propagation.
With this new insight the assumption described in the old comment doesn't hold anymore.
Thanks for addressing.
Tests need some love IMHO
3fa33f6
to
23c03a2
Compare
93f8dcb
to
39b69fa
Compare
67f3e2e
to
e8af9a1
Compare
e8af9a1
to
4b7ecf7
Compare
b9b2174
to
a768115
Compare
36186c1
to
a8467e5
Compare
Codecov Report
@@ Coverage Diff @@
## main #7028 +/- ##
==========================================
- Coverage 93.20% 92.97% -0.24%
==========================================
Files 274 274
Lines 59240 59324 +84
==========================================
- Hits 55217 55156 -61
- Misses 4023 4168 +145 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR looks pretty good. I have left some comments around the subtransaction sub system. I will need to test a bit more with that, as do your tests :P. I wanted to have a look at the sub transaction tests to understand better how they behave, but I don't see any added tests that seem to work with nested transactions.
Since one of my comments addresses the nested nature of subtransactions and an inconsistency I think it is paramount to have some test coverage of the codepaths there.
@@ -51,12 +51,6 @@ PostprocessCreateDistributedObjectFromCatalogStmt(Node *stmt, const char *queryS | |||
return NIL; | |||
} | |||
|
|||
/* check creation against multi-statement transaction policy */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you elaborate on the commment why we are doing this, instead of stating what we are doing.
The code tells the same, would be great if a reader here understands why we are checking against a coordinated transaction instead.
/* | ||
* Track the propagation of the distributed table and its sequences in the current | ||
* transaction. | ||
*/ | ||
TrackPropagatedTable(relationId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment tells the same as the function name, we can remove such comment
* in the current subtransaction. | ||
*/ | ||
void | ||
TrackPropagatedTable(Oid relationId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Givent he comment above, I wonder if we should add AndSequences
to the function name.
It is quite specific, and I think due to the sequences it is implemented as its own function. This makes the comment on the invocation site even less required and the code more understandable.
* propagated in the current transaction. | ||
*/ | ||
bool | ||
HasAnyDepInPropagatedObjects(const ObjectAddress *objectAddress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasAnyDepInPropagatedObjects(const ObjectAddress *objectAddress) | |
HasAnyDependencyInPropagatedObjects(const ObjectAddress *objectAddress) |
/* | ||
* We can propagate dependencies via the current user's metadata connection if | ||
* any dependency for the target is created in the current transaction. Our assumption | ||
* is that if we can find a dependency created in the current transaction, then current | ||
* user, most probably, has permissions to create the target object as well. Note | ||
* that, user still may not be able to create the target due to no permissions for | ||
* any of the dependencies. But this is ok since it should be rare. If we opted to | ||
* use outside transaction, then there would be visibility issue on outside | ||
* transaction as we propagated objects via metadata connection and they are invisible | ||
* to outside transaction until we locally commit. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great comment!
I do think we should be a bit more prescriptive in using should
or need to
instead of we can
. It is not that we can, but that it is important we do so.
/* | |
* We can propagate dependencies via the current user's metadata connection if | |
* any dependency for the target is created in the current transaction. Our assumption | |
* is that if we can find a dependency created in the current transaction, then current | |
* user, most probably, has permissions to create the target object as well. Note | |
* that, user still may not be able to create the target due to no permissions for | |
* any of the dependencies. But this is ok since it should be rare. If we opted to | |
* use outside transaction, then there would be visibility issue on outside | |
* transaction as we propagated objects via metadata connection and they are invisible | |
* to outside transaction until we locally commit. | |
*/ | |
/* | |
* We need to propagate dependencies via the current user's metadata connection if | |
* any dependency for the target is created in the current transaction. Our assumption | |
* is that if we rely on a dependency created in the current transaction the current | |
* user, most probably, has permissions to create the target object as well. Note | |
* that, user still may not be able to create the target due to no permissions for | |
* any of its dependencies. But this is ok since it should be rare. If we opted to | |
* use outside transaction, then there would be visibility issue on outside | |
* transaction as we propagated objects via metadata connection and they are invisible | |
* to outside transaction until we locally commit. | |
*/ |
@@ -261,6 +269,7 @@ InitializeTransactionManagement(void) | |||
8 * 1024, | |||
8 * 1024, | |||
8 * 1024); | |||
InitTransactionPropagatedObjectsHash(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason to invoke this initialization from an other initializer, instead of adding the call to _PG_init()
All Initialize...
functions called there initialize exactly 1 thing, without nested initilize calls. Would be great to keep that semantic.
PropagatedObjectsInTx = hash_create("Tx Propagated Objects", 1024, | ||
&info, hashFlags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You initialize with 1024 expected elements, any reason for this specific number. It seems quite excessive for most operations.
As per comment on hash_create
:
* Note: for a shared-memory hashtable, nelem needs to be a pretty good
* estimate, since we can't expand the table on the fly. But an unshared
* hashtable can be expanded on-the-fly, so it's better for nelem to be
* on the small side and let the table grow if it's exceeded. An overly
* large nelem will penalize hash_seq_search speed without buying much.
Would it make sense to reduce this to something like 16?
{ | ||
hash_search(PropagatedObjectsInTx, objectAddress, HASH_ENTER, NULL); | ||
} | ||
hash_delete_all(state->propagatedObjects); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that;
- this needs to happen irregardless of the transaction committing. We are popping the subtransaction from the stack and destroying its contents. There is no need to keep the entries when we aborted, which is what we currently do
- we don't only want to delete all entries, but also destroy the hash table. So instead of
hash_delete_all
I think we wanthash_destroy
/* | ||
* keep subtransaction's propagated objects at toplevel transaction | ||
* if subtransaction committed. | ||
*/ | ||
if (commit) | ||
{ | ||
HASH_SEQ_STATUS propagatedObjectsSeq; | ||
hash_seq_init(&propagatedObjectsSeq, state->propagatedObjects); | ||
ObjectAddress *objectAddress = NULL; | ||
while ((objectAddress = hash_seq_search(&propagatedObjectsSeq)) != NULL) | ||
{ | ||
hash_search(PropagatedObjectsInTx, objectAddress, HASH_ENTER, NULL); | ||
} | ||
hash_delete_all(state->propagatedObjects); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of this block and the rest of the subtransaction sub system is that sub transactions are nested.
My assumption is that if an inner sub transaction commits, but its outer subtransaction aborts the earlier commited subtransaction becomes void as well.
With that in mind the block above seems to be a bit strange. We always move object addresses from a subtransaction into the root transaction on commit. This means that non-existing object addresses can be in the root transaction. Although this should not cause problems as object addresses should not be reused.
It seems strange we distinguish between commit and abort in this code it there are still routes where non-committed object addresses could end up in the root transaction.
I would suggest to either;
- document why it is ok to have non-exisiting object addresses in the root transaction and always add them there.
- if we want to go the subtransaction route, move the object addresses to the parent subtransaction here, instead of to the root. The inner most subtransaction on the stack would finally need to move the addresses to the root.
Now there is a subtile inconsistency in what the datastructures mean, and what we intend to do, compared to what we are actually doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I intended to move objects to parent transaction when the subtransaction commits. Refactored the code.
@@ -774,6 +785,7 @@ PushSubXact(SubTransactionId subId) | |||
SubXactContext *state = palloc(sizeof(SubXactContext)); | |||
state->subId = subId; | |||
state->setLocalCmds = activeSetStmts; | |||
state->propagatedObjects = CreateSubtransactionPropagatedObjectsHash(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given not every subtransaction will actually do DDL commands, actually I expect very little to do so, would it make sense to lazily initialize this hash table when we start recording propagated objects.
We could treat a NULL
as an empty set of propagated objects and simply skip iterating them.
8e017b2
to
984cfca
Compare
} | ||
|
||
int nestingLevel = list_length(activeSubXactContexts); | ||
if (nestingLevel == 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably be a bit defensive here, given the rest of the function expects the list to be of length >= 2
if (nestingLevel == 1) | |
if (nestingLevel <= 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also would suggest that the check above can be removed, since an empty list (NIL
) will return 0 as its length, being caught by this check as well and returning the same value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great progress, one more thing, around the reuse of the toplevel hash table. This could keep a lot of memory occupied if the session ever did a big ddl transaction (eg. migrations).
Lastly, could you update the PR description now that we keep the GUC?
PropagatedObjectsInTx = hash_create("Tx Propagated Objects", 16, | ||
&info, hashFlags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we allocate exactly one hash table for the length of the backend, I wonder if this could keep an excessive reservation on memory if a session is used for a transaction performing a lot of ddl commands, eg. a migration.
On its own this wouldn't cause issues, however if this is done via a pooler a very sizable table could be kept in memory quite long.
How complex do you think it would be to default the upper tx based hash table to a NULL
pointer and only create a table when a transaction involves (starts tracking object addresses) ddl commands that we want to track>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
- track propagated objects during transaction and decide if we should use coordinated transaction or outside transaction accordingly during dependency creation.
…reated in current transaction
…still can be useful.
…ransaction, \ - adds mmore tests with nested transactions.
…ns when we have data to store in them
50715c2
to
f6ccec1
Compare
**Problem:** Previously we always used an outside superuser connection to overcome permission issues for the current user while propagating dependencies. That has mainly 2 problems: 1. Visibility issues during dependency propagation, (metadata connection propagates some objects like a schema, and outside transaction does not see it and tries to create it again) 2. Security issues (it is preferrable to use current user's connection instead of extension superuser) **Solution (high level):** Now, we try to make a smarter decision on whether should we use an outside superuser connection or current user's metadata connection. We prefer using current user's connection if any of the objects, which is already propagated in the current transaction, is a dependency for a target object. We do that since we assume if current user has permissions to create the dependency, then it can most probably propagate the target as well. Our assumption is expected to hold most of the times but it can still be wrong. In those cases, transaction would fail and user should set the GUC `citus.create_object_propagation` to `deferred` to work around it. **Solution:** 1. We track all objects propagated in the current transaction (we can handle subtransactions), 2. We propagate dependencies via the current user's metadata connection if any dependency is created in the current transaction to address issues listed above. Otherwise, we still use an outside superuser connection. DESCRIPTION: Fixes some object propagation errors seen with transaction blocks. Fixes #6614 --------- Co-authored-by: Nils Dijk <nils@citusdata.com>
Problem:
Previously we always used an outside superuser connection to overcome permission issues for the current user while propagating dependencies. That has mainly 2 problems:
Solution (high level):
Now, we try to make a smarter decision on whether should we use an outside superuser connection or current user's metadata connection. We prefer using current user's connection if any of the objects, which is already propagated in the current transaction, is a dependency for a target object. We do that since we assume if current user has permissions to create the dependency, then it can most probably propagate the target as well.
Our assumption is expected to hold most of the times but it can still be wrong. In those cases, transaction would fail and user should set the GUC
citus.create_object_propagation
todeferred
to work around it.Solution:
DESCRIPTION: Fixes some object propagation errors seen with transaction blocks.
Fixes #6614