Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a single projection for multiple column masks #14420
Create a single projection for multiple column masks #14420
Changes from all commits
bd22fd7
ec2c794
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the TODO and test name is incorrect. I would simply remove TODO and name test to
testMultipleColumnMasks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change in semantics. Looking at this, I realize we never settled the question of why we'd ever have multiple masks on a single column: #11654 (comment). That still doesn't make a lot of sense to me. cc @kokosing @ksobolew
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can have multiple system access controls and each of them can return mask for a given column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are the masks expected to be applied in such a case? There's no guarantee about the order in which the system access controls are invoked.
Also, you're talking about multiple system access controls, but the SystemAccessControl interface allows a single access control interface to return multiple masks for a given column. That does not make much sense -- if only one mask is to be applied, the interface should return a single mask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are multiple situations in which more than one mask may apply to a single column. The simplest example is when there are two roles enabled and each of them applies a different column mask on a particular column. This immediately raises the question of ordering, of course. While it looks like Trino does not make such guarantees explicitly, the ordering is in practice deterministic as long as the access controls return them in a deterministic order. The masks and filters are returned and processed as
List
s, and this implies ordering. The engine has no trouble dealing with this and applies the masks iteratively.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That still doesn't make a lot of sense to me. Masks are defined in terms of the values in the column. For example, if column is a string that contains only numbers, one might implement a mask function by parsing the number, doing some math (a hash, etc) on it and converting it back to a string. If the the engine applies the function on anything other than the original values (e.g., because another mask got injected in between), the mask will fail. This makes it hard for someone writing a mask function to reason about, as it requires non-local understanding of how every role, user and mask relate to each other.
If multiple masks are possible, it should be up to the connector to decide which one is the most appropriate and go with it. In the case of multiple system access controls returning masks, there are two options:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this now fail with
.hasMessageMatching("Multiple masks on a single column is not supported: column");
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That only happens after the second commit. Perhaps they should be reordered, so that the prohibition on multiple masks goes first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, it does pass in the CI apparently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see - the test is using
TestingAccessControlManager
, which replaces the actualAccessControlManager
and does not check for multiple masks.