Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AUTHZ] Add PaimonDataSourceV2RelationTableExtractor to fix paimon select cannot get database's problem #5592

Closed
wants to merge 3 commits into from

Conversation

davidyuan1223
Copy link
Contributor

@davidyuan1223 davidyuan1223 commented Oct 31, 2023

Why are the changes needed?

This pr is try to fix the #5590 's problem, but i get some problem, so need someone to help me. The problem desc:

this is the source commt #5590
i try to add PaimonDataSourceV2RelationTableExtractor in TableExtractor to make we can fix the question, but when i add the extractor and get table info, this is the root call extrcator code

case class RuleApplyRowFilter(spark: SparkSession) extends RuleHelper {
    override def apply(plan: LogicalPlan): LogicalPlan = {
        val newPlan = mapChildren(plan) {
            case p: RowFilterMarker => p
            case scan if isKnownScan(scan) && scan.resolved =>
                val tables = getScanSpec(scan).tables(scan, spark)
                tables.headOption.map(applyFilter(scan, _)).getOrElse(scan)
            case other => apply(other)
        }
        newPlan
    }

    private def applyFilter(
            plan: LogicalPlan,
            table: Table): LogicalPlan = {
        val are = AccessResource(ObjectType.TABLE, table.database.orNull, table.table, null)
        val art = AccessRequest(are, ugi, QUERY, AccessType.SELECT)
        val filterExpr = SparkRangerAdminPlugin.getFilterExpr(art).map(parse)
        val filtered = filterExpr.foldLeft(plan)((p, expr) => Filter(expr, RowFilterMarker(p)))
        filtered
    }
}

if we add two exector for the RelationV2 plan, we will get two tables, but the apply code only use the first table
if the tables is

Table(Some(paimon_catalog),None,table1,Some(admin))
Table(Some(paimon_catalog),Some(paimon_ns),table1,Some(admin))

the source code will only filter the first table result, is that right?

@bowenliang123 @pan3793 do you know how to solve this?

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

…'s AppendOnlyFileTable so that we can get database from this
@davidyuan1223 davidyuan1223 changed the title test [AUTHZ] Add PaimonDataSourceV2RelationTableExtractor to fix paimon select cannot get database's problem Nov 5, 2023
@davidyuan1223 davidyuan1223 reopened this Nov 5, 2023
@bowenliang123
Copy link
Contributor

bowenliang123 commented Nov 5, 2023

Cool. This would be important for Paimon support.

@davidyuan1223
Copy link
Contributor Author

Cool. This would important for Paimon support.

yes, i try to fix this, but the comment's code looks like is conflict with our update? The file scan_command_spec support list extractor

val tables = getScanSpec(scan).tables(scan, spark)
                tables.headOption.map(applyFilter(scan, _)).getOrElse(scan)

but the code only filter the first extractor's result

@davidyuan1223
Copy link
Contributor Author

davidyuan1223 commented Nov 5, 2023

未命名文件
This is the verify for createTableAs's select timeline picture, currently, the problem is:
We add a new TableExtractor for DataSourceV2Relation, when we scan the spec and call tables, it will call DataSourceV2RelationTableExtractor and PaimonDataSourceV2RelationTableExtractor, so they will return two result for tables.
And we will use the tables for a inputObjs, then add Request by the inputObjs, when we verify for the requestArray, anyone failed, the all application is failed.
I have a question, how can we make the verify not one failed make all failed? or how can we merge the two result from the same implementation's tableExtractor?
I think same plan's relation have many extractor is right,because the other ranger plugin may return different plan, but how can we get the right tableExtractor's result that we want?

@bowenliang123
Copy link
Contributor

I have a question, how can we make the verify not one failed make all failed? or how can we merge the two result from the same implementation's tableExtractor?

I don't quite understand your description or question here.
Extractors are targeted for extracting privilege objects which are then distinct before verified with policies in RuleAuthorization. Any violation surely failed the whole plan as all the privilege objects must be checked.

@davidyuan1223
Copy link
Contributor Author

I have a question, how can we make the verify not one failed make all failed? or how can we merge the two result from the same implementation's tableExtractor?

I don't quite understand your description or question here. Extractors are targeted for extracting privilege objects which are then distinct before verified with policies in RuleAuthorization. Any violation surely failed the whole plan as all the privilege objects must be checked.

because i debug in local env, i found the violation failed reason is the DataSourceV2RelationTableExtractor's targe, then it will not violation other, event though i get the paimon's right result, it also failed, how shoud i fix to make the violation success?

@davidyuan1223
Copy link
Contributor Author

i think paimon should not supported in the authz module currently, i have try to fix this problem, but the root cause is paimon, i have get database from paimon's path
image

But the test case also failed
image
@bowenliang123 @pan3793

@bowenliang123
Copy link
Contributor

I agree with you. This is an impact from upstream. Authz requires proper identifier from logical plan for privilege checks.

@davidyuan1223
Copy link
Contributor Author

I agree with you. This is an impact from upstream. Authz requires proper identifier from logical plan for privilege checks.

yes, maybe we need close the issue #5470

@bowenliang123
Copy link
Contributor

cc @SteNicholas

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants