Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AUTHZ] Support CreateTableAs command for Paimon #5590

Closed

Conversation

bowenliang123
Copy link
Contributor

@bowenliang123 bowenliang123 commented Oct 31, 2023

Why are the changes needed?

Support CreateTableAs command for Paimon

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

No.

@codecov-commenter
Copy link

codecov-commenter commented Oct 31, 2023

Codecov Report

Merging #5590 (4287e90) into master (c24e984) will increase coverage by 0.31%.
Report is 14 commits behind head on master.
The diff coverage is n/a.

❗ Current head 4287e90 differs from pull request most recent head c997b63. Consider uploading reports for the commit c997b63 to get more accurate results

@@             Coverage Diff              @@
##             master    #5590      +/-   ##
============================================
+ Coverage     61.35%   61.67%   +0.31%     
  Complexity       23       23              
============================================
  Files           600      603       +3     
  Lines         34316    35670    +1354     
  Branches       4501     4869     +368     
============================================
+ Hits          21054    21998     +944     
- Misses        11128    11290     +162     
- Partials       2134     2382     +248     

see 63 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@davidyuan1223
Copy link
Contributor

davidyuan1223 commented Oct 31, 2023

I want ask a question, why the admin to execute doAs in the next, don't use sql to execute, i found your code is doAs(admin, createTableAs), why not is doAs(admin,sql(createTableAs)).
And i found the DeltaCatalogRanger also use doAs(user,mySql),but the HudiCatalog use doAs(user,sql(mySql)).

doAs code is

  protected def doAs[T](user: String, f: => T): T = {
    UserGroupInformation.createRemoteUser(user).doAs[T](
      new PrivilegedExceptionAction[T] {
        override def run(): T = f
      })
  }

If we not use spark.sql to execute our sql, it's only a string format???

Which is the right way???

@pan3793 @wForget @yaooqinn could you give some advice

@yaooqinn
Copy link
Member

yaooqinn commented Nov 1, 2023

@davidyuan1223 I believe you are right, and @bowenliang123 made a mistake. Nice catch.

@davidyuan1223
Copy link
Contributor

davidyuan1223 commented Nov 1, 2023

@davidyuan1223 I believe you are right, and @bowenliang123 made a mistake. Nice catch.

And I found the DeltaCatalogRanger also is this code style, I think you need to check the authz module's test case to avoid this problem. This problem just like this comment, I want know why admin create table but it can not select from the table?

@bowenliang123
Copy link
Contributor Author

Sorry about that. Will fix and have further investigation.

@bowenliang123 bowenliang123 marked this pull request as draft November 1, 2023 03:53
@bowenliang123
Copy link
Contributor Author

bowenliang123 commented Nov 1, 2023

@davidyuan1223 More investigation in the logical plan is required to find the reason. Maybe missing fields in parsed privilege objects. Let me update the test case for Paimon first. And thanks for mentioning possible bug for Delta test suite as well.

@davidyuan1223
Copy link
Contributor

@davidyuan1223 More investigation in the logical plan is required to find the reason. Maybe missing fields in parsed privilege objects. Let me update the test case for Paimon first. And thanks for mentioning possible bug for Delta test suite as well.

thanks your reply, if you find the root cause about this bug, wish you can teach me how to fix this, so that I can complete the issue #5470

@bowenliang123
Copy link
Contributor Author

bowenliang123 commented Nov 1, 2023

@davidyuan1223 More investigation in the logical plan is required to find the reason. Maybe missing fields in parsed privilege objects. Let me update the test case for Paimon first. And thanks for mentioning possible bug for Delta test suite as well.

thanks your reply, if you find the root cause about this bug, wish you can teach me how to fix this, so that I can complete the issue #5470

Sure. Will ping you when I came to any conclusion in the reason. Thanks for creating a bug issue for it.

@bowenliang123
Copy link
Contributor Author

The initial reason is that missing database field in RangerAccessRequest, and the admin user is granted on db-table-column *-*-* resource rules.

image

@davidyuan1223
Copy link
Contributor

RangerAccessRequest

LGTM, I will debug in my env

@davidyuan1223
Copy link
Contributor

The initial reason is that missing database field in RangerAccessRequest, and the admin user is granted on db-table-column *-*-* resource rules.

image

this is the logic plan about createTableAs, looks like it's plan have some problem

CreateTableAsSelect org.apache.paimon.spark.SparkCatalog@4a5f435b, paimon_ns.table2, [provider=PAIMON], true
+- RelationV2[id#6, name#7, city#8] table1

@bowenliang123
Copy link
Contributor Author

Yes. Will have a dive-in debugging in extractors later.

@bowenliang123
Copy link
Contributor Author

Seems that there's a Paimon's org.apache.paimon.AppendOnlyFileStore as the Spark Table inside the input query plan, which is unsupported by the DataSourceV2RelationTableExtractor.

image

@davidyuan1223
Copy link
Contributor

Seems that there's a Paimon's org.apache.paimon.AppendOnlyFileStore as the Spark Table inside the input query plan, which is unsupported by the DataSourceV2RelationTableExtractor.

image

looks like the paimon can not supported in kyuubi's authz,? we can not make sure the paimon's plan is same as spark's plan

@bowenliang123
Copy link
Contributor Author

bowenliang123 commented Nov 2, 2023

It's normal to see lakehouse has individual implementation of spark plan in some commands. They're still part of spark plans. To support them, we could adapt our extractor for the details of the plan. But the precondition is that table identifier is carried by the plan. Should have look at filestore class of paimon for that.

@davidyuan1223
Copy link
Contributor

It's normal to see lakehouse has individual implementation of spark plan in some commands. They're still part of spark plans. To support them, we could adapt our extractor for the details of the plan. But the precondition is that table identifier is carried by the plan. Should have look at filestore class of paimon for that.

image
The paimon's FileStoreTable top interface's method name is only get the tableName, what can we do to improve the paimon could return database? The source code in 0.5 only will return like xxx.db/tableName to tableName

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants