[SPARK-36400][SPARK-36398][SQL][WEBUI] Make ThriftServer recognize spark.sql.redaction.string.regex #33743
Test build #142467 has finished for PR 33743 at commit

Kubernetes integration test unable to build dist. exiting with code: 1

Hi @sarutak,

Yeah looks like it.

cc @juliuszsompolski and @wangyum FYI

@dnskr Thank you for letting me know. Yes, SPARK-36398 seems to be resolved by this change too.

Looks ok to me. Could we add a test for this change?

@wangyum Yeah it's better to have, but

Merging to
…ark.sql.redaction.string.regex

### What changes were proposed in this pull request?
This PR fixes an issue that ThriftServer doesn't recognize `spark.sql.redaction.string.regex`. The problem is that sensitive information included in queries can be exposed.

### Why are the changes needed?
Bug fix.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Ran ThriftServer, connect to it and execute `CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");` with `spark.sql.redaction.string.regex=((?i)(?<=password=))(".*")|('.*')` Then, confirmed UI.

Closes #33743 from sarutak/thrift-redact.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
(cherry picked from commit b914ff7)
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
logInfo(s"Submitting query '$statement' with $statementId")
val redactedStatement = SparkUtils.redact(sqlContext.conf.stringRedactionPattern, statement)
@sarutak it would be great to also log the redactedStatement in the logInfo one line above, otherwise the credentials needlessly leak to log4j logs.
Oh, let me open a followup PR. Thanks.
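The review suggestion above can be sketched as follows: compute the redacted statement first, then build the log message from it so the raw statement never reaches the logs. This is only an illustration, not the code from the follow-up PR. The `RedactBeforeLogging` object, its `redact` and `submitMessage` helpers, the placeholder text, and the sample pattern are all hypothetical stand-ins; the real code uses `SparkUtils.redact` and `logInfo`.

```scala
import scala.util.matching.Regex

object RedactBeforeLogging {
  // Assumed placeholder text for illustration; Spark defines its own replacement string.
  val Replacement = "*********(redacted)"

  // Hypothetical stand-in for a redaction helper: with no pattern, the text is unchanged.
  def redact(pattern: Option[Regex], text: String): String =
    pattern.fold(text)(_.replaceAllIn(text, Regex.quoteReplacement(Replacement)))

  // Build the log line from the redacted statement, never from the raw one.
  def submitMessage(pattern: Option[Regex], statement: String, statementId: String): String = {
    val redactedStatement = redact(pattern, statement)
    s"Submitting query '$redactedStatement' with $statementId"
  }

  def main(args: Array[String]): Unit = {
    // Sample pattern (an assumption): redact a double-quoted value following "password=".
    val pattern = Some("(?i)(?<=password=)\"[^\"]*\"".r)
    println(submitMessage(pattern, "OPTIONS(password=\"abcde\")", "id-1"))
  }
}
```

Computing `redactedStatement` once and reusing it for both the log line and the UI event keeps the two outputs consistent.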
…ation in UI by config

### What changes were proposed in this pull request?
This PR adds a test for SPARK-36400 (#33743).

### Why are the changes needed?
SPARK-36512 (#33741) was fixed so we can add this test now.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
New test.

Closes #33885 from sarutak/add-reduction-test.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
What changes were proposed in this pull request?
This PR fixes an issue where ThriftServer doesn't recognize `spark.sql.redaction.string.regex`. As a result, sensitive information included in queries can be exposed.
Why are the changes needed?
Bug fix.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Ran ThriftServer, connected to it, and executed
`CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");`
with `spark.sql.redaction.string.regex=((?i)(?<=password=))(".*")|('.*')`. Then confirmed the result in the UI.
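To see what the regex from the manual test matches without starting ThriftServer, a minimal standalone sketch can apply it to the same statement. The `RedactionRegexCheck` object and the placeholder text are assumptions for illustration only; this uses plain `scala.util.matching.Regex`, not Spark's own redaction utilities.

```scala
import scala.util.matching.Regex

object RedactionRegexCheck {
  // Assumed placeholder text for illustration; Spark defines its own replacement string.
  val Replacement = "*********(redacted)"

  // Pattern and statement copied verbatim from the manual test description.
  val pattern: Regex = """((?i)(?<=password=))(".*")|('.*')""".r
  val statement: String =
    """CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");"""

  // Replace every match of the pattern with the placeholder.
  def redact(text: String): String =
    pattern.replaceAllIn(text, Regex.quoteReplacement(Replacement))

  def main(args: Array[String]): Unit =
    println(redact(statement))
}
```

For this statement only the quoted value after `password=` matches, so the other options stay readable. Note that `.*` is greedy: if more double quotes followed the password on the same line, the match would extend to the last one.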