Skip to content

Conversation

@maryannxue
Copy link
Contributor

What changes were proposed in this pull request?

Add a JDBC Option "pushDownPredicate" (default true) to allow/disallow predicate push-down in JDBC data source.

How was this patch tested?

Add a test in JDBCSuite

@maryannxue
Copy link
Contributor Author

@gatorsmile @TomaszGaweda

@TomaszGaweda
Copy link

TomaszGaweda commented Jul 25, 2018

Thanks! :) LGTM except two minor comments

val sessionInitStatement = parameters.get(JDBC_SESSION_INIT_STATEMENT)

// An option to allow/disallow pushing down predicate into JDBC data source
val pushDownPredicate = parameters.getOrElse(JDBC_PUSHDOWN_PREDICATE, "true").toBoolean

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super Nit: Shouldn't it be in plural, pushDownPredicates? There may be many predicates

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or one could argue that "predicate" is a notion of all filters as a whole. It's a nice reminder though. I had not thought about it, but anyway I just checked: we use PushDownPredicate and the singular form in similar rules. So maybe we keep it singular here too?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, consistency is a very good argument :) Indeed it will be plural or not, depending from which side we are looking at it

// Check if JDBCRDD.compileFilter can accept input filters
override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
if (jdbcOptions.pushDownPredicate) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure, that this is the only place? JDBCRDD.scanTable defines filters as all filters that may be pushed down. Probably we should use filters -- unhandledFilters in JdbcRelation.buildScan

Copy link
Contributor Author

@maryannxue maryannxue Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the only source of truth for defining handled/unhandled. The caller (physical rules) calls this method and push "handled" to JdbcRelation.buildScan.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now, my mistake. Thanks for clarification :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I share your opinion actually. It is confusing here... maybe we should change the parameter names at some point.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it's confusing. buildScan argument may be named pushedFilters, variables also, then code will be self-describing

@rxin
Copy link
Contributor

rxin commented Jul 25, 2018

Can you add JDBC to the title?

@maryannxue maryannxue changed the title [SPARK-24288][SQL] Enable preventing predicate pushdown [SPARK-24288][SQL] Add a JDBC Option to enable preventing predicate pushdown Jul 25, 2018
@SparkQA
Copy link

SparkQA commented Jul 25, 2018

Test build #93562 has finished for PR 21875 at commit a83b64b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor

@maryannxue We need to update the programming guide ?

@HyukjinKwon
Copy link
Member

@maryannxue
Copy link
Contributor Author

Programming guide updated. Thank you, @dilipbiswal and @HyukjinKwon!

@dilipbiswal
Copy link
Contributor

dilipbiswal commented Jul 26, 2018

@maryannxue Thanks. It looks good to me. As a minor comment, could we state the default value for this parameter as well ? For some of the other parameters, we specify the default value.

@SparkQA
Copy link

SparkQA commented Jul 26, 2018

Test build #93579 has finished for PR 21875 at commit 027b6c4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 26, 2018

Test build #93584 has finished for PR 21875 at commit 6884ed6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 27, 2018

Test build #93647 has finished for PR 21875 at commit 6884ed6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gatorsmile
Copy link
Member

Thanks! Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants