[NSE-273] Spark shim layer infrastructure #361
Conversation
The common layer looks good to me.
I think we should target 3.1.2/3.2.0 in the shim layer.
@@ -45,11 +45,15 @@
<arrow.install.dir>${arrow.script.dir}/build/arrow_install</arrow.install.dir>
<arrow_root>/usr/local</arrow_root>
<build_protobuf>ON</build_protobuf>
<project.prefix>spark-sql-columnar</project.prefix>
<project.name.prefix>OAP Project Spark Columnar Plugin</project.name.prefix>
<spark311.version>3.1.1</spark311.version>
We already support 3.1.1; should this use 3.1.2 or 3.2.0-SNAPSHOT?
Good question. I put the initial shim layer for Spark 3.1.1 here because a shim for Spark 3.1.2 is only needed once Spark 3.1.2 differs from Spark 3.1.1 in at least one API aspect. At that point you will need shims for both Spark 3.1.1 and Spark 3.1.2, so the Spark 3.1.1 shim will always be needed whenever a shim for a newer version is created.
Of course, the Spark 3.1.1 layer will also act as a template when you need to add a new one.
shims/aggregator/pom.xml
Outdated
<!-- dependencies are always listed in sorted order by groupId, artifactId -->
<dependency>
  <groupId>com.intel.oap.</groupId>
  <artifactId>${project.prefix}-shims-spark311</artifactId>
it looks like this is not implemented?
There is a default (empty) implementation now, so there is no shim API to call other than getShimDescriptor. The developer team will need to identify and add the needed APIs to the common/SparkShims interface and implement them in the spark311 shims.
There is also a typo there: "com.intel.oap." should be "com.intel.oap". I will correct it.
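To make the current state concrete, here is a minimal sketch of what the common layer described above could look like. All names (ShimDescriptor, Spark311Shims, the exact signature of getShimDescriptor) are assumptions for illustration, not taken verbatim from the patch:

```scala
// Hypothetical sketch of the common shim layer. In the PR, the real
// interface lives in common/SparkShims, and getShimDescriptor is the
// only shim API exposed so far.
case class ShimDescriptor(sparkVersion: String)

trait SparkShims {
  // The only method guaranteed by the common layer at this stage;
  // further APIs will be identified and added by the developer team.
  def getShimDescriptor: ShimDescriptor
}

// Default (empty) implementation for Spark 3.1.1, which also serves
// as the template for shims targeting newer Spark versions.
class Spark311Shims extends SparkShims {
  override def getShimDescriptor: ShimDescriptor = ShimDescriptor("3.1.1")
}
```

Adding support for, say, Spark 3.1.2 would then mean creating a sibling class implementing the same trait, overriding only the APIs where the two versions diverge.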
@jerrychenhf
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
What changes were proposed in this pull request?
We implement the Spark shim layer infrastructure: it defines the common Spark shim interface, implements shims for specific Spark versions, and provides the mechanism to load the proper shim layer based on the running Spark version.
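A minimal, self-contained sketch of such a version-based loading mechanism follows. The names (ShimLoader, getSparkShims, the provider map) are assumptions for illustration; the actual PR may resolve implementations differently, e.g. via java.util.ServiceLoader:

```scala
// Hypothetical sketch: resolve a shim implementation from the Spark
// version string reported by the running cluster.
case class ShimDescriptor(sparkVersion: String)

trait SparkShims { def getShimDescriptor: ShimDescriptor }

class Spark311Shims extends SparkShims {
  override def getShimDescriptor: ShimDescriptor = ShimDescriptor("3.1.1")
}

object ShimLoader {
  // One provider per supported Spark version; a new shim module
  // registers itself here (or via a service descriptor).
  private val providers: Map[String, () => SparkShims] =
    Map("3.1.1" -> (() => new Spark311Shims))

  def getSparkShims(sparkVersion: String): SparkShims =
    providers.get(sparkVersion) match {
      case Some(make) => make()
      case None =>
        throw new UnsupportedOperationException(
          s"No shim implementation found for Spark $sparkVersion")
    }
}
```

A unit test in the spirit of the one described below would then simply assert that the descriptor of the loaded shim matches the current Spark version.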
How was this patch tested?
Unit test code that verifies the right shim implementation is loaded based on the current Spark version.