-
Notifications
You must be signed in to change notification settings - Fork 450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE] Add InputIteratorTransformer to decouple ReadRel and iterator index #3854
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI |
2 similar comments
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
a31f6d8
to
5348287
Compare
Run Gluten Clickhouse CI |
2 similar comments
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/MetricsApiImpl.scala
Outdated
Show resolved
Hide resolved
Run Gluten Clickhouse CI |
7565873
to
28f4e87
Compare
Run Gluten Clickhouse CI |
1 similar comment
Run Gluten Clickhouse CI |
9899965
to
3b665d3
Compare
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
/Benchmark Velox |
Run Gluten Clickhouse CI |
1 similar comment
Run Gluten Clickhouse CI |
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
cc @Yohahaha @zhztheplayer @zhouyuan @zzcclp thank you |
cc @rui-mo |
sparkContext: SparkContext): Map[String, SQLMetric] = { | ||
Map( | ||
"cpuCount" -> SQLMetrics.createMetric(sparkContext, "cpu wall time count"), | ||
"wallNanos" -> SQLMetrics.createNanoTimingMetric(sparkContext, "totaltime of batch scan"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it accurate to use "totaltime of batch scan"? I thought the transformer is common for both scan and other inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I think it is copied from a wrong code path..
* `ReadRel` for the child columnar iterator, so that the [[TransformSupport]] always has input. It | ||
* would be transformed to `ValueStreamNode` at native side. | ||
*/ | ||
case class InputIteratorTransformer(child: SparkPlan) extends UnaryTransformSupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we combine InputIteratorTransformer
and ColumnarInputAdapter
later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we do not need ColumnarInputAdapter
any more
Run Gluten Clickhouse CI |
LGTM! thank you for this great patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
Good patch. Sorry for the late review, if there is any issue for ch backend, I will fix later. |
What changes were proposed in this pull request?
This pr adds a new transformer
InputIteratorTransformer
to connect SparkPlan and TransformSupport. It replace the originalColumnarInputAdapter
to provide the substrait plan ReadRel for the child columnar iterator, so that the TransformSupport always has input. It would be transformed to ValueStreamNode at native Velox side.The reason is that:
How was this patch tested?
Pass CI and manually test