[SPARK-22739][Catalyst] Additional Expression Support for Objects #21348
    cls: Class[_],
    arguments: Seq[Expression],
    dataType: DataType,
    initializations: Seq[(String, Seq[Expression])] = Nil,
@cloud-fan, see here an implementation of the modifications you described previously.
Test build #90699 has finished for PR 21348 at commit
Test build #90700 has finished for PR 21348 at commit
Test build #90701 has finished for PR 21348 at commit
    override def children: Seq[Expression] = value :: Nil

    override def eval(input: InternalRow): Any =
      throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
Is there any problem with implementing non-code-generated evaluation?
That's not a pattern I'd seen for eval at the time of writing this PR. Is there another expression that has an example? I could refactor and find out.
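For reference, an interpreted (non-code-generated) path for a single-child expression like this usually amounts to evaluating the child and applying the operation directly to its result. The following is a minimal sketch in plain Scala, not Spark code: `MiniExpr`, `BoundRef`, and `InstanceOfSketch` are hypothetical stand-ins for Catalyst's expression classes, and the input row is modeled as a `Seq[Any]` rather than an `InternalRow`.

```scala
// Hypothetical stand-in for a Catalyst-style expression; not the Spark API.
trait MiniExpr {
  def children: Seq[MiniExpr]
  // Interpreted evaluation against an input row (modeled here as Seq[Any]).
  def eval(input: Seq[Any]): Any
}

// Reads the value at a fixed ordinal of the input row.
final case class BoundRef(ordinal: Int) extends MiniExpr {
  override def children: Seq[MiniExpr] = Nil
  override def eval(input: Seq[Any]): Any = input(ordinal)
}

// Interpreted analogue of a Java `instanceof` check on the child's result.
final case class InstanceOfSketch(value: MiniExpr, cls: Class[_]) extends MiniExpr {
  override def children: Seq[MiniExpr] = value :: Nil
  override def eval(input: Seq[Any]): Any = {
    val v = value.eval(input)
    // Class.isInstance returns false for null, matching `instanceof`.
    cls.isInstance(v)
  }
}
```

In this sketch, `InstanceOfSketch(BoundRef(0), classOf[String]).eval(Seq("abc"))` evaluates the child first and then performs the check on the resulting object, which is the same order of operations the generated code would follow.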
The JIRA ticket SPARK-25789 guided me here. Thanks, @bdrillard, for your great work; we also need support for Datasets of Avro objects in Structured Streaming. I'm adapting your code in databricks/spark-avro#217 to a newer Spark version, and it seems to work well. Would you be willing to let me take on the follow-up work for Dataset-of-Avro support? I look forward to your reply :)
Hi @xuanyuanking, if you'd like to take on the work to fold my prior work in databricks/spark-avro#217 into Spark, that sounds good to me. Please do include me in pull-requests on this topic. Our work with an The parent ticket for this PR was previously marked as
@bdrillard Thanks for your reply. Of course I'll request your review and make sure I understand your implementation correctly. As Avro has been included in Spark, maybe it's OK to just add these newly added expressions in external?
The PR #24299 proposes another solution to add support for Avro objects. Your opinions matter; we are eager to have a working solution to create Avro datasets.
Hi @mazeboard, thanks for bringing your work to my attention. To clarify, this PR could be considered follow-up work for another PR addressing adding support for Dataset of arbitrary Avro Objects, see #22878 opened by @xuanyuanking. I'll take the time to read your request as well. It'd be good for us to compare/contrast the approaches in both #24299 and #22878.
Can one of the admins verify this patch?
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
This PR is a working followup to the expression work begun in #20085. It provides necessary Expression definitions to support custom encoders (see this discussion in the Spark-Avro project).
It adds the following expressions:
- ObjectCast - performs explicit casting of an Expression result to a DataType
- StaticField - retrieves a static field against a class that otherwise has no accessor method
- InstanceOf - an Expression for the Java instanceof operation
Modifies NewInstance to take a sequence of method-name and arguments initialization tuples, which are executed against the newly constructed object instance.
Removes InitializeJavaBean, as the generalized NewInstance subsumes its use-case.
How was this patch tested?
Adds unit test for NewInstance supporting post-constructor initializations. All previous "JavaBean" tests were refactored to use NewInstance.
Additional examples of working encoders that would use these new expressions can be seen in the Spark-Avro and Bunsen projects.
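To make the described semantics concrete, here is a rough sketch of what "construct, then run a sequence of method-name and argument tuples against the new instance" and "retrieve a static field" mean at the JVM level. It uses plain Java reflection rather than Catalyst expressions or code generation, so it is not the PR's implementation; `NewInstanceSketch`, `newInstance`, and `staticField` are hypothetical names, and the sketch assumes a public no-arg constructor and resolves methods by name and arity only.

```scala
import java.lang.reflect.Method

// Hypothetical helper illustrating the semantics only; not the PR's code.
object NewInstanceSketch {
  // Constructs `cls` with its no-arg constructor (an assumption of this
  // sketch), then applies each (methodName, args) pair in order.
  def newInstance(cls: Class[_], initializations: Seq[(String, Seq[AnyRef])]): AnyRef = {
    val instance: AnyRef = cls.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef]
    initializations.foreach { case (name, args) =>
      // Simplification: match on name and arity only, ignoring parameter types.
      val method: Method = cls.getMethods
        .find(m => m.getName == name && m.getParameterCount == args.length)
        .getOrElse(throw new NoSuchMethodException(s"${cls.getName}.$name/${args.length}"))
      method.invoke(instance, args: _*)
    }
    instance
  }

  // Reads a public static field by name; the receiver is null for static access.
  def staticField(cls: Class[_], fieldName: String): Any =
    cls.getField(fieldName).get(null)
}
```

For example, `NewInstanceSketch.newInstance(classOf[java.util.Properties], Seq(("setProperty", Seq[AnyRef]("k", "v"))))` builds a `Properties` object and then calls `setProperty("k", "v")` on it, which mirrors how a JavaBean-style setter sequence can be expressed as initializations on a generalized constructor call.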