Skip to content

Conversation

@manuzhang
Copy link
Collaborator

No description provided.

@MrPowers
Copy link
Collaborator

MrPowers commented May 9, 2019

@manuzhang - When I saw the title of this PR, I thought it was going to add an array_sum function, like this: #80

I wasn't able to get my array_sum function working because I don't know how to import the aggregate function... Where is the aggregate function defined?

I am going to see if we can leverage any of the Spark 2.4 array functions to satisfy your use case in this PR.

@MrPowers
Copy link
Collaborator

MrPowers commented May 9, 2019

@manuzhang - Can you see if the array_union function will help here?

@manuzhang
Copy link
Collaborator Author

manuzhang commented May 10, 2019

It's more like collect_list plus flatten(also introduced in Spark 2.4) while array_union will remove duplicates. That said, I try not to use the new Spark 2.4 functions since we and, I guess, many others still use 2.3.x or lower versions.

@manuzhang
Copy link
Collaborator Author

By the way, the compatibility matrix in README needs update as 0.28.0 ~ 0.31.0 have been released

@MrPowers
Copy link
Collaborator

@manuzhang - Yea, I think it's a good idea to not rely on Spark 2.4 features, so I'm good with this PR.

Can we rename this from arraySum to arrayUnion or arrayUnionRetro? arraySum makes me think that the method is adding all the elements in the array. I'd like to stay consistent with the Spark 2.4 terminology. Let me know what you think!

@manuzhang
Copy link
Collaborator Author

manuzhang commented May 16, 2019

@MrPowers Sure, it's more like arrayConcat and Spark already has array_union in SQL which removes duplicates. Also, I'd like to see what @nvander1 has got first and how we can support more aggregate functions in the long run.

@MrPowers
Copy link
Collaborator

@manuzhang - I like calling it arrayConcat. I'm also in favor of just making this change and merging it in. I'd be good to give the community a good example of how to implement a user defined aggregate function. Actually, I changed my mind, I'm going to merge this in and then change the function name. I've already given you enough run-around on this one.

Thanks for continuing to help move this project in a good direction. I really appreciate the help.

@MrPowers MrPowers merged commit 97fb5c4 into mrpowers-io:master May 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants