-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-42482][CONNECT] Scala Client Write API V1 #40061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
connector/connect/client/jvm/src/main/java/org/apache/spark/sql/SaveMode.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is lazy. you need to call collect()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this conversion? That seems like an artifact from the original implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pfff... this is clever. builder.hasPath != builder.hasTable might be easier to parse.
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update annotation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah that is a server side problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need more test coverage.
This reverts commit 7e112b2.
hvanhovell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
### What changes were proposed in this pull request? Implemented the basic Dataset#write API to allow users to write the df into tables, csv etc. files. ### Why are the changes needed? Basic write operation. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Integration tests. Closes #40061 from zhenlineo/write. Authored-by: Zhen Li <zhenlineo@users.noreply.github.com> Signed-off-by: Herman van Hovell <herman@databricks.com> (cherry picked from commit ede1a54) Signed-off-by: Herman van Hovell <herman@databricks.com>
### What changes were proposed in this pull request? Adding DataFrameWriterV2. This allows users to use the Dataset#writeTo API. ### Why are the changes needed? Impls Dataset#writeTo ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? E2E This is based on #40061 Closes #40075 from zhenlineo/write-v2. Authored-by: Zhen Li <zhenlineo@users.noreply.github.com> Signed-off-by: Herman van Hovell <herman@databricks.com>
### What changes were proposed in this pull request? Adding DataFrameWriterV2. This allows users to use the Dataset#writeTo API. ### Why are the changes needed? Impls Dataset#writeTo ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? E2E This is based on #40061 Closes #40075 from zhenlineo/write-v2. Authored-by: Zhen Li <zhenlineo@users.noreply.github.com> Signed-off-by: Herman van Hovell <herman@databricks.com> (cherry picked from commit 0c4645e) Signed-off-by: Herman van Hovell <herman@databricks.com>
### What changes were proposed in this pull request? Implemented the basic Dataset#write API to allow users to write the df into tables, csv etc. files. ### Why are the changes needed? Basic write operation. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Integration tests. Closes apache#40061 from zhenlineo/write. Authored-by: Zhen Li <zhenlineo@users.noreply.github.com> Signed-off-by: Herman van Hovell <herman@databricks.com> (cherry picked from commit ede1a54) Signed-off-by: Herman van Hovell <herman@databricks.com>
### What changes were proposed in this pull request? Adding DataFrameWriterV2. This allows users to use the Dataset#writeTo API. ### Why are the changes needed? Impls Dataset#writeTo ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? E2E This is based on apache#40061 Closes apache#40075 from zhenlineo/write-v2. Authored-by: Zhen Li <zhenlineo@users.noreply.github.com> Signed-off-by: Herman van Hovell <herman@databricks.com> (cherry picked from commit 0c4645e) Signed-off-by: Herman van Hovell <herman@databricks.com>
What changes were proposed in this pull request?
Implemented the basic Dataset#write API to allow users to write the df into tables, csv etc. files.
Why are the changes needed?
Basic write operation.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Integration tests.