@zhenlineo zhenlineo commented Feb 16, 2023

What changes were proposed in this pull request?

Implemented the basic Dataset#write API, which lets users write a DataFrame to tables or to files such as CSV.

Why are the changes needed?

Basic write operation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Integration tests.

this is lazy. you need to call collect()

Update annotation?

Do we need this conversion? That seems like an artifact from the original implementation.

pfff... this is clever. builder.hasPath != builder.hasTable might be easier to parse.
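
A minimal sketch of the suggestion above, assuming a builder that exposes `hasPath` and `hasTable` flags. The `WriteTarget` class and the `exactlyOneTarget` helper are hypothetical stand-ins for illustration, not the code under review. Comparing two Booleans with `!=` behaves as exclusive-or, so the expression is true only when exactly one of the two is set:

```scala
// Hypothetical stand-in for the builder discussed in the review;
// this is NOT the actual Spark Connect builder.
final case class WriteTarget(
    path: Option[String] = None,
    table: Option[String] = None) {
  def hasPath: Boolean = path.isDefined
  def hasTable: Boolean = table.isDefined
}

object WriteTargetCheck {
  // Inequality of two Booleans is exclusive-or:
  // true when exactly one of path/table is set, false when both or neither are.
  def exactlyOneTarget(t: WriteTarget): Boolean = t.hasPath != t.hasTable
}
```

Written this way, the check reads as "the two flags differ", which may be easier to scan than an explicit combination of `&&` and `||`.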

Update annotation

Nah that is a server side problem.

We need more test coverage.

@zhenlineo zhenlineo changed the title [WIP][CONNECT] Scala Client Write API [WIP][CONNECT] Scala Client Write API V1 Feb 17, 2023
@zhenlineo zhenlineo changed the title [WIP][CONNECT] Scala Client Write API V1 [SPARK-42482][CONNECT] Scala Client Write API V1 Feb 17, 2023
@zhenlineo zhenlineo marked this pull request as ready for review February 17, 2023 21:24
@hvanhovell hvanhovell left a comment

LGTM

hvanhovell pushed a commit that referenced this pull request Feb 19, 2023
### What changes were proposed in this pull request?
Implemented the basic Dataset#write API, which lets users write a DataFrame to tables or to files such as CSV.

### Why are the changes needed?
Basic write operation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Integration tests.

Closes #40061 from zhenlineo/write.

Authored-by: Zhen Li <zhenlineo@users.noreply.github.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit ede1a54)
Signed-off-by: Herman van Hovell <herman@databricks.com>
hvanhovell pushed a commit that referenced this pull request Feb 22, 2023
### What changes were proposed in this pull request?
Adding DataFrameWriterV2. This allows users to use the Dataset#writeTo API.

### Why are the changes needed?
Implements Dataset#writeTo.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
End-to-end (E2E) tests.

This is based on #40061

Closes #40075 from zhenlineo/write-v2.

Authored-by: Zhen Li <zhenlineo@users.noreply.github.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
hvanhovell pushed a commit that referenced this pull request Feb 22, 2023
(cherry picked from commit 0c4645e)
Signed-off-by: Herman van Hovell <herman@databricks.com>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
(cherry picked from commit ede1a54)
Signed-off-by: Herman van Hovell <herman@databricks.com>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
(cherry picked from commit 0c4645e)
Signed-off-by: Herman van Hovell <herman@databricks.com>