[SPARK-27064][SS] create StreamingWrite at the beginning of streaming execution #23981
@@ -585,7 +585,7 @@ abstract class StreamExecution(
       options: Map[String, String],
       inputPlan: LogicalPlan): StreamingWrite = {
     val writeBuilder = table.newWriteBuilder(new DataSourceOptions(options.asJava))
-      .withQueryId(runId.toString)
+      .withQueryId(id.toString)
This is an unrelated change, but the old code was clearly a mistake: the sink doesn't care about the runId, which changes after a query restart. The sink needs the id, which stays the same across the whole life cycle of the streaming query.
No built-in streaming sinks use this id, so this change is for future-proofing.
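
To make the distinction concrete, here is a minimal sketch of how the two identifiers behave across a restart. `QueryRun` and `QueryIds` are hypothetical stand-ins written for illustration, not Spark classes; the only behaviour taken from the comment above is that the id survives a restart while the runId does not.

```scala
import java.util.UUID

// Hypothetical stand-in for a streaming query's identifiers; not Spark's StreamExecution.
final case class QueryRun(id: UUID, runId: UUID)

object QueryIds {
  // First start of a query: both identifiers are freshly generated.
  def start(): QueryRun = QueryRun(UUID.randomUUID(), UUID.randomUUID())

  // Restart of the same query: the id is kept, only the runId is regenerated.
  def restart(previous: QueryRun): QueryRun =
    previous.copy(runId = UUID.randomUUID())
}
```

A sink that keys its own state on `id` would find that state again after `QueryIds.restart`, whereas one keyed on `runId` would not.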
Test build #103059 has finished for PR 23981 at commit
Test build #103078 has finished for PR 23981 at commit
retest this please
Test build #103092 has finished for PR 23981 at commit
LGTM.
retest this please
Test build #103412 has finished for PR 23981 at commit
retest this please
Test build #103426 has finished for PR 23981 at commit
thanks, merging to master!
… execution

## What changes were proposed in this pull request?

According to the [design](https://docs.google.com/document/d/1vI26UEuDpVuOjWw4WPoH2T6y8WAekwtI7qoowhOFnI4/edit?usp=sharing), the life cycle of `StreamingWrite` should be the same as the read side `MicroBatch/ContinuousStream`, i.e. each run of the stream query, instead of each epoch. This PR fixes it.

## How was this patch tested?

existing tests

Closes apache#23981 from cloud-fan/dsv2.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
According to the design, the life cycle of StreamingWrite should be the same as the read side MicroBatch/ContinuousStream, i.e. each run of the stream query, instead of each epoch. This PR fixes it.
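
The per-run life cycle can be sketched roughly as follows. `SketchWrite` and `RunSketch` are hypothetical names that only model the idea; they are not the DataSource V2 interfaces touched by this PR, and the creation counter exists purely to make the life cycle observable.

```scala
// Hypothetical stand-in for a streaming write; not Spark's StreamingWrite interface.
final class SketchWrite(val queryId: String) {
  private var committed = Vector.empty[Long]

  // Record that the given epoch has been committed by this write object.
  def commit(epochId: Long): Unit = committed :+= epochId

  def committedEpochs: Vector[Long] = committed
}

object SketchWrite {
  // Counts how many write objects have been built so far.
  var created: Int = 0

  def build(queryId: String): SketchWrite = {
    created += 1
    new SketchWrite(queryId)
  }
}

object RunSketch {
  // One run of the query: build the write once, then reuse it for every epoch.
  def runOnce(queryId: String, epochs: Seq[Long]): SketchWrite = {
    val write = SketchWrite.build(queryId)
    epochs.foreach(write.commit)
    write
  }
}
```

Calling `SketchWrite.build` inside the `foreach` instead would model the per-epoch life cycle that the description says this PR moves away from.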
How was this patch tested?
existing tests