Contributor

@tdas tdas commented Dec 15, 2016

What changes were proposed in this pull request?

  • Extended the Window operation section with a code snippet and an explanation of watermarking
  • Extended the Output Mode section with a table showing the compatibility between query types and output modes
  • Rewrote the Monitoring section with the updated JSON generated by StreamingQuery.progress/status
  • Updated the StreamingQueryListener example to reflect API changes
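
For readers new to the feature, the watermarking idea behind the extended Window operation section can be sketched outside Spark. The plain-Python model below is only an illustration under stated assumptions: it is not Spark's implementation or API, the names `WatermarkedCounter`, `window`, and `delay` are hypothetical, and it assumes a simplified rule in which the watermark is the largest event time seen so far minus the allowed delay, updated once at the end of each batch.

```python
from collections import defaultdict


class WatermarkedCounter:
    """Toy model of windowed counting with a watermark (not Spark code)."""

    def __init__(self, window, delay):
        self.window = window            # window length, in seconds
        self.delay = delay              # how late a record may arrive, in seconds
        self.max_event_time = None      # largest event time seen so far
        self.watermark = None           # no watermark until the first batch ends
        self.counts = defaultdict(int)  # (window_start, key) -> count

    def process_batch(self, records):
        """Process one batch of (event_time, key) pairs; return dropped records."""
        dropped = []
        for event_time, key in records:
            # Assumption: a record is too late if it is older than the
            # watermark computed at the end of the previous batch.
            if self.watermark is not None and event_time < self.watermark:
                dropped.append((event_time, key))
                continue
            window_start = (event_time // self.window) * self.window
            self.counts[(window_start, key)] += 1
            if self.max_event_time is None or event_time > self.max_event_time:
                self.max_event_time = event_time
        # Advance the watermark only after the whole batch has been processed.
        if self.max_event_time is not None:
            self.watermark = self.max_event_time - self.delay
        return dropped


# 10-minute windows with a 10-minute allowed delay (event times in seconds).
counter = WatermarkedCounter(window=600, delay=600)
late_1 = counter.process_batch([(720, "cat"), (1260, "dog")])
late_2 = counter.process_batch([(700, "cat"), (300, "owl")])
```

In this run the watermark after the first batch is 1260 - 600 = 660, so the second batch still counts the late `cat` record at 700 but discards the `owl` record at 300.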

TODO

  • Figure showing the watermarking

How was this patch tested?

N/A

Screenshots

Section: Windowed Aggregation with Event Time

[three screenshots]

Section: Output Modes

[one screenshot]

Section: Monitoring

[two screenshots]

SparkQA commented Dec 15, 2016

Test build #70182 has started for PR 16294 at commit ed8d9e0.

@tdas tdas changed the title from "[WIP][SPARK-18669][SS][DOCS] Update Apache docs for Structured Streaming regarding watermarking and status" to "[SPARK-18669][SS][DOCS] Update Apache docs for Structured Streaming regarding watermarking and status" on Dec 15, 2016
SparkQA commented Dec 16, 2016

Test build #70218 has finished for PR 16294 at commit 1566433.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

<div data-lang="scala" markdown="1">

{% highlight scala %}
import spark.implicits._
Contributor

Just curious: will there be a complete example in the examples folder? In documents such as the ML and SQL guides, the code is cited from the example files instead of being hard-coded in the document. Thanks!

Contributor Author

I wasn't planning to add examples in this PR, to keep it just about docs. I am happy to see simple examples contributed as PRs from the community.

@dongjoon-hyun
Member

Hi, @tdas.

Is it possible for you to update this statement for Apache Spark 2.1?

Spark 2.0 is the ALPHA RELEASE of Structured Streaming and the APIs are still experimental. In this guide, we are going to walk you through~

https://github.com/apache/spark/blame/master/docs/structured-streaming-programming-guide.md#L13

Contributor Author

tdas commented Dec 21, 2016

Will do. I am rewriting parts of this PR right now.

SparkQA commented Dec 22, 2016

Test build #70500 has finished for PR 16294 at commit 8f8c11d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 22, 2016

Test build #70525 has finished for PR 16294 at commit b96351c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 22, 2016

Test build #70524 has finished for PR 16294 at commit a4a93aa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

tdas commented Dec 22, 2016

^^ The last failure, on build 70524, is for an older commit. I reverted the culprit change, and build 70525 passed.

Member

@zsxwing zsxwing left a comment

Looks good overall. Just some nits.

}
@Overrides void onQueryProgress(QueryProgressEvent queryProgress) {
System.out.println("Query made progress: " + queryProgress.queryStatus);
System.out.println("Query made progress: " + queryProgress.progress);
Member

nit: queryProgress.progress()

}
@Overrides void onQueryTerminated(QueryTerminatedEvent queryTerminated) {
System.out.println("Query terminated: " + queryTerminated.queryStatus.name);
System.out.println("Query terminated: " + queryTerminated.id);
Member

nit: queryTerminated.id()


@Overrides void onQueryStarted(QueryStartedEvent queryStarted) {
System.out.println("Query started: " + queryTerminated.queryStatus.name);
System.out.println("Query started: " + queryTerminated.id);
Member

nit: queryStarted.id()


override def onQueryStarted(queryStarted: QueryStartedEvent): Unit = {
println("Query started: " + queryTerminated.queryStatus.name)
println("Query started: " + queryTerminated.id)
Member

nit: queryTerminated -> queryStarted

{% highlight python %}
query = ... // a StreamingQuery
query = ... # a StreamingQuery
print(query.progress)
Member

nit: lastProgress

latency.getBatch.source: 20
Sink status - MySink
Committed offsets: [1, -]
System.out.println(query.status);
Member

nit: query.status()

StreamingQuery query = ...

System.out.println(query.status);
System.out.println(query.progress);
Member

nit: query.lastProgress()

{% highlight scala %}
val query: StreamingQuery = ...

println(query.progress)
Member

nit: query.lastProgress

preserves all data in the Result Table.
</td>
</tr>
<tr>
Member

empty row?

Contributor Author

same reason as the other place.

</tr>
</tr>
<tr>
<td></td>
Member

empty row?

Contributor Author

Empty row to make sure that there is a line after the table when rendered; otherwise it looks odd.

}
@Overrides void onQueryProgress(QueryProgressEvent queryProgress) {
System.out.println("Query made progress: " + queryProgress.progress);
System.out.println("Query made progress: " + queryProgress.lastProgress());
Member

nit: the method is QueryProgressEvent.progress

Contributor Author

yep. replace fail.

SparkQA commented Dec 23, 2016

Test build #70529 has finished for PR 16294 at commit 0cc1dd6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

zsxwing commented Dec 23, 2016

LGTM pending tests

SparkQA commented Dec 23, 2016

Test build #70530 has finished for PR 16294 at commit 576b432.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Dec 28, 2016
…egarding watermarking and status

## What changes were proposed in this pull request?

- Extended the Window operation section with code snippet and explanation of watermarking
- Extended the Output Mode section with a table showing the compatibility between query type and output mode
- Rewrote the Monitoring section with updated jsons generated by StreamingQuery.progress/status
- Updated API changes in the StreamingQueryListener example

TODO
- [x] Figure showing the watermarking

## How was this patch tested?

N/A

## Screenshots
### Section: Windowed Aggregation with Event Time

<img width="927" alt="screen shot 2016-12-15 at 3 33 10 pm" src="https://cloud.githubusercontent.com/assets/663212/21246197/0e02cb1a-c2dc-11e6-8816-0cd28d8201d7.png">

![image](https://cloud.githubusercontent.com/assets/663212/21246241/45b0f87a-c2dc-11e6-9c29-d0a89e07bf8d.png)

<img width="929" alt="screen shot 2016-12-15 at 3 33 46 pm" src="https://cloud.githubusercontent.com/assets/663212/21246202/1652cefa-c2dc-11e6-8c64-3c05977fb3fc.png">

----------------------------
### Section: Output Modes
![image](https://cloud.githubusercontent.com/assets/663212/21246276/8ee44948-c2dc-11e6-9fa2-30502fcf9a55.png)

----------------------------
### Section: Monitoring
![image](https://cloud.githubusercontent.com/assets/663212/21246535/3c5baeb2-c2de-11e6-88cd-ca71db7c5cf9.png)
![image](https://cloud.githubusercontent.com/assets/663212/21246574/789492c2-c2de-11e6-8471-7bef884e1837.png)

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #16294 from tdas/SPARK-18669.

(cherry picked from commit 092c672)
Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
@asfgit asfgit closed this in 092c672 Dec 28, 2016
asfgit pushed a commit to apache/spark-website that referenced this pull request Dec 28, 2016
This version is built from the docs source code generated by applying apache/spark#16294 to v2.1.0 (so, other changes in branch 2.1 will not affect the doc).
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Dec 29, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017