
Conversation


@niofire niofire commented Sep 29, 2018

What changes were proposed in this pull request?

  • Fixed a typo in the outputMode documentation: in OutputMode.Complete(), changed `these is some updates` to `there are some updates`.
  • Replaced the hyphenated lists with HTML unordered list tags in the comments to fix the Javadoc documentation.

Current render from most recent Spark API Docs:

outputMode(OutputMode) - list rendered as prose:

[screenshot of the rendered Javadoc]

outputMode(String) - list rendered as prose:

[screenshot of the rendered Javadoc]

partitionBy(String*) - list rendered as prose:

[screenshot of the rendered Javadoc]

How was this patch tested?

This is a documentation-only patch, so no functional testing is required.
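For illustration, a minimal sketch of the before/after comment formatting (the class and method names are hypothetical, and the wording is abridged from the Spark comments). Javadoc does not recognize markdown-style hyphen bullets, so the "before" form renders as one run of prose, while the HTML form renders as a bulleted list in both javadoc and scaladoc:

```java
/** A sketch only; not the actual Spark source. */
public class OutputModeDocSketch {
    /**
     * Before: hyphen bullets, which javadoc flattens into prose.
     *  - `OutputMode.Append()`: only the new rows in the streaming
     *    DataFrame/Dataset will be written to the sink.
     *  - `OutputMode.Complete()`: all the rows in the streaming
     *    DataFrame/Dataset will be written every time there are some updates.
     */
    void before() {}

    /**
     * After: HTML list tags, rendered as a bulleted list by both tools.
     * <ul>
     * <li>`OutputMode.Append()`: only the new rows in the streaming
     * DataFrame/Dataset will be written to the sink.</li>
     * <li>`OutputMode.Complete()`: all the rows in the streaming
     * DataFrame/Dataset will be written every time there are some updates.</li>
     * </ul>
     */
    void after() {}
}
```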

@niofire niofire changed the title Fix typo & format in DataStreamWriter.scala [SQL][DOC] Fix typo & format in DataStreamWriter.scala Sep 29, 2018
@niofire niofire changed the title [SQL][DOC] Fix typo & format in DataStreamWriter.scala [Streaming][DOC] Fix typo & format in DataStreamWriter.scala Sep 29, 2018
@HyukjinKwon
Member

ok to test

@SparkQA

SparkQA commented Sep 30, 2018

Test build #96805 has finished for PR 22593 at commit c1a89f0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* - `OutputMode.Update()`: only the rows that were updated in the streaming DataFrame/Dataset
* <ul>
* <li> `OutputMode.Append()`: only the new rows in the streaming DataFrame/Dataset will be
* written to the sink.</li>
Member


I would just format this similarly with

* <ul>
* <li>`primitivesAsString` (default `false`): infers all primitive values as a string type</li>
* <li>`prefersDecimal` (default `false`): infers all floating-point values as a decimal
* type. If the values do not fit in decimal, then it infers them as doubles.</li>
* <li>`allowComments` (default `false`): ignores Java/C++ style comment in JSON records</li>
* <li>`allowUnquotedFieldNames` (default `false`): allows unquoted JSON field names</li>
* <li>`allowSingleQuotes` (default `true`): allows single quotes in addition to double quotes
* </li>
* <li>`allowNumericLeadingZeros` (default `false`): allows leading zeros in numbers
* (e.g. 00012)</li>
* <li>`allowBackslashEscapingAnyCharacter` (default `false`): allows accepting quoting of all
* character using backslash quoting mechanism</li>
* <li>`allowUnquotedControlChars` (default `false`): allows JSON Strings to contain unquoted
* control characters (ASCII characters with value less than 32, including tab and line feed
* characters) or not.</li>
* <li>`mode` (default `PERMISSIVE`): allows a mode for dealing with corrupt records
* during parsing.
* <ul>
* <li>`PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a
* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To
* keep corrupt records, an user can set a string type field named
* `columnNameOfCorruptRecord` in an user-defined schema. If a schema does not have the
* field, it drops corrupt records during parsing. When inferring a schema, it implicitly
* adds a `columnNameOfCorruptRecord` field in an output schema.</li>
* <li>`DROPMALFORMED` : ignores the whole corrupted records.</li>
* <li>`FAILFAST` : throws an exception when it meets corrupted records.</li>
* </ul>
* </li>
* <li>`columnNameOfCorruptRecord` (default is the value specified in

@HyukjinKwon
Member

ping @niofire

@srowen
Member

srowen commented Oct 4, 2018

This is fine, although unfortunately there are lots of instances of this type of formatting error. We don't need to fix them all, but what about looking for similar instances in Column and Dataset at least? I can see a few.

@srowen
Member

srowen commented Oct 9, 2018

Ping @niofire to update or close

@niofire
Author

niofire commented Oct 10, 2018

@srowen Guessing you're referring to some other files? Could you link?

@SparkQA

SparkQA commented Oct 10, 2018

Test build #97204 has finished for PR 22593 at commit f356cf3.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 10, 2018

Have a look at Column.scala and Dataset.scala in org.apache.spark.sql. But, on a second look, this is how I see the lists render:

[screenshot of the rendered scaladoc]

That already seems fine. Are you looking at javadoc for these classes? scaladoc looks OK.

@niofire
Author

niofire commented Oct 10, 2018

From https://spark.apache.org/docs/2.3.2/api/java/org/apache/spark/sql/streaming/DataStreamWriter.html
[screenshot of the rendered Javadoc]
I'm guessing this refers to the Java API. Is there some Java source file that contains those comments, or are the Scala comments mirrored into the Java API documentation?

@srowen
Member

srowen commented Oct 10, 2018

Although it's the Scala API, it's callable from Java just as well. There's no Java-specific API here. So, yeah, it actually makes sense to have both javadoc and scaladoc for this. And I think there are probably many markdown-like things that work in scaladoc but not in javadoc. While you're welcome to fix all of it, ideally, it may be way too much. OK, I'd say go ahead, but at least fix all the instances of bullet lists in the .sql package. It's easy enough to search for something like \*\s+- as a regex to spot instances.
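The regex search suggested above can be sketched as a tiny standalone check; the file scanning is omitted, and the class name and sample lines are illustrative:

```java
import java.util.List;
import java.util.regex.Pattern;

// A small sketch of the search suggested above: flag doc-comment lines that
// use a markdown-style hyphen bullet, which scaladoc renders as a list but
// javadoc flattens into prose.
public class HyphenBulletFinder {
    // The suggested pattern: a comment '*', whitespace, then a hyphen bullet.
    static final Pattern BULLET = Pattern.compile("\\*\\s+- ");

    static boolean isHyphenBullet(String line) {
        return BULLET.matcher(line).find();
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            " * - `OutputMode.Append()`: only the new rows in the streaming",
            " * <li>`OutputMode.Append()`: only the new rows</li>",
            " * Specifies how data of a streaming DataFrame is written.");
        // Prints only the first line, the one still using a hyphen bullet.
        for (String line : sample) {
            if (isHyphenBullet(line)) {
                System.out.println(line.trim());
            }
        }
    }
}
```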

@SparkQA

SparkQA commented Oct 10, 2018

Test build #97205 has finished for PR 22593 at commit 0620ede.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@niofire
Author

niofire commented Oct 11, 2018

Seems like this wasn't able to build due to a transient CI error.

@niofire
Author

niofire commented Oct 11, 2018

org.apache.spark.sql.hive.client.HiveClientSuites.(It is not a test it is a sbt.testing.SuiteSelector)
-> Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97268 has finished for PR 22593 at commit 37b4684.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 11, 2018

Hm, that's a weird error. Big javadoc failures from unrelated classes. This looks like errors you get when you run javadoc on translated Scala classes. No idea why it's just popped up. Let me run again.

[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:159: error: cannot find symbol
[error]   static private  org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassReflection reflect ()  { throw new RuntimeException(); }
[error]                                                                    ^
[error]   symbol:   class ObjectStreamClassReflection
[error]   location: class SerializationDebugger
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:22: error: class SerializationDebugger is already defined in package org.apache.spark.serializer
[error]   static private  class SerializationDebugger {
[error]                   ^
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/util/UtilsSuite.java:4: error: modifier static not allowed here
[error]     static public  class MalformedClass {
[error]                    ^

@SparkQA

SparkQA commented Oct 11, 2018

Test build #4371 has finished for PR 22593 at commit 37b4684.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 11, 2018

Huh, I'm really confused why this would fail, or at least, start failing right now. We use these HTML tags elsewhere. You could try updating the unidoc plugin version, but I think it's a genjavadoc issue if anything. But we're on the latest genjavadoc. Really confused!

@niofire
Author

niofire commented Oct 11, 2018

Is this happening for other PRs as well?

@HyukjinKwon
Member

HyukjinKwon commented Oct 11, 2018

Usually no. I've only seen this once before, in Tdas's PR; at that time, the order of methods in the change was related.


/**
* <ul>
* Specifies the behavior when data or table already exists. Options include:
Member


This one looks wrongly formatted, btw. Looks like the `<ul>` should be below this line.

Author


Crap yeah, will move it down
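The agreed fix is simply to move the `<ul>` below the introductory sentence. A minimal sketch of the corrected comment (the class name is hypothetical and the option wording is abridged, not the exact Spark source):

```java
/** A sketch only; not the actual Spark source. */
public class SaveModeDocSketch {
    /**
     * Specifies the behavior when data or table already exists. Options include:
     * <ul>
     * <li>`SaveMode.Overwrite`: overwrite the existing data.</li>
     * <li>`SaveMode.Append`: append the data.</li>
     * <li>`SaveMode.Ignore`: ignore the operation (i.e. no-op).</li>
     * <li>`SaveMode.ErrorIfExists`: throw an exception at runtime.</li>
     * </ul>
     */
    void mode() {}
}
```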

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97265 has finished for PR 22593 at commit 0620ede.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Also, let's mention that this PR targets fixing the javadoc in the PR description and/or title.

@niofire niofire changed the title [Streaming][DOC] Fix typo & format in DataStreamWriter.scala [Streaming][DOC] Fix typo & formatting for JavaDoc Oct 12, 2018
@SparkQA

SparkQA commented Oct 12, 2018

Test build #97311 has finished for PR 22593 at commit d7487c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 12, 2018

Merged to master/2.4

asfgit pushed a commit that referenced this pull request Oct 12, 2018

Closes #22593 from niofire/fix-typo-datastreamwriter.

Authored-by: Mathieu St-Louis <mastloui@microsoft.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 4e141a4)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
@asfgit asfgit closed this in 4e141a4 Oct 12, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019