
Conversation


@niofire niofire commented Sep 29, 2018

What changes were proposed in this pull request?

  • Fixed a typo in the outputMode documentation: in OutputMode.Complete(), changed `these is some updates` to `there are some updates`.
  • Replaced the hyphenated lists with HTML unordered list tags in the comments to fix the Javadoc documentation.

Current render from most recent Spark API Docs:

outputMode(OutputMode) - list rendered as prose:

[screenshot of the rendered Javadoc]

outputMode(String) - list rendered as prose:

[screenshot of the rendered Javadoc]

partitionBy(String*) - list rendered as prose:

[screenshot of the rendered Javadoc]

How was this patch tested?

This is a documentation-only patch, so no functional testing is required.
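For illustration, a minimal sketch of the before/after comment formatting (the class and method names are hypothetical, and the wording is abridged from the Spark comments). Javadoc does not recognize markdown-style hyphen bullets, so the "before" form renders as one run of prose, while the HTML form renders as a bulleted list in both javadoc and scaladoc:

```java
/** A sketch only; not the actual Spark source. */
public class OutputModeDocSketch {
    /**
     * Before: hyphen bullets, which javadoc flattens into prose.
     *  - `OutputMode.Append()`: only the new rows in the streaming
     *    DataFrame/Dataset will be written to the sink.
     *  - `OutputMode.Complete()`: all the rows in the streaming
     *    DataFrame/Dataset will be written every time there are some updates.
     */
    void before() {}

    /**
     * After: HTML list tags, rendered as a bulleted list by both tools.
     * <ul>
     * <li>`OutputMode.Append()`: only the new rows in the streaming
     * DataFrame/Dataset will be written to the sink.</li>
     * <li>`OutputMode.Complete()`: all the rows in the streaming
     * DataFrame/Dataset will be written every time there are some updates.</li>
     * </ul>
     */
    void after() {}
}
```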

@niofire niofire changed the title Fix typo & format in DataStreamWriter.scala [SQL][DOC] Fix typo & format in DataStreamWriter.scala Sep 29, 2018
@niofire niofire changed the title [SQL][DOC] Fix typo & format in DataStreamWriter.scala [Streaming][DOC] Fix typo & format in DataStreamWriter.scala Sep 29, 2018
@HyukjinKwon
Member

ok to test

@SparkQA

SparkQA commented Sep 30, 2018

Test build #96805 has finished for PR 22593 at commit c1a89f0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* - `OutputMode.Update()`: only the rows that were updated in the streaming DataFrame/Dataset
* <ul>
* <li> `OutputMode.Append()`: only the new rows in the streaming DataFrame/Dataset will be
* written to the sink.</li>
Member


I would just format this similarly with

* <ul>
* <li>`primitivesAsString` (default `false`): infers all primitive values as a string type</li>
* <li>`prefersDecimal` (default `false`): infers all floating-point values as a decimal
* type. If the values do not fit in decimal, then it infers them as doubles.</li>
* <li>`allowComments` (default `false`): ignores Java/C++ style comment in JSON records</li>
* <li>`allowUnquotedFieldNames` (default `false`): allows unquoted JSON field names</li>
* <li>`allowSingleQuotes` (default `true`): allows single quotes in addition to double quotes
* </li>
* <li>`allowNumericLeadingZeros` (default `false`): allows leading zeros in numbers
* (e.g. 00012)</li>
* <li>`allowBackslashEscapingAnyCharacter` (default `false`): allows accepting quoting of all
* character using backslash quoting mechanism</li>
* <li>`allowUnquotedControlChars` (default `false`): allows JSON Strings to contain unquoted
* control characters (ASCII characters with value less than 32, including tab and line feed
* characters) or not.</li>
* <li>`mode` (default `PERMISSIVE`): allows a mode for dealing with corrupt records
* during parsing.
* <ul>
* <li>`PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a
* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To
* keep corrupt records, an user can set a string type field named
* `columnNameOfCorruptRecord` in an user-defined schema. If a schema does not have the
* field, it drops corrupt records during parsing. When inferring a schema, it implicitly
* adds a `columnNameOfCorruptRecord` field in an output schema.</li>
* <li>`DROPMALFORMED` : ignores the whole corrupted records.</li>
* <li>`FAILFAST` : throws an exception when it meets corrupted records.</li>
* </ul>
* </li>
* <li>`columnNameOfCorruptRecord` (default is the value specified in

@HyukjinKwon
Member

ping @niofire

@srowen
Member

srowen commented Oct 4, 2018

This is fine, although unfortunately there are lots of instances of this type of formatting error. We don't need to fix them all, but what about looking for similar instances in Column and Dataset at least? I can see a few.

@srowen
Member

srowen commented Oct 9, 2018

Ping @niofire to update or close

@niofire
Author

niofire commented Oct 10, 2018

@srowen Guessing you're referring to some other files? Could you link?

@SparkQA

SparkQA commented Oct 10, 2018

Test build #97204 has finished for PR 22593 at commit f356cf3.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 10, 2018

Have a look at Column.scala and Dataset.scala in org.apache.spark.sql. But, on a second look, this is how I see the lists render:

[screenshot of the rendered scaladoc]

That already seems fine. Are you looking at javadoc for these classes? scaladoc looks OK.

@niofire
Author

niofire commented Oct 10, 2018

From https://spark.apache.org/docs/2.3.2/api/java/org/apache/spark/sql/streaming/DataStreamWriter.html
[screenshot of the rendered Javadoc]
I'm guessing this refers to the Java API. Is there some Java source file that contains those comments, or are the Scala comments mirrored into the Java API documentation?

@srowen
Member

srowen commented Oct 10, 2018

Although it's the Scala API, it's callable from Java just as well. There's no Java-specific API here. So, yeah, it actually makes sense to have both javadoc and scaladoc for this. And I think there are probably many markdown-like things that work in scaladoc but not in javadoc. While you're welcome to fix all of it, ideally, it may be way too much. OK, I'd say go ahead, but at least fix all the instances of bullet lists in the .sql package. It's easy enough to search for something like \*\s+- as a regex to spot instances.
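The regex search suggested above can be sketched as a tiny standalone check; the file scanning is omitted, and the class name and sample lines are illustrative:

```java
import java.util.List;
import java.util.regex.Pattern;

// A small sketch of the search suggested above: flag doc-comment lines that
// use a markdown-style hyphen bullet, which scaladoc renders as a list but
// javadoc flattens into prose.
public class HyphenBulletFinder {
    // The suggested pattern: a comment '*', whitespace, then a hyphen bullet.
    static final Pattern BULLET = Pattern.compile("\\*\\s+- ");

    static boolean isHyphenBullet(String line) {
        return BULLET.matcher(line).find();
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            " * - `OutputMode.Append()`: only the new rows in the streaming",
            " * <li>`OutputMode.Append()`: only the new rows</li>",
            " * Specifies how data of a streaming DataFrame is written.");
        // Prints only the first line, the one still using a hyphen bullet.
        for (String line : sample) {
            if (isHyphenBullet(line)) {
                System.out.println(line.trim());
            }
        }
    }
}
```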

@SparkQA

SparkQA commented Oct 10, 2018

Test build #97205 has finished for PR 22593 at commit 0620ede.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@niofire
Author

niofire commented Oct 11, 2018

Seems like this wasn't able to build due to a transient CI error.

@niofire
Author

niofire commented Oct 11, 2018

org.apache.spark.sql.hive.client.HiveClientSuites.(It is not a test it is a sbt.testing.SuiteSelector)
-> Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97268 has finished for PR 22593 at commit 37b4684.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 11, 2018

Hm, that's a weird error. Big javadoc failures from unrelated classes. This looks like errors you get when you run javadoc on translated Scala classes. No idea why it's just popped up. Let me run again.

[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:159: error: cannot find symbol
[error]   static private  org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassReflection reflect ()  { throw new RuntimeException(); }
[error]                                                                    ^
[error]   symbol:   class ObjectStreamClassReflection
[error]   location: class SerializationDebugger
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/serializer/SerializationDebugger.java:22: error: class SerializationDebugger is already defined in package org.apache.spark.serializer
[error]   static private  class SerializationDebugger {
[error]                   ^
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/target/java/org/apache/spark/util/UtilsSuite.java:4: error: modifier static not allowed here
[error]     static public  class MalformedClass {
[error]                    ^

@SparkQA

SparkQA commented Oct 11, 2018

Test build #4371 has finished for PR 22593 at commit 37b4684.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 11, 2018

Huh, I'm really confused why this would fail, or at least, start failing right now. We use these HTML tags elsewhere. You could try updating the unidoc plugin version, but I think it's a genjavadoc issue if anything. But we're on the latest genjavadoc. Really confused!

@niofire
Author

niofire commented Oct 11, 2018

Is this happening for other PRs as well?

@HyukjinKwon
Member

HyukjinKwon commented Oct 11, 2018

Usually no. I've only seen this once before, in Tdas's PR; at that time, the order of methods in the change was related.


/**
* <ul>
* Specifies the behavior when data or table already exists. Options include:
Member


This one looks wrongly formatted, btw. Looks like the `<ul>` should be below this line.

Author


Crap yeah, will move it down
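The agreed fix is simply to move the `<ul>` below the introductory sentence. A minimal sketch of the corrected comment (the class name is hypothetical and the option wording is abridged, not the exact Spark source):

```java
/** A sketch only; not the actual Spark source. */
public class SaveModeDocSketch {
    /**
     * Specifies the behavior when data or table already exists. Options include:
     * <ul>
     * <li>`SaveMode.Overwrite`: overwrite the existing data.</li>
     * <li>`SaveMode.Append`: append the data.</li>
     * <li>`SaveMode.Ignore`: ignore the operation (i.e. no-op).</li>
     * <li>`SaveMode.ErrorIfExists`: throw an exception at runtime.</li>
     * </ul>
     */
    void mode() {}
}
```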

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97265 has finished for PR 22593 at commit 0620ede.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Also, let's mention that this PR targets fixing the javadoc in the PR description and/or title.

@niofire niofire changed the title [Streaming][DOC] Fix typo & format in DataStreamWriter.scala [Streaming][DOC] Fix typo & formatting for JavaDoc Oct 12, 2018
@SparkQA

SparkQA commented Oct 12, 2018

Test build #97311 has finished for PR 22593 at commit d7487c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 12, 2018

Merged to master/2.4

asfgit pushed a commit that referenced this pull request Oct 12, 2018

Closes #22593 from niofire/fix-typo-datastreamwriter.

Authored-by: Mathieu St-Louis <mastloui@microsoft.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 4e141a4)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
@asfgit asfgit closed this in 4e141a4 Oct 12, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019