
Commit f4d9843 (1 parent: 3316616)

Resolved comments

4 files changed: +10 -32 lines

docs/sql-data-sources-json.md

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -116,12 +116,6 @@ Data source options of JSON can be set via:
116116
</td>
117117
<td>read/write</td>
118118
</tr>
119-
<tr>
120-
<td><code>maxFilesPerTrigger</code></td>
121-
<td>None</td>
122-
<td>Sets the maximum number of new files to be considered in every trigger.</td>
123-
<td>read</td>
124-
</tr>
125119
<tr>
126120
<td><code>primitivesAsString</code></td>
127121
<td>None</td>

python/pyspark/sql/readwriter.py

Lines changed: 0 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -108,35 +108,13 @@ def schema(self, schema):
108108
@since(1.5)
109109
def option(self, key, value):
110110
"""Adds an input option for the underlying data source.
111-
112-
You can set the following option(s) for reading files:
113-
* ``pathGlobFilter``: an optional glob pattern to only include files with paths matching
114-
the pattern. The syntax follows org.apache.hadoop.fs.GlobFilter.
115-
It does not change the behavior of partition discovery.
116-
* ``modifiedBefore``: an optional timestamp to only include files with
117-
modification times occurring before the specified time. The provided timestamp
118-
must be in the following format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00)
119-
* ``modifiedAfter``: an optional timestamp to only include files with
120-
modification times occurring after the specified time. The provided timestamp
121-
must be in the following format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00)
122111
"""
123112
self._jreader = self._jreader.option(key, to_str(value))
124113
return self
125114

126115
@since(1.4)
127116
def options(self, **options):
128117
"""Adds input options for the underlying data source.
129-
130-
You can set the following option(s) for reading files:
131-
* ``pathGlobFilter``: an optional glob pattern to only include files with paths matching
132-
the pattern. The syntax follows org.apache.hadoop.fs.GlobFilter.
133-
It does not change the behavior of partition discovery.
134-
* ``modifiedBefore``: an optional timestamp to only include files with
135-
modification times occurring before the specified time. The provided timestamp
136-
must be in the following format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00)
137-
* ``modifiedAfter``: an optional timestamp to only include files with
138-
modification times occurring after the specified time. The provided timestamp
139-
must be in the following format: YYYY-MM-DDTHH:mm:ss (e.g. 2020-06-01T13:00:00)
140118
"""
141119
for k in options:
142120
self._jreader = self._jreader.option(k, to_str(options[k]))
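
For context: the options described in the removed docstring text remain usable through option()/options(); only the duplicated documentation is dropped in favor of the central Generic Files Source Options page. A minimal Scala sketch of those generic file-source options on a batch read (the app name and input path are hypothetical placeholders, not from this commit):

import org.apache.spark.sql.SparkSession

object GenericFileOptionsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("generic-file-options").getOrCreate()

    val df = spark.read
      // Only include files whose names match the glob; the syntax follows
      // org.apache.hadoop.fs.GlobFilter. Partition discovery is unaffected.
      .option("pathGlobFilter", "*.json")
      // Only include files modified after this timestamp
      // (format: YYYY-MM-DDTHH:mm:ss).
      .option("modifiedAfter", "2020-06-01T13:00:00")
      .json("/data/input")  // hypothetical path

    df.show()
    spark.stop()
  }
}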

sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -392,8 +392,8 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
392392
*
393393
* You can find the JSON-specific options for reading JSON files in
394394
* <a href="https://spark.apache.org/docs/latest/sql-data-sources-json.html#data-source-option">
395-
* Data Source Option</a>
396-
* and
395+
* Data Source Option</a> in the version you use.
396+
* More general options can be found in
397397
* <a href=
398398
* "https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html">
399399
* Generic Files Source Options</a> in the version you use.
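
The reworded link points batch readers at the JSON-specific option list. As a quick illustration, a hedged sketch using one such option, primitivesAsString (which also appears in the doc table above); the app name and path are hypothetical placeholders:

import org.apache.spark.sql.SparkSession

object JsonSpecificOptionExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("json-specific-option").getOrCreate()

    // primitivesAsString is one of the JSON-specific options on the linked
    // page: it makes the reader infer every primitive value as a string.
    val df = spark.read
      .option("primitivesAsString", "true")
      .json("/data/events")  // hypothetical path

    df.printSchema()
    spark.stop()
  }
}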

sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -218,10 +218,16 @@ final class DataStreamReader private[sql](sparkSession: SparkSession) extends Lo
218218
* This function goes through the input once to determine the input schema. If you know the
219219
* schema in advance, use the version that specifies the schema to avoid the extra scan.
220220
*
221+
* You can set the following structured streaming option(s):
222+
* <ul>
223+
* <li>`maxFilesPerTrigger` (default: no max limit): sets the maximum number of new files to be
224+
* considered in every trigger.</li>
225+
* </ul>
226+
*
221227
* You can find the JSON-specific options for reading JSON file stream in
222228
* <a href="https://spark.apache.org/docs/latest/sql-data-sources-json.html#data-source-option">
223-
* Data Source Option</a>
224-
* and
229+
* Data Source Option</a> in the version you use.
230+
* More general options can be found in
225231
* <a href=
226232
* "https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html">
227233
* Generic Files Source Options</a> in the version you use.
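
The maxFilesPerTrigger documentation removed from the batch-oriented JSON page above now lives here, on the streaming reader where the option actually applies. A minimal sketch of a streaming JSON read using it (the schema, paths, and the cap of 10 files are illustrative assumptions):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object MaxFilesPerTriggerExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("json-stream").getOrCreate()

    // Streaming file sources need an explicit schema
    // (unless spark.sql.streaming.schemaInference is enabled).
    val schema = new StructType()
      .add("id", LongType)
      .add("msg", StringType)

    val stream = spark.readStream
      .schema(schema)
      // Cap each micro-batch at 10 newly discovered files; the default,
      // per the doc change above, is no limit.
      .option("maxFilesPerTrigger", 10L)
      .json("/data/incoming")  // hypothetical directory

    stream.writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}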
