109 changes: 109 additions & 0 deletions docs/configuration.md
@@ -523,6 +523,16 @@ of the most common options to set are:
</td>
<td>3.0.0</td>
</tr>
<tr>
<td><code>spark.driver.log.redirectConsoleOutputs</code></td>
<td>stdout,stderr</td>
<td>
Comma-separated list of console output kinds from the driver that should be redirected
to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
`spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.decommission.enabled</code></td>
<td>false</td>
@@ -772,6 +782,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.1.0</td>
</tr>
<tr>
<td><code>spark.executor.logs.redirectConsoleOutputs</code></td>
<td>stdout,stderr</td>
<td>
Comma-separated list of console output kinds from executors that should be redirected
to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
`spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.executor.userClassPathFirst</code></td>
<td>false</td>
@@ -857,6 +877,47 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.2.0</td>
</tr>
<tr>
<td><code>spark.python.factory.idleWorkerMaxPoolSize</code></td>
<td>(none)</td>
<td>
Maximum number of idle Python workers to keep. If unset, the number is unbounded.
If set to a positive integer N, at most N idle workers are retained;
least-recently used workers are evicted first.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.python.worker.killOnIdleTimeout</code></td>
<td>false</td>
<td>
Whether Spark should terminate the Python worker process when the idle timeout
(as defined by <code>spark.python.worker.idleTimeoutSeconds</code>) is reached. If enabled,
Spark terminates the worker process in addition to logging its idle status.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.python.worker.tracebackDumpIntervalSeconds</code></td>
<td>0</td>
<td>
The interval (in seconds) at which Python workers dump their tracebacks.
If positive, the Python worker will periodically dump its traceback into
`stderr`. The default is `0`, which means the dump is disabled.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.python.unix.domain.socket.enabled</code></td>
<td>false</td>
<td>
When set to true, the Python driver uses a Unix domain socket for operations like
creating or collecting a DataFrame from local data, using accumulators, and executing
Python functions with PySpark, such as Python UDFs. This configuration only applies
to Spark Classic and the Spark Connect server.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.files</code></td>
<td></td>
@@ -873,6 +934,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.0.1</td>
</tr>
<tr>
<td><code>spark.submit.callSystemExitOnMainExit</code></td>
<td>false</td>
<td>
If true, SparkSubmit will call System.exit() to initiate JVM shutdown once the
user's main method has exited. This can be useful in cases where non-daemon JVM
threads might otherwise prevent the JVM from shutting down on its own.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.jars</code></td>
<td></td>
@@ -1431,6 +1502,14 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>3.0.0</td>
</tr>
<tr>
<td><code>spark.eventLog.excludedPatterns</code></td>
<td>(none)</td>
<td>
Specifies comma-separated event names to be excluded from the event logs.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.eventLog.dir</code></td>
<td>file:///tmp/spark-events</td>
@@ -1905,6 +1984,15 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>3.2.0</td>
</tr>
<tr>
<td><code>spark.io.compression.zstd.strategy</code></td>
<td>(none)</td>
<td>
Compression strategy for the Zstd compression codec. The higher the value, the more
complex the strategy, usually resulting in stronger but slower compression or higher CPU cost.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.io.compression.zstd.workers</code></td>
<td>0</td>
@@ -2092,6 +2180,17 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.6.0</td>
</tr>
<tr>
<td><code>spark.memory.unmanagedMemoryPollingInterval</code></td>
<td>0s</td>
<td>
Interval for polling unmanaged memory users to track their memory usage.
Unmanaged memory users are components that manage their own memory outside of
Spark's core memory management, such as RocksDB for Streaming State Store.
Setting this to 0 disables unmanaged memory polling.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.storage.unrollMemoryThreshold</code></td>
<td>1024 * 1024</td>
@@ -2543,6 +2642,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>0.7.0</td>
</tr>
<tr>
<td><code>spark.driver.metrics.pollingInterval</code></td>
<td>10s</td>
<td>
How often to collect driver metrics (in milliseconds).
If unset, polling falls back to the executor heartbeat interval;
if set, polling is done at this interval.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td><code>spark.rpc.io.backLog</code></td>
<td>64</td>
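The two `redirectConsoleOutputs` entries above only take effect together with `spark.plugins=org.apache.spark.deploy.RedirectConsolePlugin`, so they are typically set as a pair. A minimal PySpark sketch of that pairing follows; the application name and the choice to redirect only `stderr` on executors are illustrative assumptions, not part of this change.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("console-redirect-sketch")  # hypothetical app name
    # The redirect options below only take effect with this plugin enabled.
    .config("spark.plugins", "org.apache.spark.deploy.RedirectConsolePlugin")
    # Redirect both driver streams, but only stderr on executors.
    .config("spark.driver.log.redirectConsoleOutputs", "stdout,stderr")
    .config("spark.executor.logs.redirectConsoleOutputs", "stderr")
    .getOrCreate()
)

# Driver-side console output should now be routed through Spark's logging system.
print("hello from the driver")
```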
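The new Python worker options above build on the existing `spark.python.worker.idleTimeoutSeconds` setting. A hedged sketch of how they might be combined, with purely illustrative values:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("python-worker-tuning-sketch")  # hypothetical app name
    # Keep at most 4 idle Python workers; least-recently used workers are evicted first.
    .config("spark.python.factory.idleWorkerMaxPoolSize", "4")
    # Terminate (not just log) a worker once it has been idle for the timeout below.
    .config("spark.python.worker.idleTimeoutSeconds", "120")
    .config("spark.python.worker.killOnIdleTimeout", "true")
    # Dump Python worker tracebacks to stderr every 5 minutes (0 disables the dumps).
    .config("spark.python.worker.tracebackDumpIntervalSeconds", "300")
    .getOrCreate()
)
```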
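The remaining application-level additions (event-log exclusion, unmanaged-memory polling, driver-metrics polling) are set like any other Spark conf. The values and event names below are illustrative assumptions only:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("misc-conf-sketch")  # hypothetical app name
    # Exclude some event names from the event log (names here are illustrative).
    .config("spark.eventLog.enabled", "true")
    .config("spark.eventLog.excludedPatterns",
            "SparkListenerBlockUpdated,SparkListenerUnpersistRDD")
    # Poll unmanaged memory users (e.g. RocksDB state stores) every 10 seconds.
    .config("spark.memory.unmanagedMemoryPollingInterval", "10s")
    # Collect driver metrics every 5 seconds rather than at the heartbeat interval.
    .config("spark.driver.metrics.pollingInterval", "5s")
    .getOrCreate()
)
```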
8 changes: 8 additions & 0 deletions docs/monitoring.md
@@ -401,6 +401,14 @@ Security options for the Spark History Server are covered in more detail in the
</td>
<td>3.0.0</td>
</tr>
<tr>
<td>spark.history.fs.eventLog.rolling.onDemandLoadEnabled</td>
<td>true</td>
<td>
Whether to look up rolling event log locations in an on-demand manner before listing files.
</td>
<td>4.1.0</td>
</tr>
<tr>
<td>spark.history.store.hybridStore.enabled</td>
<td>false</td>