diff --git a/docs/configuration.md b/docs/configuration.md
index b999a6ee2577..e9dbfa2b4f03 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -523,6 +523,16 @@ of the most common options to set are:
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.driver.log.redirectConsoleOutputs</code></td>
+  <td>stdout,stderr</td>
+  <td>
+    Comma-separated list of the console output kinds of the driver that need to be redirected
+    to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
+    `spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.decommission.enabled</code></td>
   <td>false</td>
@@ -772,6 +782,16 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>1.1.0</td>
 </tr>
+<tr>
+  <td><code>spark.executor.logs.redirectConsoleOutputs</code></td>
+  <td>stdout,stderr</td>
+  <td>
+    Comma-separated list of the console output kinds of the executor that need to be redirected
+    to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
+    `spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.executor.userClassPathFirst</code></td>
   <td>false</td>
@@ -857,6 +877,47 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>1.2.0</td>
 </tr>
+<tr>
+  <td><code>spark.python.factory.idleWorkerMaxPoolSize</code></td>
+  <td>(none)</td>
+  <td>
+    Maximum number of idle Python workers to keep. If unset, the number is unbounded.
+    If set to a positive integer N, at most N idle workers are retained;
+    least-recently used workers are evicted first.
+  </td>
+  <td>4.1.0</td>
+</tr>
+<tr>
+  <td><code>spark.python.worker.killOnIdleTimeout</code></td>
+  <td>false</td>
+  <td>
+    Whether Spark should terminate the Python worker process when the idle timeout
+    (as defined by `spark.python.worker.idleTimeoutSeconds`) is reached. If enabled,
+    Spark terminates the Python worker process in addition to logging the status.
+  </td>
+  <td>4.1.0</td>
+</tr>
+<tr>
+  <td><code>spark.python.worker.tracebackDumpIntervalSeconds</code></td>
+  <td>0</td>
+  <td>
+    The interval (in seconds) at which Python workers dump their tracebacks.
+    If positive, the Python worker periodically dumps its traceback into
+    its `stderr`. The default is `0`, which means it is disabled.
+  </td>
+  <td>4.1.0</td>
+</tr>
+<tr>
+  <td><code>spark.python.unix.domain.socket.enabled</code></td>
+  <td>false</td>
+  <td>
+    When set to true, the Python driver uses a Unix domain socket for operations like
+    creating or collecting a DataFrame from local data, using accumulators, and executing
+    Python functions with PySpark such as Python UDFs. This configuration only applies
+    to Spark Classic and the Spark Connect server.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.files</code></td>
   <td></td>
@@ -873,6 +934,16 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>1.0.1</td>
 </tr>
+<tr>
+  <td><code>spark.submit.callSystemExitOnMainExit</code></td>
+  <td>false</td>
+  <td>
+    If true, SparkSubmit will call `System.exit()` to initiate JVM shutdown once the
+    user's main method has exited. This can be useful in cases where non-daemon JVM
+    threads might otherwise prevent the JVM from shutting down on its own.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.jars</code></td>
   <td></td>
@@ -1431,6 +1502,14 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.eventLog.excludedPatterns</code></td>
+  <td>(none)</td>
+  <td>
+    Specifies comma-separated event names to be excluded from the event logs.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.eventLog.dir</code></td>
   <td>file:///tmp/spark-events</td>
@@ -1905,6 +1984,15 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>3.2.0</td>
 </tr>
+<tr>
+  <td><code>spark.io.compression.zstd.strategy</code></td>
+  <td>(none)</td>
+  <td>
+    Compression strategy for the Zstd compression codec. The higher the value, the more
+    complex the compression becomes, usually resulting in stronger but slower compression or higher CPU cost.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.io.compression.zstd.workers</code></td>
   <td>0</td>
@@ -2092,6 +2180,17 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+  <td><code>spark.memory.unmanagedMemoryPollingInterval</code></td>
+  <td>0s</td>
+  <td>
+    Interval for polling unmanaged memory users to track their memory usage.
+    Unmanaged memory users are components that manage their own memory outside of
+    Spark's core memory management, such as RocksDB for the Streaming State Store.
+    Setting this to 0 disables unmanaged memory polling.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.storage.unrollMemoryThreshold</code></td>
   <td>1024 * 1024</td>
@@ -2543,6 +2642,16 @@ Apart from these, the following properties are also available, and may be useful
   </td>
   <td>0.7.0</td>
 </tr>
+<tr>
+  <td><code>spark.driver.metrics.pollingInterval</code></td>
+  <td>10s</td>
+  <td>
+    How often to collect driver metrics (in milliseconds).
+    If unset, the polling is done at the executor heartbeat interval.
+    If set, the polling is done at this interval.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.rpc.io.backLog</code></td>
   <td>64</td>
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 49d04b328f29..e75f83110d19 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -401,6 +401,14 @@ Security options for the Spark History Server are covered more detail in the
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.history.fs.eventLog.rolling.onDemandLoadEnabled</code></td>
+  <td>true</td>
+  <td>
+    Whether to look up rolling event log locations in an on-demand manner before listing files.
+  </td>
+  <td>4.1.0</td>
+</tr>
 <tr>
   <td><code>spark.history.store.hybridStore.enabled</code></td>
   <td>false</td>
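
For illustration, a minimal sketch of how the console-redirect settings documented in the first two hunks might be wired together from application code. The config keys and the plugin class name come from the diff above; the chosen values and the `SparkConf` wiring are only an example, not the canonical usage:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Illustrative values: redirect only the driver's stdout, but both executor
// streams, into the logging system via the plugin documented above.
val conf = new SparkConf()
  .set("spark.plugins", "org.apache.spark.deploy.RedirectConsolePlugin")
  .set("spark.driver.log.redirectConsoleOutputs", "stdout")
  .set("spark.executor.logs.redirectConsoleOutputs", "stdout,stderr")

val spark = SparkSession.builder().config(conf).getOrCreate()
```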
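Likewise, a sketch of how the new Python worker lifecycle settings could be combined. `spark.python.worker.idleTimeoutSeconds` is the pre-existing config referenced in the diff; all values are illustrative:

```scala
import org.apache.spark.SparkConf

// Illustrative combination: cap the idle Python worker pool at 8, kill workers
// once the 60-second idle timeout fires, and have workers dump their
// tracebacks to stderr every 300 seconds.
val conf = new SparkConf()
  .set("spark.python.worker.idleTimeoutSeconds", "60") // pre-existing config
  .set("spark.python.worker.killOnIdleTimeout", "true")
  .set("spark.python.factory.idleWorkerMaxPoolSize", "8")
  .set("spark.python.worker.tracebackDumpIntervalSeconds", "300")
```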
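Finally, a sketch of the event-log and driver-metrics additions. `spark.eventLog.enabled` is a long-standing config, and `SparkListenerBlockUpdated` is only a placeholder event name for `spark.eventLog.excludedPatterns`:

```scala
import org.apache.spark.SparkConf

// Illustrative event-log and metrics settings from the diff above.
val conf = new SparkConf()
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.excludedPatterns", "SparkListenerBlockUpdated") // placeholder name
  .set("spark.driver.metrics.pollingInterval", "10s")
```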