Table of contents
- Introduction
- Breaking Change
- plugins.sql.enabled
- plugins.sql.slowlog
- plugins.sql.cursor.keep_alive
- plugins.sql.pagination.api
- plugins.query.size_limit
- plugins.query.memory_limit
- plugins.sql.delete.enabled
- plugins.query.executionengine.spark.session.limit
- plugins.query.executionengine.spark.refresh_job.limit
- plugins.query.datasources.limit
- plugins.query.executionengine.spark.session_inactivity_timeout_millis
- plugins.query.executionengine.spark.auto_index_management.enabled
- plugins.query.executionengine.spark.session.index.ttl
- plugins.query.executionengine.spark.result.index.ttl
- plugins.query.executionengine.async_query.enabled
- plugins.query.executionengine.async_query.external_scheduler.enabled
- plugins.query.executionengine.async_query.external_scheduler.interval
- plugins.query.executionengine.spark.streamingjobs.housekeeper.interval
- plugins.query.datasources.enabled
- plugins.query.field_type_tolerance
When OpenSearch bootstraps, the SQL plugin registers a few settings in the OpenSearch cluster settings. Most of them can be changed dynamically, so you can control the behavior of the SQL plugin without restarting your cluster. You can update the settings by sending requests to either the _cluster/settings or the _plugins/_query/settings endpoint; the examples below use both.
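Because both endpoints take the same kind of request body, the update call can be wrapped in a small shell helper. The following is only a sketch: the function names settings_payload and put_setting and the OPENSEARCH_URL variable are assumptions made for illustration, not part of the plugin, and the value is always sent as a JSON string.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the SQL plugin) for updating one setting.
OPENSEARCH_URL="${OPENSEARCH_URL:-localhost:9200}"

# Build the request body. $1 is the scope ("transient" or "persistent"),
# $2 the setting name, $3 the new value.
settings_payload() {
  printf '{"%s":{"%s":"%s"}}' "$1" "$2" "$3"
}

# Send the body to the chosen endpoint path, for example
# _plugins/_query/settings or _cluster/settings.
put_setting() {
  endpoint="$1"; shift
  curl -sS -H 'Content-Type: application/json' -X PUT \
    "$OPENSEARCH_URL/$endpoint" -d "$(settings_payload "$@")"
}

# Dry run: print the body without contacting a cluster.
settings_payload transient plugins.sql.enabled false
echo
```

With a cluster running, a call such as put_setting _plugins/_query/settings transient plugins.sql.enabled false would send the printed body.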
The opendistro.sql.engine.new.enabled setting is deprecated and will be removed. From OpenSearch 1.0, the new engine is always enabled.
The opendistro.sql.query.analysis.enabled setting is deprecated and will be removed. From OpenSearch 1.0, query analysis in the legacy engine is disabled.
The opendistro.sql.query.analysis.semantic.suggestion setting is deprecated and will be removed. From OpenSearch 1.0, the query analysis suggestion in the legacy engine is disabled.
The opendistro.sql.query.analysis.semantic.threshold setting is deprecated and will be removed. From OpenSearch 1.0, the query analysis threshold in the legacy engine is disabled.
The opendistro.sql.query.response.format setting is deprecated and will be removed. From OpenSearch 1.0, the query response format defaults to JDBC. `You can change the format by using query parameters <../interfaces/protocol.rst>`_.
The opendistro.sql.cursor.enabled setting is deprecated and will be removed. From OpenSearch 1.0, the cursor feature is enabled by default.
The opendistro.sql.cursor.fetch_size setting is deprecated and will be removed. From OpenSearch 1.0, the fetch_size in the query body decides whether a cursor context is created. No cursor is created if fetch_size = 0.
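To illustrate the fetch_size behavior, the sketch below builds a query body that asks for pages of five rows; with a fetch_size of 0 no cursor would be created. The helper name sql_request_body and the accounts index are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: build a _plugins/_sql request body with a fetch_size.
# $1 is the SQL text, $2 the page size (0 means no cursor is created).
sql_request_body() {
  printf '{"query":"%s","fetch_size":%d}' "$1" "$2"
}

# Dry run: print the body that would be posted.
sql_request_body 'SELECT * FROM accounts' 5
echo

# To send it to a running cluster (assumed to listen on localhost:9200):
#   curl -sS -H 'Content-Type: application/json' -X POST \
#     localhost:9200/_plugins/_sql \
#     -d "$(sql_request_body 'SELECT * FROM accounts' 5)"
```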
You can disable the SQL plugin so that it rejects all incoming requests.
- The default value is true.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.sql.enabled" : "false"
}
}'
Result set:
{
"acknowledged" : true,
"persistent" : { },
"transient" : {
"plugins" : {
"sql" : {
"enabled" : "false"
}
}
}
}
Note: the legacy setting opendistro.sql.enabled is deprecated. If you request an update with the legacy name, it falls back to the new setting.
After the setting is updated, a query returns a result like this:
SQL query:
>> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{
"query" : "SELECT * FROM accounts"
}'
Result set:
{
"error" : {
"reason" : "Invalid SQL query",
"details" : "Either plugins.sql.enabled or rest.action.multi.allow_explicit_index setting is false",
"type" : "SQLFeatureDisabledException"
},
"status" : 400
}
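Once disabled, the plugin keeps rejecting queries until the setting is changed again. One way to return to the default is to clear the transient value. The sketch below relies on the general cluster settings convention that assigning null removes an explicitly set value; that this plugin setting follows the convention is an assumption, as is the helper name reset_payload.

```shell
#!/bin/sh
# Hypothetical helper: body that clears an explicitly set transient value,
# so the setting named in $1 reverts to its default.
reset_payload() {
  printf '{"transient":{"%s":null}}' "$1"
}

# Dry run: print the body without contacting a cluster.
reset_payload plugins.sql.enabled
echo

# To apply it (cluster assumed at localhost:9200):
#   curl -sS -H 'Content-Type: application/json' -X PUT \
#     localhost:9200/_cluster/settings -d "$(reset_payload plugins.sql.enabled)"
```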
You can configure the time limit (in seconds) above which a query is treated as slow and logged as 'Slow query: elapsed=xxx (ms)' in opensearch.log.
- The default value is 2.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.query.slowlog" : "10"
}
}'
Result set:
{
"acknowledged" : true,
"persistent" : { },
"transient" : {
"plugins" : {
"query" : {
"slowlog" : "10"
}
}
}
}
Note: the legacy setting opendistro.sql.slowlog is deprecated. If you request an update with the legacy name, it falls back to the new setting.
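Since slow queries are logged in the form 'Slow query: elapsed=xxx (ms)', the elapsed times can be pulled out of the log with a short filter. This is a sketch under assumptions: the function name slow_query_elapsed is made up, the sample line is invented for illustration, and real log lines will carry additional fields around the message.

```shell
#!/bin/sh
# Print the elapsed time in milliseconds for every slow-query line on stdin.
slow_query_elapsed() {
  sed -n 's/.*Slow query: elapsed=\([0-9][0-9]*\) (ms).*/\1/p'
}

# Invented sample line, shaped after the message format described above.
printf '%s\n' 'Slow query: elapsed=12034 (ms)' | slow_query_elapsed
```

Against a real log the filter would be used as slow_query_elapsed < opensearch.log, with the path adjusted to wherever the node writes its log. Note that the threshold is set in seconds while the logged value is in milliseconds.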
You can set this value to indicate how long the cursor context should be kept open. Cursor contexts are resource-heavy, so use a lower value if possible.
- The default value is 1m.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.sql.cursor.keep_alive" : "5m"
}
}'
Result set:
{
"acknowledged" : true,
"persistent" : { },
"transient" : {
"plugins" : {
"sql" : {
"cursor" : {
"keep_alive" : "5m"
}
}
}
}
}
Note: the legacy setting opendistro.sql.cursor.keep_alive is deprecated. If you request an update with the legacy name, it falls back to the new setting.
This setting controls whether the SQL search queries in OpenSearch use Point-In-Time (PIT) with search_after or the traditional scroll mechanism for fetching paginated results.
- Default Value: true
- Possible Values: true or false
- When set to true, the search query in the background uses PIT with search_after instead of scroll to retrieve paginated results. The Cursor Id returned to the user will encode relevant pagination query-related information, which will be used to fetch the subsequent pages of results.
- This setting is node-level.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.sql.pagination.api" : "true"
}
}'
Result set:
{
"acknowledged" : true,
"persistent" : { },
"transient" : {
"plugins" : {
"sql" : {
"pagination" : {
"api" : "true"
}
}
}
}
}
The new engine fetches a default number of rows from an OpenSearch index, as set by this setting. The default value equals the index-level max result window (10000 by default). You can change it to any value not greater than the index-level max result window (index.max_result_window). Here is an example:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.query.size_limit" : 500
}
}'
Result set:
{
"acknowledged" : true,
"persistent" : { },
"transient" : {
"plugins" : {
"query" : {
"size_limit" : "500"
}
}
}
}
Note: the legacy setting opendistro.query.size_limit is deprecated. If you request an update with the legacy name, it falls back to the new setting.
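Because plugins.query.size_limit must not exceed index.max_result_window, it can help to check a candidate value before sending the update. The following is a minimal sketch; the function name valid_size_limit is an assumption, and the fallback of 10000 only applies when the index-level window has not been changed.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed only if the size limit in $1 does not
# exceed the max result window in $2 (10000 when $2 is omitted).
valid_size_limit() {
  limit="$1"
  max_window="${2:-10000}"
  [ "$limit" -le "$max_window" ]
}

if valid_size_limit 500; then
  echo "500: ok"
fi
if ! valid_size_limit 20000; then
  echo "20000: exceeds max result window"
fi
```

If index.max_result_window has been raised for an index, pass the actual value as the second argument, for example valid_size_limit 20000 50000.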
You can set a heap memory usage limit for the query engine. While a query is running, the engine checks whether heap memory usage is under the limit; if it is not, the current query is terminated. The default value is 85%. Here is an example:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.query.memory_limit" : "80%"
}
}'
Result set:
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"memory_limit": "80%"
}
}
}
}
Note: the legacy setting opendistro.ppl.query.memory_limit is deprecated. If you request an update with the legacy name, it falls back to the new setting.
By default, the DELETE clause is disabled. You can enable it with this setting.
- The default value is false.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings \
... -d '{"transient":{"plugins.sql.delete.enabled":"false"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"sql": {
"delete": {
"enabled": "false"
}
}
}
}
}
With the setting set to false, a DELETE query returns a result like this:
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql \
... -d '{"query" : "DELETE * FROM accounts"}'
{
"error": {
"reason": "Invalid SQL query",
"details": "DELETE clause is disabled by default and will be deprecated. Using the plugins.sql.delete.enabled setting to enable it",
"type": "SQLFeatureDisabledException"
},
"status": 400
}
By default, each cluster can have a maximum of 10 sessions running in parallel. You can increase the limit with this setting.
- The default value is 10.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.session.limit":200}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"session": {
"limit": "200"
}
}
}
}
}
}
}
Each cluster can have a maximum of 5 refresh jobs running concurrently. You can increase the limit with this setting.
- The default value is 5.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.refresh_job.limit":200}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"refresh_job": {
"limit": "200"
}
}
}
}
}
}
}
Each cluster can have a maximum of 20 datasources. You can increase the limit with this setting.
- The default value is 20.
- This setting is node scope.
- This setting can be updated dynamically.
You can update the setting with a new value like this.
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.datasources.limit":25}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"datasources": {
"limit": "25"
}
}
}
}
}
This setting determines the duration after which a session is considered stale if there has been no update. The default timeout is 3 minutes (180,000 milliseconds).
- Default Value: 180000 (milliseconds)
- Scope: Node-level
- Dynamic Update: Yes, this setting can be updated dynamically.
To change the session inactivity timeout to 10 minutes for example, use the following command:
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.session_inactivity_timeout_millis":600000}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"session_inactivity_timeout_millis": "600000"
}
}
}
}
}
}
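The setting takes its value in milliseconds, so a timeout given in minutes has to be converted first. A small sketch of that arithmetic follows; the function name minutes_to_millis is an assumption made for illustration.

```shell
#!/bin/sh
# Convert a timeout in minutes to the millisecond value the setting expects.
minutes_to_millis() {
  echo $(( $1 * 60 * 1000 ))
}

minutes_to_millis 3    # the default timeout: prints 180000
minutes_to_millis 10   # the value used in the example: prints 600000
```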
This setting controls the automatic management of request and result indices for each data source. When enabled, it deletes outdated index documents.
- Default State: Enabled (true)
- Purpose: Manages auto index management for request and result indices.
To disable auto index management, use the following command:
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.auto_index_management.enabled":false}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"auto_index_management": {
"enabled": "false"
}
}
}
}
}
}
}
This setting defines the time-to-live (TTL) for request indices when plugins.query.executionengine.spark.auto_index_management.enabled is true. By default, request indices older than 30 days are deleted.
- Default Value: 30 days
To change the TTL to 60 days for example, execute the following command:
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.session.index.ttl":"60d"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"session": {
"index": {
"ttl": "60d"
}
}
}
}
}
}
}
}
This setting specifies the TTL for result indices when plugins.query.executionengine.spark.auto_index_management.enabled is set to true. The default setting is to delete result indices older than 60 days.
- Default Value: 60 days
To modify the TTL to 30 days for example, use this command:
SQL query:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.result.index.ttl":"30d"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"result": {
"index": {
"ttl": "30d"
}
}
}
}
}
}
}
}
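The two TTL settings can also be changed together, since a cluster settings request may carry several keys in one body. The sketch below builds such a body; the helper name ttl_payload is an assumption, and the values 60d and 30d are simply the ones used in the two examples.

```shell
#!/bin/sh
# Hypothetical helper: one request body that sets both TTLs.
# $1 is the session (request) index TTL, $2 the result index TTL.
ttl_payload() {
  prefix="plugins.query.executionengine.spark"
  printf '{"transient":{"%s.session.index.ttl":"%s","%s.result.index.ttl":"%s"}}' \
    "$prefix" "$1" "$prefix" "$2"
}

# Dry run: print the body without contacting a cluster.
ttl_payload 60d 30d
echo

# To apply it (cluster assumed at localhost:9200):
#   curl -sS -H 'Content-Type: application/json' -X PUT \
#     localhost:9200/_cluster/settings -d "$(ttl_payload 60d 30d)"
```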
You can disable async query submission so that all incoming async query requests are rejected.
- The default value is true.
- This setting is node scope.
- This setting can be updated dynamically.
Request:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.async_query.enabled":"false"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"async_query": {
"enabled": "false"
}
}
}
}
}
}
This setting controls whether the external scheduler is enabled for async queries.
- Default Value: true
- Scope: Node-level
- Dynamic Update: Yes, this setting can be updated dynamically.
To disable the external scheduler, use the following command:
Request:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.async_query.external_scheduler.enabled":"false"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"async_query": {
"external_scheduler": {
"enabled": "false"
}
}
}
}
}
}
}
This setting defines the interval at which the external scheduler applies for auto refresh queries. It optimizes Spark applications by allowing them to automatically decide whether to use the Spark scheduler or the external scheduler.
- Default Value: None (must be explicitly set)
- Format: A string representing a time duration follows Spark CalendarInterval format (e.g.,
10 minutes
for 10 minutes,1 hour
for 1 hour).
To modify the interval to 10 minutes for example, use this command:
Request:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.async_query.external_scheduler.interval":"10 minutes"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"async_query": {
"external_scheduler": {
"interval": "10 minutes"
}
}
}
}
}
}
}
This setting specifies the interval at which the streaming job housekeeper runs to clean up streaming jobs associated with deleted and disabled data sources. The default configuration executes this cleanup every 15 minutes.
- Default Value: 15 minutes
To change the interval to 30 minutes for example, use this command:
Request:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT localhost:9200/_cluster/settings \
... -d '{"transient":{"plugins.query.executionengine.spark.streamingjobs.housekeeper.interval":"30m"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"executionengine": {
"spark": {
"streamingjobs": {
"housekeeper": {
"interval": "30m"
}
}
}
}
}
}
}
}
This setting controls whether datasources are enabled.
- The default value is true
- This setting is node scope
- This setting can be updated dynamically
Update Settings Request:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT 'localhost:9200/_cluster/settings?pretty' \
... -d '{"transient":{"plugins.query.datasources.enabled":"false"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"datasources": {
"enabled": "false"
}
}
}
}
}
When Attempting to Call Data Source APIs:
sh$ curl -sS -H 'Content-Type: application/json' -X GET 'localhost:9200/_plugins/_query/_datasources'
{
"status": 400,
"error": {
"type": "OpenSearchStatusException",
"reason": "Invalid Request",
"details": "plugins.query.datasources.enabled setting is false"
}
}
When Attempting to List Data Source:
sh$ curl -sS -H 'Content-Type: application/json' -X POST 'localhost:9200/_plugins/_ppl' \
... -d '{"query":"show datasources"}'
{
"schema": [
{
"name": "DATASOURCE_NAME",
"type": "string"
},
{
"name": "CONNECTOR_TYPE",
"type": "string"
}
],
"datarows": [],
"total": 0,
"size": 0
}
To Re-enable Data Sources:
sh$ curl -sS -H 'Content-Type: application/json' -X PUT 'localhost:9200/_cluster/settings?pretty' \
... -d '{"transient":{"plugins.query.datasources.enabled":"true"}}'
{
"acknowledged": true,
"persistent": {},
"transient": {
"plugins": {
"query": {
"datasources": {
"enabled": "true"
}
}
}
}
}
This setting controls whether arrays are preserved. If it is set to false, an array is reduced to its first non-array value at any level of nesting.
- The default value is true (preserve arrays)
- This setting is node scope
- This setting can be updated dynamically
Querying a field containing array values will return the full array values:
os> SELECT accounts FROM people;
fetched rows / total rows = 1/1
+-----------------------+
| accounts |
+-----------------------+
| [{'id': 1},{'id': 2}] |
+-----------------------+
Disable field type tolerance:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.query.field_type_tolerance" : false
}
}'
When field type tolerance is disabled, arrays are collapsed to the first non-array value:
os> SELECT accounts FROM people;
fetched rows / total rows = 1/1
+-----------+
| accounts |
+-----------+
| {'id': 1} |
+-----------+
Re-enable field type tolerance:
>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_plugins/_query/settings -d '{
"transient" : {
"plugins.query.field_type_tolerance" : true
}
}'
OpenSearch does not natively support the ARRAY data type but does allow multi-value fields implicitly. The SQL/PPL plugin adheres strictly to the data type semantics defined in index mappings. When parsing OpenSearch responses, it expects data to match the declared type and does not account for data in array format. If the plugins.query.field_type_tolerance setting is enabled, the SQL/PPL plugin will handle array datasets by returning scalar data types, allowing basic queries (e.g., SELECT * FROM tbl WHERE condition). However, using multi-value fields in expressions or functions will result in exceptions. If this setting is disabled or absent, only the first element of an array is returned, preserving the default behavior.