Add more statement attributes to explain plan result. #14391

Merged on Jun 12, 2023 (19 commits)
233 changes: 228 additions & 5 deletions docs/querying/sql-translation.md
@@ -67,9 +67,14 @@ be translated to native.
EXPLAIN PLAN statements return:
- a `PLAN` column that contains a JSON array of native queries that Druid will run
- a `RESOURCES` column that describes the resources used in the query
-- a `ATTRIBUTES` column that describes the attributes of a query, such as the statement type and target data source
+- an `ATTRIBUTES` column that describes the attributes of the query, including:
+  - `statementType`: the SQL statement type
+  - `targetDataSource`: the target datasource in an INSERT or REPLACE statement
+  - `partitionedBy`: the time-based partitioning granularity in an INSERT or REPLACE statement
+  - `clusteredBy`: the clustering columns in an INSERT or REPLACE statement
+  - `replaceTimeChunks`: the time chunks in a REPLACE statement

-For example, consider the following query:
+Example 1: EXPLAIN PLAN for a `SELECT` query on the `wikipedia` datasource:

```sql
EXPLAIN PLAN FOR
@@ -81,7 +86,7 @@ WHERE channel IN (SELECT page FROM wikipedia GROUP BY page ORDER BY COUNT(*) DES
GROUP BY channel
```

-The EXPLAIN PLAN statement returns the following result with plan, resources, and attributes information in it:
+The above EXPLAIN PLAN query returns the following result:

```json
[
@@ -215,8 +220,226 @@ The EXPLAIN PLAN statement returns the following result with plan, resources, an
}
],
{
-"statementType": "SELECT",
-"targetDataSource": null
+"statementType": "SELECT"
}
]
```

Example 2: EXPLAIN PLAN for a `REPLACE` query that replaces all the data in the `wikipedia` datasource:

```sql
EXPLAIN PLAN FOR
REPLACE INTO wikipedia
OVERWRITE ALL
SELECT
TIME_PARSE("timestamp") AS __time,
namespace,
cityName,
countryName,
regionIsoCode,
metroCode,
countryIsoCode,
regionName
FROM TABLE(
EXTERN(
'{"type":"http","uris":["https://druid.apache.org/data/wikipedia.json.gz"]}',
'{"type":"json"}',
'[{"name":"timestamp","type":"string"},{"name":"namespace","type":"string"},{"name":"cityName","type":"string"},{"name":"countryName","type":"string"},{"name":"regionIsoCode","type":"string"},{"name":"metroCode","type":"long"},{"name":"countryIsoCode","type":"string"},{"name":"regionName","type":"string"}]'
)
)
PARTITIONED BY HOUR
CLUSTERED BY cityName
```

The above EXPLAIN PLAN query returns the following result:

```json
[
[
{
"query": {
"queryType": "scan",
"dataSource": {
"type": "external",
"inputSource": {
"type": "http",
"uris": [
"https://druid.apache.org/data/wikipedia.json.gz"
]
},
"inputFormat": {
"type": "json",
"keepNullColumns": false,
"assumeNewlineDelimited": false,
"useJsonNodeReader": false
},
"signature": [
{
"name": "timestamp",
"type": "STRING"
},
{
"name": "namespace",
"type": "STRING"
},
{
"name": "cityName",
"type": "STRING"
},
{
"name": "countryName",
"type": "STRING"
},
{
"name": "regionIsoCode",
"type": "STRING"
},
{
"name": "metroCode",
"type": "LONG"
},
{
"name": "countryIsoCode",
"type": "STRING"
},
{
"name": "regionName",
"type": "STRING"
}
]
},
"intervals": {
"type": "intervals",
"intervals": [
"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
]
},
"virtualColumns": [
{
"type": "expression",
"name": "v0",
"expression": "timestamp_parse(\"timestamp\",null,'UTC')",
"outputType": "LONG"
}
],
"resultFormat": "compactedList",
"orderBy": [
{
"columnName": "cityName",
"order": "ascending"
}
],
"columns": [
"cityName",
"countryIsoCode",
"countryName",
"metroCode",
"namespace",
"regionIsoCode",
"regionName",
"v0"
],
"legacy": false,
"context": {
"finalizeAggregations": false,
"groupByEnableMultiValueUnnesting": false,
"maxNumTasks": 5,
"queryId": "b474c0d5-a5ce-432d-be94-535ccdb7addc",
"scanSignature": "[{\"name\":\"cityName\",\"type\":\"STRING\"},{\"name\":\"countryIsoCode\",\"type\":\"STRING\"},{\"name\":\"countryName\",\"type\":\"STRING\"},{\"name\":\"metroCode\",\"type\":\"LONG\"},{\"name\":\"namespace\",\"type\":\"STRING\"},{\"name\":\"regionIsoCode\",\"type\":\"STRING\"},{\"name\":\"regionName\",\"type\":\"STRING\"},{\"name\":\"v0\",\"type\":\"LONG\"}]",
"sqlInsertSegmentGranularity": "\"HOUR\"",
"sqlQueryId": "b474c0d5-a5ce-432d-be94-535ccdb7addc",
"sqlReplaceTimeChunks": "all"
},
"granularity": {
"type": "all"
}
},
"signature": [
{
"name": "v0",
"type": "LONG"
},
{
"name": "namespace",
"type": "STRING"
},
{
"name": "cityName",
"type": "STRING"
},
{
"name": "countryName",
"type": "STRING"
},
{
"name": "regionIsoCode",
"type": "STRING"
},
{
"name": "metroCode",
"type": "LONG"
},
{
"name": "countryIsoCode",
"type": "STRING"
},
{
"name": "regionName",
"type": "STRING"
}
],
"columnMappings": [
{
"queryColumn": "v0",
"outputColumn": "__time"
},
{
"queryColumn": "namespace",
"outputColumn": "namespace"
},
{
"queryColumn": "cityName",
"outputColumn": "cityName"
},
{
"queryColumn": "countryName",
"outputColumn": "countryName"
},
{
"queryColumn": "regionIsoCode",
"outputColumn": "regionIsoCode"
},
{
"queryColumn": "metroCode",
"outputColumn": "metroCode"
},
{
"queryColumn": "countryIsoCode",
"outputColumn": "countryIsoCode"
},
{
"queryColumn": "regionName",
"outputColumn": "regionName"
}
]
}
],
[
{
"name": "EXTERNAL",
"type": "EXTERNAL"
},
{
"name": "wikipedia",
"type": "DATASOURCE"
}
],
{
"statementType": "REPLACE",
"targetDataSource": "wikipedia",
"partitionedBy": "HOUR",
"clusteredBy": "`cityName`",
"replaceTimeChunks": "'ALL'"
}
]
```
@@ -19,8 +19,11 @@

 package org.apache.druid.sql.calcite.planner;

+import com.fasterxml.jackson.annotation.JsonInclude;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlNodeList;
+import org.apache.druid.java.util.common.granularity.Granularity;

 import javax.annotation.Nullable;

@@ -34,12 +37,28 @@ public final class ExplainAttributes
   @Nullable
   private final SqlNode targetDataSource;

+  @Nullable
+  private final Granularity partitionedBy;
+
+  @Nullable
+  private final SqlNodeList clusteredBy;
+
+  @Nullable
+  private final SqlNode replaceTimeChunks;
+
   public ExplainAttributes(
       @JsonProperty("statementType") final String statementType,
-      @JsonProperty("targetDataSource") @Nullable final SqlNode targetDataSource)
+      @JsonProperty("targetDataSource") @Nullable final SqlNode targetDataSource,
+      @JsonProperty("partitionedBy") @Nullable final Granularity partitionedBy,
+      @JsonProperty("clusteredBy") @Nullable final SqlNodeList clusteredBy,
+      @JsonProperty("replaceTimeChunks") @Nullable final SqlNode replaceTimeChunks
+  )
   {
     this.statementType = statementType;
     this.targetDataSource = targetDataSource;
+    this.partitionedBy = partitionedBy;
+    this.clusteredBy = clusteredBy;
+    this.replaceTimeChunks = replaceTimeChunks;
   }

   /**
@@ -53,21 +72,61 @@ public String getStatementType()

   /**
    * @return the target datasource in a SQL statement. Returns null
-   * for SELECT/non-DML statements where there is no target datasource.
+   * for SELECT statements where there is no target datasource.
    */
   @Nullable
   @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
   public String getTargetDataSource()
   {
     return targetDataSource == null ? null : targetDataSource.toString();
   }

+  /**
+   * @return the time-based partitioning granularity specified in the <code>PARTITIONED BY</code> clause
+   * for an INSERT or REPLACE statement. Returns null for SELECT statements.
+   */
+  @Nullable
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
+  public Granularity getPartitionedBy()
+  {
+    return partitionedBy;
+  }
+
+  /**
+   * @return the clustering columns specified in the <code>CLUSTERED BY</code> clause
+   * for an INSERT or REPLACE statement. Returns null for SELECT statements.
+   */
+  @Nullable
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
+  public String getClusteredBy()
+  {
+    return clusteredBy == null ? null : clusteredBy.toString();
+  }
+
+  /**
+   * @return the time chunks specified in the <code>OVERWRITE</code> clause
+   * for a REPLACE statement. Returns null for INSERT and SELECT statements.
+   */
+  @Nullable
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
Review comment from kfaraz (Contributor), Jun 12, 2023:

Would it work to have this annotation just once at the class level instead?

Reply from the PR author (Contributor Author):

Yeah, it seems like we do a mix of method-level and class-level annotations in the codebase. Since we already apply @Nullable consistently to each method in this class, I just stuck with method-level annotations.

+  public String getReplaceTimeChunks()
+  {
+    return replaceTimeChunks == null ? null : replaceTimeChunks.toString();
+  }
+
   @Override
   public String toString()
   {
     return "ExplainAttributes{" +
            "statementType='" + statementType + '\'' +
            ", targetDataSource=" + targetDataSource +
+           ", partitionedBy=" + partitionedBy +
+           ", clusteredBy=" + clusteredBy +
+           ", replaceTimeChunks=" + replaceTimeChunks +
            '}';
   }
 }
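
The effect of the `@JsonInclude(JsonInclude.Include.NON_NULL)` annotations on the getters above can be illustrated with a minimal plain-Java sketch. This is not Druid's implementation: `ExplainAttributesSketch` is a hypothetical class, plain strings stand in for `SqlNode`, `SqlNodeList`, and `Granularity`, and a hand-rolled `toJson()` that performs no escaping stands in for Jackson serialization.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for ExplainAttributes, used only to illustrate null omission.
final class ExplainAttributesSketch
{
  // LinkedHashMap keeps the attributes in insertion order.
  private final Map<String, String> attributes = new LinkedHashMap<>();

  ExplainAttributesSketch(
      String statementType,
      String targetDataSource,
      String partitionedBy,
      String clusteredBy,
      String replaceTimeChunks
  )
  {
    putIfPresent("statementType", statementType);
    putIfPresent("targetDataSource", targetDataSource);
    putIfPresent("partitionedBy", partitionedBy);
    putIfPresent("clusteredBy", clusteredBy);
    putIfPresent("replaceTimeChunks", replaceTimeChunks);
  }

  // Mimics the NON_NULL inclusion rule: a null attribute is left out entirely.
  private void putIfPresent(String name, String value)
  {
    if (value != null) {
      attributes.put(name, value);
    }
  }

  // Simplified rendering; assumes values contain no double quotes or backslashes.
  String toJson()
  {
    StringJoiner json = new StringJoiner(",", "{", "}");
    for (Map.Entry<String, String> entry : attributes.entrySet()) {
      json.add("\"" + entry.getKey() + "\":\"" + entry.getValue() + "\"");
    }
    return json.toString();
  }

  public static void main(String[] args)
  {
    // Prints {"statementType":"SELECT"}
    System.out.println(new ExplainAttributesSketch("SELECT", null, null, null, null).toJson());
    // Prints all five attributes for a REPLACE statement.
    System.out.println(
        new ExplainAttributesSketch("REPLACE", "wikipedia", "HOUR", "`cityName`", "'ALL'").toJson()
    );
  }
}
```

With this rule, a SELECT statement no longer reports `"targetDataSource": null`, which matches the updated Example 1 output in the documentation change.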
@@ -279,7 +279,10 @@ public ExplainAttributes explainAttributes()
   {
     return new ExplainAttributes(
         DruidSqlInsert.OPERATOR.getName(),
-        sqlNode.getTargetTable()
+        sqlNode.getTargetTable(),
+        sqlNode.getPartitionedBy(),
+        sqlNode.getClusteredBy(),
+        null
     );
   }
 }
@@ -346,7 +349,10 @@ public ExplainAttributes explainAttributes()
   {
     return new ExplainAttributes(
         DruidSqlReplace.OPERATOR.getName(),
-        sqlNode.getTargetTable()
+        sqlNode.getTargetTable(),
+        sqlNode.getPartitionedBy(),
+        sqlNode.getClusteredBy(),
+        sqlNode.getReplaceTimeQuery()
     );
   }
 }
@@ -236,6 +236,9 @@ public ExplainAttributes explainAttributes()
   {
     return new ExplainAttributes(
         "SELECT",
+        null,
+        null,
+        null,
         null
     );
   }
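
Read together, the three `explainAttributes()` changes above imply which attributes can be non-null for each statement type. The following is a rough sketch of that mapping, not code from the PR: `ExplainAttributeMatrix` and `possibleAttributes` are hypothetical names, and it assumes that `DruidSqlInsert.OPERATOR.getName()` yields `"INSERT"` (the documentation example confirms only `"REPLACE"` and `"SELECT"`).

```java
import java.util.List;

// Hypothetical helper summarizing the constructor arguments passed by each handler.
final class ExplainAttributeMatrix
{
  // Returns the attribute names that may be non-null for the given statement type.
  static List<String> possibleAttributes(String statementType)
  {
    switch (statementType) {
      case "SELECT":
        // The SELECT handler passes null for every optional attribute.
        return List.of("statementType");
      case "INSERT":
        // The INSERT handler always passes null for replaceTimeChunks.
        return List.of("statementType", "targetDataSource", "partitionedBy", "clusteredBy");
      case "REPLACE":
        // The REPLACE handler additionally passes the replace time query.
        return List.of(
            "statementType", "targetDataSource", "partitionedBy", "clusteredBy", "replaceTimeChunks"
        );
      default:
        throw new IllegalArgumentException("Unknown statement type: " + statementType);
    }
  }

  public static void main(String[] args)
  {
    for (String type : List.of("SELECT", "INSERT", "REPLACE")) {
      System.out.println(type + " -> " + possibleAttributes(type));
    }
  }
}
```

An attribute listed here can still be absent from a given result when the underlying value is null, for example an INSERT without a `CLUSTERED BY` clause, since null attributes are omitted from the output.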