Commit

Multi-column statistics docs update; updated STATISTICS statement examples and notes
ericharmeling committed Aug 26, 2020
1 parent 687a980 commit 3a7ce05
Showing 10 changed files with 312 additions and 145 deletions.
4 changes: 2 additions & 2 deletions _includes/v19.2/misc/delete-statistics.md
@@ -5,11 +5,11 @@ To delete statistics for all tables in all databases:
> DELETE FROM system.table_statistics WHERE true;
~~~

- To delete a named set of statistics (e.g, one named "my_stats"), run a query like the following:
+ To delete a named set of statistics (e.g, one named "users_stats"), run a query like the following:

{% include copy-clipboard.html %}
~~~ sql
- > DELETE FROM system.table_statistics WHERE name = 'my_stats';
+ > DELETE FROM system.table_statistics WHERE name = 'users_stats';
~~~

After deleting statistics, restart the nodes in your cluster to clear the statistics caches.
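
Before deleting by name, it can help to confirm which statistics names are actually stored. The following is a minimal sketch, assuming your SQL user is allowed to read the `system.table_statistics` table; it relies only on the `name` column already used in the `DELETE` statement above. Automatically generated statistics appear under the name `__auto__`.

~~~ sql
-- List each distinct statistics name currently stored, so you can check
-- that the name you intend to delete (e.g., 'users_stats') exists.
SELECT DISTINCT name FROM system.table_statistics;
~~~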
4 changes: 2 additions & 2 deletions _includes/v20.1/misc/delete-statistics.md
@@ -5,11 +5,11 @@ To delete statistics for all tables in all databases:
> DELETE FROM system.table_statistics WHERE true;
~~~

- To delete a named set of statistics (e.g, one named "my_stats"), run a query like the following:
+ To delete a named set of statistics (e.g, one named "users_stats"), run a query like the following:

{% include copy-clipboard.html %}
~~~ sql
- > DELETE FROM system.table_statistics WHERE name = 'my_stats';
+ > DELETE FROM system.table_statistics WHERE name = 'users_stats';
~~~

After deleting statistics, restart the nodes in your cluster to clear the statistics caches.
4 changes: 2 additions & 2 deletions _includes/v20.2/misc/delete-statistics.md
@@ -5,11 +5,11 @@ To delete statistics for all tables in all databases:
> DELETE FROM system.table_statistics WHERE true;
~~~

- To delete a named set of statistics (e.g, one named "my_stats"), run a query like the following:
+ To delete a named set of statistics (e.g, one named "users_stats"), run a query like the following:

{% include copy-clipboard.html %}
~~~ sql
- > DELETE FROM system.table_statistics WHERE name = 'my_stats';
+ > DELETE FROM system.table_statistics WHERE name = 'users_stats';
~~~

After deleting statistics, restart the nodes in your cluster to clear the statistics caches.
99 changes: 68 additions & 31 deletions v19.2/create-statistics.md

Large diffs are not rendered by default.

24 changes: 7 additions & 17 deletions v19.2/show-statistics.md
@@ -5,6 +5,10 @@ toc: true
---
The `SHOW STATISTICS` [statement](sql-statements.html) lists [table statistics](create-statistics.html) used by the [cost-based optimizer](cost-based-optimizer.html).

+ {{site.data.alerts.callout_info}}
+ [By default, CockroachDB automatically generates statistics](cost-based-optimizer.html#table-statistics) on all indexed columns, and up to 100 non-indexed columns.
+ {{site.data.alerts.end}}

## Synopsis

<div>
@@ -23,27 +27,13 @@ Parameter | Description

## Examples

- ### List table statistics

- {% include copy-clipboard.html %}
- ~~~ sql
- > CREATE STATISTICS students ON id FROM students_by_list;
- ~~~
+ {% include {{page.version.version}}/sql/movr-statements.md %}

- ~~~
- CREATE STATISTICS
- ~~~
+ ### List table statistics

{% include copy-clipboard.html %}
~~~ sql
- > SHOW STATISTICS FOR TABLE students_by_list;
- ~~~

- ~~~
- statistics_name | column_names | created | row_count | distinct_count | null_count | histogram_id
- +-----------------+--------------+----------------------------------+-----------+----------------+------------+--------------+
- students | {"id"} | 2018-10-26 15:06:34.320165+00:00 | 0 | 0 | 0 | NULL
- (1 row)
+ > SHOW STATISTICS FOR TABLE rides;
~~~

### Delete statistics
99 changes: 68 additions & 31 deletions v20.1/create-statistics.md

Large diffs are not rendered by default.

34 changes: 20 additions & 14 deletions v20.1/show-statistics.md
@@ -5,6 +5,10 @@ toc: true
---
The `SHOW STATISTICS` [statement](sql-statements.html) lists [table statistics](create-statistics.html) used by the [cost-based optimizer](cost-based-optimizer.html).

+ {{site.data.alerts.callout_info}}
+ [By default, CockroachDB automatically generates statistics](cost-based-optimizer.html#table-statistics) on all indexed columns, and up to 100 non-indexed columns.
+ {{site.data.alerts.end}}

## Synopsis

<div>
@@ -23,27 +27,29 @@ Parameter | Description

## Examples

- ### List table statistics
+ {% include {{page.version.version}}/sql/movr-statements.md %}

- {% include copy-clipboard.html %}
- ~~~ sql
- > CREATE STATISTICS students ON id FROM students_by_list;
- ~~~

- ~~~
- CREATE STATISTICS
- ~~~
+ ### List table statistics

{% include copy-clipboard.html %}
~~~ sql
- > SHOW STATISTICS FOR TABLE students_by_list;
+ > SHOW STATISTICS FOR TABLE rides;
~~~

~~~
- statistics_name | column_names | created | row_count | distinct_count | null_count | histogram_id
- +-----------------+--------------+----------------------------------+-----------+----------------+------------+--------------+
- students | {"id"} | 2018-10-26 15:06:34.320165+00:00 | 0 | 0 | 0 | NULL
- (1 row)
+ statistics_name | column_names | created | row_count | distinct_count | null_count | histogram_id
+ ------------------+-----------------+----------------------------------+-----------+----------------+------------+---------------------
+ __auto__ | {city} | 2020-08-26 17:17:13.852138+00:00 | 500 | 9 | 0 | 584554361172525057
+ __auto__ | {vehicle_city} | 2020-08-26 17:17:13.852138+00:00 | 500 | 9 | 0 | 584554361179242497
+ __auto__ | {id} | 2020-08-26 17:17:13.852138+00:00 | 500 | 500 | 0 | NULL
+ __auto__ | {rider_id} | 2020-08-26 17:17:13.852138+00:00 | 500 | 50 | 0 | NULL
+ __auto__ | {vehicle_id} | 2020-08-26 17:17:13.852138+00:00 | 500 | 15 | 0 | NULL
+ __auto__ | {start_address} | 2020-08-26 17:17:13.852138+00:00 | 500 | 500 | 0 | NULL
+ __auto__ | {end_address} | 2020-08-26 17:17:13.852138+00:00 | 500 | 500 | 0 | NULL
+ __auto__ | {start_time} | 2020-08-26 17:17:13.852138+00:00 | 500 | 30 | 0 | NULL
+ __auto__ | {end_time} | 2020-08-26 17:17:13.852138+00:00 | 500 | 367 | 0 | NULL
+ __auto__ | {revenue} | 2020-08-26 17:17:13.852138+00:00 | 500 | 100 | 0 | NULL
+ (10 rows)
~~~

### Delete statistics
8 changes: 7 additions & 1 deletion v20.2/cost-based-optimizer.md
@@ -22,11 +22,13 @@ The most important factor in determining the quality of a plan is cardinality (i

The cost-based optimizer can often find more performant query plans if it has access to statistical data on the contents of your tables. This data needs to be generated from scratch for new tables, and regenerated periodically for existing tables.

- By default, CockroachDB generates table statistics automatically when tables are [created](create-table.html), and as they are [updated](update.html). It does this [using a background job](create-statistics.html#view-statistics-jobs) that automatically determines which columns to get statistics on &mdash; specifically, it chooses:
+ By default, CockroachDB automatically generates table statistics when tables are [created](create-table.html), and as they are [updated](update.html). It does this [using a background job](create-statistics.html#view-statistics-jobs) that automatically determines which columns to get statistics on &mdash; specifically, it chooses:

- Columns that are part of the primary key or an index (in other words, all indexed columns).
- Up to 100 non-indexed columns.

+ <span class="version-tag">New in v20.2:</span> By default, CockroachDB also automatically collects [multi-column statistics](create-statistics.html#create-statistics-on-multiple-columns) on columns that prefix an index.

{{site.data.alerts.callout_info}}
[Schema changes](online-schema-changes.html) trigger automatic statistics collection for the affected table(s).
{{site.data.alerts.end}}
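
Multi-column statistics can also be requested manually by listing more than one column in `CREATE STATISTICS`. The following is a minimal sketch, assuming the MovR `rides` table with its `city` and `revenue` columns; the statistics name `city_revenue_stats` is a hypothetical label chosen for illustration.

~~~ sql
-- Collect one statistic over the (city, revenue) column pair.
CREATE STATISTICS city_revenue_stats ON city, revenue FROM rides;

-- Inspect the result; the new row should list both columns in column_names.
SHOW STATISTICS FOR TABLE rides;
~~~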
@@ -80,6 +82,10 @@ For instructions showing how to manually generate statistics, see the examples i

By default, the optimizer collects histograms for all index columns (specifically the first column in each index) during automatic statistics collection. If a single column statistic is explicitly requested using manual invocation of [`CREATE STATISTICS`](create-statistics.html), a histogram will be collected, regardless of whether or not the column is part of an index.

+ {{site.data.alerts.callout_info}}
+ CockroachDB does not support multi-column histograms yet. See [tracking issue](https://github.com/cockroachdb/cockroach/issues/49698).
+ {{site.data.alerts.end}}

If you are an advanced user and need to disable histogram collection for troubleshooting or performance tuning reasons, change the [`sql.stats.histogram_collection.enabled` cluster setting](cluster-settings.html) by running [`SET CLUSTER SETTING`](set-cluster-setting.html) as follows:

{% include copy-clipboard.html %}
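
The statement described above can be sketched as follows, assuming the setting takes a boolean value (as its `.enabled` suffix suggests) and that your user has the privileges needed to change cluster settings.

~~~ sql
-- Turn off histogram collection cluster-wide.
SET CLUSTER SETTING sql.stats.histogram_collection.enabled = false;

-- Re-enable it once troubleshooting is finished.
SET CLUSTER SETTING sql.stats.histogram_collection.enabled = true;
~~~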