Table access method for compressed hypertables #7104
Conversation
It looks like this PR also has unrelated commits that are independent of the table access method. It would help ease review if those could be pulled out and moved into separate PRs (e.g., 96141c6). This PR also seems to be inconsistent: in the initial commit the table access method is named tscompression, and later on it's changed to hyperstore. It would be useful if those commits were squashed to ease review.
It makes sense to split out several changes into separate pull requests. In some cases these changes are part of other pull requests that modify Hyperstore files, but that can be dealt with by separating out the changes and then rebasing this PR.
I can make an attempt at squashing this when I get back, but there is a risk that this change causes ripple effects and might be hard to deal with.
I agree that there are a lot of unrelated changes that could have been separate PRs to reduce the amount of changes in this PR.
Looking at this, I found it very surprising that you are changing the table access method on each compression state change. Effectively, you only use the access method if you have a compressed chunk. Ultimately, in my mind, this defeats the purpose of a TAM as an encapsulation method since you have to maintain it the same way as chunk status. I personally would love to get rid of the compressed chunk status so doing this feels like a step in the wrong direction.
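One way to observe the behavior described here is to check a chunk's access method in pg_class before and after a compression state change (a sketch only; the chunk name is illustrative):

select cl.oid::regclass as chunk, am.amname
from pg_class cl
join pg_am am on cl.relam = am.oid
where cl.oid = '_timescaledb_internal._hyper_1_1_chunk'::regclass;
-- Run once before and once after compress_chunk()/decompress_chunk();
-- if amname flips, the access method is being tracked per compression state.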
For reference, a couple of tsbench runs based on this branch:
- This exact branch vs. the commit it was based on: https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=3606&var-run2=3647&var-threshold=0&var-use_historical_thresholds=true&var-threshold_expression=2.5%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=false
- This branch plus the commit that enables default usage of hyperstore compression vs. this branch: https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=3647&var-run2=3648&var-threshold=0&var-use_historical_thresholds=true&var-threshold_expression=2.5%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=false
Interesting that we have a regression in the clickbench and join suites with this branch. It is probably related to the interface changes in the compressed batch, but those looked pretty minor. This is something we should fix before merging.
Can you give examples of any unrelated changes? I think most, if not all, changes are actually there to support the new use case.
It is not clear to me what you mean by "you only use the access method if you have a compressed chunk", because the whole point of Hyperstore is to have compressed data, and the reason Hyperstore was created was to encapsulate compression in a TAM interface. To use Hyperstore with only non-compressed data, while possible, defeats the purpose of Hyperstore, since in that case it is nothing more than a plain heap table with a bunch of downsides. In other words, the ideal state is to have only compressed data; having non-compressed data is a transitional state.

There are, of course, situations where you will still have a lot of non-compressed data with Hyperstore, just like with compression. This happens due to DML decompression, just like before. If you want to (re)compress a Hyperstore, you can use the existing APIs (compress_chunk or recompress_chunk) or VACUUM FULL (eventually VACUUM too, probably).

The chunk compression status is a different and orthogonal matter. I would also like to get rid of it, but it is not strictly tied to TAM and requires additional changes we wanted to avoid in the initial version. Just a reminder: we will, of course, continue improving and making changes to Hyperstore after this merge. We just can't do everything in one go.
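To make the transitional state concrete, here is a minimal sketch of the recompression options mentioned above (the chunk name is illustrative):

-- Recompress a hyperstore chunk that has accumulated non-compressed data,
-- for example after DML decompression, using the existing API:
select compress_chunk('_timescaledb_internal._hyper_3_13_chunk');

-- Or rewrite the chunk with VACUUM FULL, which compresses the data as part
-- of the rewrite:
vacuum full _timescaledb_internal._hyper_3_13_chunk;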
This also makes it hard to maintain the code should we need to use, e.g., bisect to figure out what change caused an issue.
135 commits in a single PR is quite big. I think it would speed up reviewing if the PR was split up into smaller parts so it could be merged incrementally. It would also help separate out unrelated changes.
Currently this PR segfaults when non-btree/non-hash indexes are present. We should probably error out in those cases instead of segfaulting.
Definitely. Note that in the next version (in progress) we have a whitelist of the index access methods that we support. I think that should solve the issue.
Additionally, I've done a quick test enabling hyperstore by default and running our test suite against it. There were a few more places with segfaults in addition to creating unsupported index types. What mainly worries me is that there was a segfault on insert in the compression_insert test. These definitely need to be cleaned up, at the very least so that trying out hyperstore does not crash the database.
The number of commits does not necessarily reflect the size of the PR; it is the LOC in the final artifact that matters. A lot of commits evolve code already in previous commits, fixing bugs and issues, so breaking the PR up along commits would actually mean reviewing "historical" code rather than the final artifact. This would only increase confusion and review burden.

Perhaps there is a way to break it up along files, but that would require a lot of extra work to produce "partial" artifacts that build and pass tests. Honestly, I am not sure it is worth the effort. Perhaps it is something we can discuss.

Another point is that the code is already reviewed, commit by commit. So maybe we don't need the same review effort as we normally would?
I think this PR should be split up into smaller pieces to ease review.
Force-pushed from 22e714b to 20fc5f1.
This should be fixed. There are some concurrency/isolation tests that block because the lock behavior of ALTER TABLE is different (this is a PG thing). We might do work later to harmonize the locking with current compression.
This PR now contains a whitelist and will error out for any index access method not on the whitelist. Currently, we only support btree and hash.
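A quick sketch of the intended behavior (reusing the readings table from the example further down; the unsupported index type is illustrative):

-- On the whitelist: btree indexes work on hyperstore tables.
create index on readings using btree (device_id);

-- Not on the whitelist: this should now fail with an error instead of
-- segfaulting (the exact message depends on the implementation).
create index on readings using brin (created_at);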
Changed the name to clarify that this is just one piece in the puzzle of the larger effort described in Hyperstore: A Hybrid Row-Columnar Storage Engine for ~~Time Series~~ Real-Time Analytics.
Force-pushed from 20fc5f1 to 75b195f.
This release introduces the ability to add secondary indexes to the columnstore, improves group by and filtering performance through columnstore vectorization, and contains the highly upvoted community request of transition table support. We recommend that you upgrade at the next available opportunity.

**Highlighted features in TimescaleDB v2.17.0**
* The ability to add secondary indexes to the columnstore through the new hypercore table access method.
* Significant performance improvements through vectorization (`SIMD`) for aggregations using a group by with one column and/or using a filter clause when querying the columnstore.
* Hypertables support triggers for transition tables, which is one of the most upvoted community feature requests.
* Updated methods to manage Timescale's hybrid row-columnar store (hypercore) that highlight the usage of the columnstore, which includes both an optimized columnar format as well as compression.

**Dropping support for Bitnami images**

After the recent change in Bitnami’s [LTS support policy](bitnami/containers#75671), we are no longer building Bitnami images for TimescaleDB. We recommend using the [official TimescaleDB Docker image](https://hub.docker.com/r/timescale/timescaledb-ha).

**Deprecation Notice**

We are deprecating the following parameters, functions, procedures and views. They will be removed with the next major release of TimescaleDB. Please find the replacements in the table below:

| Deprecated | Replacement | Type |
| --- | --- | --- |
| decompress_chunk | convert_to_rowstore | Procedure |
| compress_chunk | convert_to_columnstore | Procedure |
| add_compression_policy | add_columnstore_policy | Function |
| remove_compression_policy | remove_columnstore_policy | Function |
| hypertable_compression_stats | hypertable_columnstore_stats | Function |
| chunk_compression_stats | chunk_columnstore_stats | Function |
| hypertable_compression_settings | hypertable_columnstore_settings | View |
| chunk_compression_settings | chunk_columnstore_settings | View |
| compression_settings | columnstore_settings | View |
| timescaledb.compress | timescaledb.enable_columnstore | Parameter |
| timescaledb.compress_segmentby | timescaledb.segmentby | Parameter |
| timescaledb.compress_orderby | timescaledb.orderby | Parameter |

**Features**
* #7341: Vectorized aggregation with grouping by one fixed-size by-value compressed column (such as arithmetic types).
* #7104: Hypercore table access method.
* #6901: Add hypertable support for transition tables.
* #7482: Optimize recompression of partially compressed chunks.
* #7458: Support vectorized aggregation with aggregate `filter` clauses that are also vectorizable.
* #7433: Add support for merging chunks.
* #7271: Push down `order by` in real-time continuous aggregate queries.
* #7455: Support `drop not null` on compressed hypertables.
* #7295: Support `alter table set access method` on hypertable.
* #7411: Change parameter name to enable hypercore table access method.
* #7436: Add index creation on `order by` columns.
* #7443: Add hypercore function and view aliases.
* #7521: Add optional `force` argument to `refresh_continuous_aggregate`.
* #7528: Transform sorting on `time_bucket` to sorting on time for compressed chunks in some cases.
* #7565: Add hint when hypertable creation fails.
* #7390: Disable custom `hashagg` planner code.
* #7587: Add `include_tiered_data` parameter to `add_continuous_aggregate_policy` API.
* #7486: Prevent building against PostgreSQL versions with broken ABI.
* #7412: Add [GUC](https://www.postgresql.org/docs/current/acronyms.html#:~:text=GUC) for the `hypercore_use_access_method` default.
* #7413: Add GUC for segmentwise recompression.

**Bugfixes**
* #7378: Remove obsolete job referencing `policy_job_error_retention`.
* #7409: Update `bgw_job` table when altering procedure.
* #7410: Fix the `aggregated compressed column not found` error on aggregation query.
* #7426: Fix `datetime` parsing error in chunk constraint creation.
* #7432: Verify that the heap tuple is valid before using.
* #7434: Fix the segfault when internally setting the replica identity for a given chunk.
* #7488: Emit error for transition table trigger on chunks.
* #7514: Fix the error: `invalid child of chunk append`.
* #7517: Fix the performance regression on the `cagg_migrate` procedure.
* #7527: Restart scheduler on error.
* #7557: Fix null handling for in-memory tuple filtering.
* #7566: Improve transaction check in CAGG refresh.
* #7584: Fix NaN-handling for vectorized aggregation.

**Thanks**
* @bharrisau for reporting the segfault when creating chunks.
* @k-rus for suggesting that we add a hint when hypertable creation fails.
* @pgloader for reporting the issue in an internal background job.
* @staticlibs for sending the pull request that improves the transaction check in CAGG refresh.
* @uasiddiqi for reporting the `aggregated compressed column not found` error.
This release introduces the ability to add secondary indexes to the columnstore, improves group by and filtering performance through columnstore vectorization, and contains the highly upvoted community request of transition table support. We recommend that you upgrade at the next available opportunity.

**Highlighted features in TimescaleDB v2.18.0**
* The ability to add secondary indexes to the columnstore through the new hypercore table access method.
* Significant performance improvements through vectorization (`SIMD`) for aggregations using a group by with one column and/or using a filter clause when querying the columnstore.
* Hypertables support triggers for transition tables, which is one of the most upvoted community feature requests.
* Updated methods to manage Timescale's hybrid row-columnar store (hypercore) that highlight the usage of the columnstore, which includes both an optimized columnar format as well as compression.

**Dropping support for Bitnami images**

After the recent change in Bitnami’s [LTS support policy](bitnami/containers#75671), we are no longer building Bitnami images for TimescaleDB. We recommend using the [official TimescaleDB Docker image](https://hub.docker.com/r/timescale/timescaledb-ha).

**Deprecation Notice**

We are deprecating the following parameters, functions, procedures and views. They will be removed with the next major release of TimescaleDB. Please find the replacements in the table below:

| Deprecated | Replacement | Type |
| --- | --- | --- |
| decompress_chunk | convert_to_rowstore | Procedure |
| compress_chunk | convert_to_columnstore | Procedure |
| add_compression_policy | add_columnstore_policy | Function |
| remove_compression_policy | remove_columnstore_policy | Function |
| hypertable_compression_stats | hypertable_columnstore_stats | Function |
| chunk_compression_stats | chunk_columnstore_stats | Function |
| hypertable_compression_settings | hypertable_columnstore_settings | View |
| chunk_compression_settings | chunk_columnstore_settings | View |
| compression_settings | columnstore_settings | View |
| timescaledb.compress | timescaledb.enable_columnstore | Parameter |
| timescaledb.compress_segmentby | timescaledb.segmentby | Parameter |
| timescaledb.compress_orderby | timescaledb.orderby | Parameter |

**Features**
* #7341: Vectorized aggregation with grouping by one fixed-size by-value compressed column (such as arithmetic types).
* #7104: Hypercore table access method.
* #6901: Add hypertable support for transition tables.
* #7482: Optimize recompression of partially compressed chunks.
* #7458: Support vectorized aggregation with aggregate `filter` clauses that are also vectorizable.
* #7433: Add support for merging chunks.
* #7271: Push down `order by` in real-time continuous aggregate queries.
* #7455: Support `drop not null` on compressed hypertables.
* #7295: Support `alter table set access method` on hypertable.
* #7411: Change parameter name to enable hypercore table access method.
* #7436: Add index creation on `order by` columns.
* #7443: Add hypercore function and view aliases.
* #7521: Add optional `force` argument to `refresh_continuous_aggregate`.
* #7528: Transform sorting on `time_bucket` to sorting on time for compressed chunks in some cases.
* #7565: Add hint when hypertable creation fails.
* #7390: Disable custom `hashagg` planner code.
* #7587: Add `include_tiered_data` parameter to `add_continuous_aggregate_policy` API.
* #7486: Prevent building against PostgreSQL versions with broken ABI.
* #7412: Add [GUC](https://www.postgresql.org/docs/current/acronyms.html#:~:text=GUC) for the `hypercore_use_access_method` default.
* #7413: Add GUC for segmentwise recompression.

**Bugfixes**
* #7378: Remove obsolete job referencing `policy_job_error_retention`.
* #7409: Update `bgw_job` table when altering procedure.
* #7410: Fix the `aggregated compressed column not found` error on aggregation query.
* #7426: Fix `datetime` parsing error in chunk constraint creation.
* #7432: Verify that the heap tuple is valid before using.
* #7434: Fix the segfault when internally setting the replica identity for a given chunk.
* #7488: Emit error for transition table trigger on chunks.
* #7514: Fix the error: `invalid child of chunk append`.
* #7517: Fix the performance regression on the `cagg_migrate` procedure.
* #7527: Restart scheduler on error.
* #7557: Fix null handling for in-memory tuple filtering.
* #7566: Improve transaction check in CAGG refresh.
* #7584: Fix NaN-handling for vectorized aggregation.
* #7598: Match the Postgres NaN comparison behavior in WHERE clause over compressed tables.

**Thanks**
* @bharrisau for reporting the segfault when creating chunks.
* @jakehedlund for reporting the incompatible NaN behavior in WHERE clause over compressed tables.
* @k-rus for suggesting that we add a hint when hypertable creation fails.
* @staticlibs for sending the pull request that improves the transaction check in CAGG refresh.
* @uasiddiqi for reporting the `aggregated compressed column not found` error.
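To illustrate the deprecation table above, here is a sketch of the renamed calls (assuming a readings hypertable as in the example below; the chunk name is illustrative, and the old names keep working until the next major release):

-- New parameter names (previously timescaledb.compress, compress_segmentby,
-- and compress_orderby):
alter table readings
  set (timescaledb.enable_columnstore = true,
       timescaledb.segmentby = 'location_id',
       timescaledb.orderby = 'created_at');

-- New name for compress_chunk; note that the replacement is a procedure,
-- so it is invoked with CALL:
call convert_to_columnstore('_timescaledb_internal._hyper_1_1_chunk');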
I tested this on a hypertable with compression enabled, using a B-tree index on two columns, and observed that the index was not utilized.
@pantonis I am not sure what you did, but here is an example based on the tests in the commit. Unfortunately, we do not have documentation yet, but that is coming. Load the extension and disable the columnar scan (it is a little too efficient, so for an example table this small it would otherwise be used):

psql (17.2 (Ubuntu 17.2-1.pgdg24.04+1), server 16.6 (Ubuntu 16.6-1.pgdg24.04+1))
Type "help" for help.
Expanded display is used automatically.
Null display is "[NULL]".
SET
demo_hypercore=# create extension timescaledb;
CREATE EXTENSION
demo_hypercore=# \dx
List of installed extensions
Name | Version | Schema | Description
-------------+---------+------------+---------------------------------------------------------------------------------------
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
timescaledb | 2.18.0 | public | Enables scalable inserts and complex queries for time-series data (Community Edition)
(2 rows)
demo_hypercore=# set timescaledb.enable_columnarscan to false;
SET

Create a hypertable:

demo_hypercore=# create table readings(
metric_id serial,
created_at timestamptz not null unique,
location_id smallint,
owner_id bigint,
device_id bigint,
temp float8,
humidity float4
);
CREATE TABLE
demo_hypercore=# select create_hypertable('readings', by_range('created_at'));
create_hypertable
-------------------
(1,t)
(1 row)

Set compression parameters and set the default access method for the chunks to hypercore:

demo_hypercore=# alter table readings
set (timescaledb.compress_orderby = 'created_at',
timescaledb.compress_segmentby = 'location_id');
ALTER TABLE
demo_hypercore=# alter table readings set access method hypercore;
ALTER TABLE

Insert some data:

demo_hypercore=# insert into readings (created_at, location_id, device_id, owner_id, temp, humidity)
select t, ceil(random()*10), ceil(random()*30), ceil(random() * 5), random()*40, random()*100
from generate_series('2022-06-01'::timestamptz, '2022-07-01', '1s') t;
INSERT 0 2592001

All chunks are now using the hypercore access method:

demo_hypercore=# select * from chunk_info where hypertable = 'readings'::regclass;
hypertable | chunk | amname
------------+-----------------------------------------+-----------
readings | _timescaledb_internal._hyper_3_13_chunk | hypercore
readings | _timescaledb_internal._hyper_3_15_chunk | hypercore
readings | _timescaledb_internal._hyper_3_17_chunk | hypercore
readings | _timescaledb_internal._hyper_3_19_chunk | hypercore
readings | _timescaledb_internal._hyper_3_21_chunk | hypercore
readings | _timescaledb_internal._hyper_3_23_chunk | hypercore
(6 rows)

Compress the chunks (note that the new function convert_to_columnstore can also be used):

demo_hypercore=# select compress_chunk(show_chunks('readings'));
compress_chunk
-----------------------------------------
_timescaledb_internal._hyper_3_13_chunk
_timescaledb_internal._hyper_3_15_chunk
_timescaledb_internal._hyper_3_17_chunk
_timescaledb_internal._hyper_3_19_chunk
_timescaledb_internal._hyper_3_21_chunk
_timescaledb_internal._hyper_3_23_chunk
(6 rows)

Test a query that does not use an index scan:

demo_hypercore=# explain (analyze, buffers) select * from readings where metric_id = 4711;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..42976.60 rows=12960 width=42) (actual time=13.428..72.719 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 1
Buffers: shared hit=5311 read=29429
-> Parallel Append (cost=0.00..40680.60 rows=5400 width=42) (actual time=35.454..64.195 rows=0 loops=2)
Buffers: shared hit=5311 read=29429
-> Parallel Seq Scan on _hyper_3_15_chunk (cost=0.00..9399.00 rows=1260 width=42) (actual time=27.661..27.661 rows=0 loops=1)
Filter: (metric_id = 4711)
Rows Removed by Filter: 604800
Buffers: shared hit=1246 read=6883
-> Parallel Seq Scan on _hyper_3_17_chunk (cost=0.00..9399.00 rows=1260 width=42) (actual time=27.898..27.898 rows=0 loops=1)
Filter: (metric_id = 4711)
Rows Removed by Filter: 604800
Buffers: shared hit=1241 read=6861
-> Parallel Seq Scan on _hyper_3_19_chunk (cost=0.00..9399.00 rows=1260 width=42) (actual time=14.256..14.256 rows=0 loops=2)
Filter: (metric_id = 4711)
Rows Removed by Filter: 302400
Buffers: shared hit=1218 read=6869
-> Parallel Seq Scan on _hyper_3_21_chunk (cost=0.00..9399.00 rows=1260 width=42) (actual time=27.539..27.539 rows=0 loops=1)
Filter: (metric_id = 4711)
Rows Removed by Filter: 604800
Buffers: shared hit=1239 read=6848
-> Parallel Seq Scan on _hyper_3_13_chunk (cost=0.00..1656.24 rows=275 width=42) (actual time=0.355..4.308 rows=1 loops=1)
Filter: (metric_id = 4711)
Rows Removed by Filter: 93599
Buffers: shared hit=204 read=1072
-> Parallel Seq Scan on _hyper_3_23_chunk (cost=0.00..1401.36 rows=233 width=42) (actual time=12.458..12.458 rows=0 loops=1)
Filter: (metric_id = 4711)
Rows Removed by Filter: 79201
Buffers: shared hit=163 read=896
Planning:
Buffers: shared hit=43 read=10
Planning Time: 0.596 ms
Execution Time: 72.785 ms
(34 rows)

Add an index for metric_id:

demo_hypercore=# create index my_index on readings (metric_id);
CREATE INDEX

Run the query again and see that the index scan is used:

demo_hypercore=# explain (analyze, buffers) select * from readings where metric_id = 4711;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.29..42197.89 rows=12960 width=42) (actual time=0.011..0.044 rows=1 loops=1)
Buffers: shared read=17 written=1
-> Index Scan using _hyper_3_13_chunk_my_index on _hyper_3_13_chunk (cost=0.29..1524.48 rows=468 width=42) (actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=3
-> Index Scan using _hyper_3_15_chunk_my_index on _hyper_3_15_chunk (cost=0.42..9829.34 rows=3024 width=42) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=3 written=1
-> Index Scan using _hyper_3_17_chunk_my_index on _hyper_3_17_chunk (cost=0.42..9829.34 rows=3024 width=42) (actual time=0.007..0.007 rows=0 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=3
-> Index Scan using _hyper_3_19_chunk_my_index on _hyper_3_19_chunk (cost=0.42..9829.34 rows=3024 width=42) (actual time=0.007..0.007 rows=0 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=3
-> Index Scan using _hyper_3_21_chunk_my_index on _hyper_3_21_chunk (cost=0.42..9829.34 rows=3024 width=42) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=3
-> Index Scan using _hyper_3_23_chunk_my_index on _hyper_3_23_chunk (cost=0.29..1291.22 rows=396 width=42) (actual time=0.004..0.004 rows=0 loops=1)
Index Cond: (metric_id = 4711)
Buffers: shared read=2
Planning:
Buffers: shared hit=227 read=72 dirtied=8 written=6
Planning Time: 0.530 ms
Execution Time: 0.068 ms
(24 rows)
@mkindahl thanks for the example! I was looking for documentation, but this is just as good! I would like your opinion on the following scenario. Consider a hypertable (around 1 TB over a couple billion rows) on which I want to run
Thank you so much!
Ah, unfortunately, I must semi-confirm what @pantonis is reporting. I've tried your example line by line and
Did you use my example literally, or did you use something else?
Here is the view:

create view chunk_info as
select inh.inhparent::regclass as hypertable,
cl.oid::regclass as chunk,
am.amname
from pg_class cl
join pg_am am on cl.relam = am.oid
join pg_inherits inh on inh.inhrelid = cl.oid;
You can add an index on a segment-by column, but it indexes the compressed data directly. |
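For illustration, a sketch of what such a setup could look like (the table and column names are assumed from the other examples in this thread, not taken from the question, and the standard compression-settings syntax is used):

alter table readings
  set (timescaledb.compress, timescaledb.compress_segmentby = 'metric_id');
-- With hypercore chunks, an index on the segment-by column
-- indexes the compressed data directly.
create index on readings (metric_id);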
Data inserted is added to the uncompressed region, so it is not automatically compressed as you insert.
Updating the indexes while you insert is likely to increase the total time and I/O of the operation, so if you have that option, rebuilding the indexes afterwards probably has a lower total execution time and lower total I/O. You would have to measure to make sure, though, since it depends a lot on factors like the amount of memory, the speed of the disk, etc.
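A minimal sketch of that drop-and-rebuild approach, reusing the index from the earlier example (the names are illustrative, not from the original question):

drop index my_index;
-- ... perform the bulk insert into readings here ...
create index my_index on readings (metric_id);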
If you don't have an index, a filter will be used:

mats=# explain select * from readings where location_id = 10;
QUERY PLAN
--------------------------------------------------------------------------------
Append (cost=0.00..59446.69 rows=256535 width=42)
-> Seq Scan on _hyper_5_19_chunk (cost=0.00..1173.00 rows=9382 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_21_chunk (cost=0.00..13796.00 rows=59391 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_23_chunk (cost=0.00..13796.00 rows=57960 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_25_chunk (cost=0.00..13796.00 rows=61831 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_27_chunk (cost=0.00..13796.00 rows=60077 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_29_chunk (cost=0.00..1807.01 rows=7894 width=42)
Filter: (location_id = 10)
(13 rows)

If you add an index, it can be used instead. Note that for small tables a sequential scan is very efficient in itself, so its cost estimate will be lower; to get an index scan you may need to disable sequential scans (which does not actually disable them, just makes them more expensive, reducing the likelihood that they are chosen).

mats=# create index on readings (location_id);
CREATE INDEX
mats=# explain select * from readings where location_id = 10;
QUERY PLAN
--------------------------------------------------------------------------------
Append (cost=0.00..60422.68 rows=256535 width=42)
-> Seq Scan on _hyper_5_19_chunk (cost=0.00..2148.99 rows=9382 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_21_chunk (cost=0.00..13796.00 rows=59391 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_23_chunk (cost=0.00..13796.00 rows=57960 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_25_chunk (cost=0.00..13796.00 rows=61831 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_27_chunk (cost=0.00..13796.00 rows=60077 width=42)
Filter: (location_id = 10)
-> Seq Scan on _hyper_5_29_chunk (cost=0.00..1807.01 rows=7894 width=42)
Filter: (location_id = 10)
(13 rows)
mats=# set enable_seqscan to false;
SET
mats=# explain select * from readings where location_id = 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.17..112532.23 rows=256535 width=42)
-> Index Scan using _hyper_5_19_chunk_readings_location_id_idx on _hyper_5_19_chunk (cost=0.17..4043.17 rows=9382 width=42)
Index Cond: (location_id = 10)
-> Index Scan using _hyper_5_21_chunk_readings_location_id_idx on _hyper_5_21_chunk (cost=0.42..25950.81 rows=59391 width=42)
Index Cond: (location_id = 10)
-> Index Scan using _hyper_5_23_chunk_readings_location_id_idx on _hyper_5_23_chunk (cost=0.42..25899.86 rows=57960 width=42)
Index Cond: (location_id = 10)
-> Index Scan using _hyper_5_25_chunk_readings_location_id_idx on _hyper_5_25_chunk (cost=0.42..26005.53 rows=61831 width=42)
Index Cond: (location_id = 10)
-> Index Scan using _hyper_5_27_chunk_readings_location_id_idx on _hyper_5_27_chunk (cost=0.42..25945.02 rows=60077 width=42)
Index Cond: (location_id = 10)
-> Index Scan using _hyper_5_29_chunk_readings_location_id_idx on _hyper_5_29_chunk (cost=0.29..3405.16 rows=7894 width=42)
Index Cond: (location_id = 10)
JIT:
Functions: 6
Options: Inlining false, Optimization false, Expressions true, Deforming true
(16 rows)
This is impossible to decide without testing it out:
|
I created my hypertable, created the btree index, and filled the table with data. The next morning (without compressing manually) I can see that all chunks are compressed. This is my index:
When I run the following query
I get the following execution plan where I can see that the btree index is not used
|
@mkindahl thank you so much for your feedback! @pantonis I should mention this: I am running pg16.6 and tsdb 2.18.0. How about you? Is there a chance this new functionality is for pg17 only?
Yes, line for line as I said. I will try a few more times, then put up a new issue if I can't get the index to work. |
@jflambert same here pg16.6 with timescaledb 2.18.0 |
@mkindahl false alert, I guess? I spent two hours on this yesterday with no success on indexes. This morning I pressed F5 in pgAdmin without changing anything and suddenly indexes kicked in. I then recreated my test setup several times and indexes always work immediately. I'll put this in production and I'll let you know if I have any other issues. Thanks! |
@pantonis to be precise, we see the index being used in the uncompressed chunk (search for ...), but yeah, you'd expect it to be used for compressed chunks as well, unless the scan is just more efficient (but I doubt that's the situation here). Could you try with
Still the same. |
Tests are from PG15 and upwards.
Good that you got it to work, but it's weird that you had the problem in the first place.
One thing that might affect the situation is if you do not have up-to-date stats, which could also explain why it suddenly started working. You could try to run
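For example, refreshing statistics is plain PostgreSQL (the table name readings is assumed from the earlier examples; run it against your own hypertable):

analyze readings;  -- refresh planner statistics for one table
analyze;           -- or for all tables in the database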
The compressed chunk is "internal" when using the TAM, similar to how TOAST tables are internal. It is being used, but indirectly through the TAM API. For @pantonis it looks more like the TAM is not used at all. Check with this view:

create view chunk_info as
select inh.inhparent::regclass as hypertable,
cl.oid::regclass as chunk,
am.amname
from pg_class cl
join pg_am am on cl.relam = am.oid
join pg_inherits inh on inh.inhrelid = cl.oid;

And do something like:

select * from chunk_info where hypertable = 'dw."Order"'::regclass; |
What shall I check on that view? |
I think he's interested in knowing if hypercore is the access method. |
I see 18 chunks like the above |
@pantonis This means you're not using the hypercore access method, which is why you do not get any index scans. If you try something like this on the chunk I can see:

alter table _timescaledb_internal._hyper_4_1_chunk set access method hypercore;

Then try a query that will touch that chunk, and you should hopefully see an index scan on it. You can verify it with the view. |
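For instance, the result can be checked with the chunk_info view defined earlier in the thread (the chunk name is taken from the alter table example above):

select * from chunk_info
 where chunk = '_timescaledb_internal._hyper_4_1_chunk'::regclass;
-- amname should now show hypercore for this chunk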
Will give it a try later, but may I ask: do I have to run this for every chunk that gets created? Any documentation about it? |
@pantonis I don't think there's any documentation yet, no. If you check this example (third code block), this line applies access method hypercore to the entire table (not just a chunk).
I feel that's what's missing for you. And if not, do also try to update the stats with |
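For reference, a sketch of what applying the access method to the whole hypertable could look like, assuming the hypertable from the earlier examples (the exact line referenced above is not reproduced in this thread):

alter table readings set access method hypercore;
-- Per the comment above, this applies to the entire table, not just one chunk.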
This wraps our existing compression solution in the table access method API, effectively turning it into columnar storage with compression. This allows several features normally available to PostgreSQL tables to also be available on tables using TimescaleDB compression, for example:
- SELECT FOR UPDATE will properly lock both uncompressed and compressed tuples.
- CLUSTER and VACUUM FULL compress the table before running vacuum.
Disable-check: commit-count
Disable-check: force-changelog-file
A changelog entry will be added in a follow-up PR.
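To make the two feature examples concrete, a hedged sketch against the readings table used in the discussion above (names are assumed, and the chunks are expected to already use the hypercore access method):

begin;
-- Locks the matching rows whether they currently live in the
-- compressed or the uncompressed region of a chunk.
select * from readings where metric_id = 4711 for update;
commit;

-- Rewrites the table, compressing it as part of the rewrite.
vacuum full readings;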