Update Compression content #2664

Merged · 7 commits · Nov 1, 2023
84 changes: 0 additions & 84 deletions use-timescale/compression/about-compression.md
@@ -21,86 +21,6 @@ This section explains how to enable native compression, and then goes into
detail on the most important settings for compression, to help you get the
best possible compression ratio.

For more information about compressing chunks, see [manual compression][manual-compression].

## Enable compression

You can enable compression on individual hypertables by declaring which column
you want to segment by. This procedure uses an example table, called `example`,
and segments it by the `device_id` column. Every chunk that is more than seven
days old is then marked to be automatically compressed.

|time|device_id|cpu|disk_io|energy_consumption|
|-|-|-|-|-|
|8/22/2019 0:00|1|88.2|20|0.8|
|8/22/2019 0:05|2|300.5|30|0.9|

<Procedure>

### Enabling compression

1. At the `psql` prompt, alter the table:

```sql
ALTER TABLE example SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'device_id'
);
```

1. Add a compression policy to compress chunks that are older than seven days:

```sql
SELECT add_compression_policy('example', INTERVAL '7 days');
```

</Procedure>

For more information, see the API reference for
[`ALTER TABLE (compression)`][alter-table-compression] and
[`add_compression_policy`][add_compression_policy].

You can also set a compression policy through
the Timescale console. The compression tool automatically generates and
runs the compression commands for you. To learn more, see the
[Timescale documentation](/use-timescale/latest/services/service-explorer/#setting-a-compression-policy-from-timescale-cloud-console).

## View current compression policy

To view the compression policy that you've set:

```sql
SELECT * FROM timescaledb_information.jobs
WHERE proc_name='policy_compression';
```

For more information, see the API reference for [`timescaledb_information.jobs`][timescaledb_information-jobs].

## Remove compression policy

To remove a compression policy, use `remove_compression_policy`. For example, to
remove a compression policy for a hypertable named `cpu`:

```sql
SELECT remove_compression_policy('cpu');
```

For more information, see the API reference for
[`remove_compression_policy`][remove_compression_policy].

## Disable compression

You can disable compression entirely on individual hypertables. This command
works only if you don't currently have any compressed chunks:

```sql
ALTER TABLE <TABLE_NAME> SET (timescaledb.compress=false);
```

If your hypertable contains compressed chunks, you need to
[decompress each chunk][decompress-chunks] individually before you can disable
compression.

## Compression policy intervals

Data is usually compressed after an interval of time, and not
@@ -275,9 +195,5 @@ chunks. When you do this, the data that is being inserted is not compressed
immediately. It is stored alongside the chunk it has been inserted into, and
then a separate job merges it with the chunk and compresses it later on.
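As a sketch, assuming the `example` hypertable from earlier on this page, an
ordinary `INSERT` into a time range covered by a compressed chunk just works:

```sql
-- Hypothetical insert into a compressed region: the row is stored
-- uncompressed alongside the chunk and merged in later by a background job.
INSERT INTO example (time, device_id, cpu, disk_io, energy_consumption)
VALUES ('2019-08-22 00:10', 1, 90.1, 25, 0.85);
```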

[alter-table-compression]: /api/:currentVersion:/compression/alter_table_compression/
[add_compression_policy]: /api/:currentVersion:/compression/add_compression_policy/
[decompress-chunks]: /use-timescale/:currentVersion:/compression/decompress-chunks
[remove_compression_policy]: /api/:currentVersion:/compression/remove_compression_policy/
[timescaledb_information-jobs]: /api/:currentVersion:/informational-views/jobs/
[manual-compression]: /use-timescale/:currentVersion:/compression/manual-compression/
128 changes: 128 additions & 0 deletions use-timescale/compression/compression-design.md
@@ -0,0 +1,128 @@
---
title: Designing your database for compression
excerpt: Learn how to design your database for the most effective compression
products: [cloud, mst, self_hosted]
keywords: [compression, schema, tables]
---

# Designing for compression

Time-series data can be unique, in that it needs to handle both shallow and
wide queries, such as "What happened across the deployment in the last 10
minutes?", and deep and narrow queries, such as "What is the average CPU usage
for this server over the last 24 hours?" Time-series data usually has a very
high rate of inserts as well: hundreds of thousands of writes per second can
be normal for a time-series dataset. Additionally, time-series data is often
very granular, and is collected at a higher resolution than many other
datasets. This can result in terabytes of data accumulating over time.

All this means that if you need great compression rates, you probably need to
consider the design of your database, before you start ingesting data. This
section covers some of the things you need to take into consideration when
designing your database for maximum compression effectiveness.

## Compressing data

TimescaleDB is built on PostgreSQL, which is, by nature, a row-based database.
Because time-series data is accessed in order of time, when you enable
compression, TimescaleDB converts many wide rows of data into a single row,
called array form. Each field of that new, wide row stores an ordered set of
data comprising the entire column.

For example, if you had a table with data that looked a bit like this:

|Timestamp|Device ID|Status Code|Temperature|
|-|-|-|-|
|12:00:01|A|0|70.11|
|12:00:01|B|0|69.70|
|12:00:02|A|0|70.12|
|12:00:02|B|0|69.69|
|12:00:03|A|0|70.14|
|12:00:03|B|4|69.70|

You can convert this to a single row in array form, like this:

|Timestamp|Device ID|Status Code|Temperature|
|-|-|-|-|
|[12:00:01, 12:00:01, 12:00:02, 12:00:02, 12:00:03, 12:00:03]|[A, B, A, B, A, B]|[0, 0, 0, 0, 0, 4]|[70.11, 69.70, 70.12, 69.69, 70.14, 69.70]|

Even before you compress any data, this format immediately saves storage by
reducing the per-row overhead. PostgreSQL typically adds about 27 bytes of
overhead per row, so collapsing 1,000 rows into a single array row eliminates
roughly 27 KB of overhead on its own. Even without any compression, the schema
in this example is now smaller on disk than the previous format.

This format arranges the data so that similar data, such as timestamps, device
IDs, or temperature readings, is stored contiguously. This means that you can
then use type-specific compression algorithms to compress the data further, and
each array is separately compressed. For more information about the compression
methods used, see the [compression methods section][compression-methods].

When the data is in array format, you can perform queries that require only a
subset of the columns very quickly. For example, consider a query like this
one, which asks for the average temperature over the past day:

<CodeBlock canCopy={false} showLineNumbers={false} children={`
SELECT time_bucket('1 minute', timestamp) AS minute,
    AVG(temperature)
FROM table
WHERE timestamp > now() - INTERVAL '1 day'
GROUP BY minute
ORDER BY minute DESC;
`} />

The query engine can fetch and decompress only the timestamp and temperature
columns to efficiently compute and return these results.

Finally, TimescaleDB uses non-inline disk pages to store the compressed arrays.
This means that the in-row data points to a secondary disk page that stores the
compressed array, and the actual row in the main table becomes very small,
because it is now just pointers to the data. When data stored like this is
queried, only the compressed arrays for the required columns are read from disk,
further improving performance by reducing disk reads and writes.

## Querying compressed data

In the previous example, the database has no way of knowing which rows need to
be fetched and decompressed to resolve a query. For example, the database can't
easily determine which rows contain data from the past day, as the timestamp
itself is in a compressed column. You don't want to have to decompress all the
data in a chunk, or even an entire hypertable, to determine which rows are
required.

TimescaleDB automatically stores additional information in each row and adds
extra groupings to improve query performance. When you compress a hypertable,
either manually or through a compression policy, it helps to specify an
`ORDER BY` column.

`ORDER BY` columns specify how the rows that are part of a compressed batch are
ordered. For most time-series workloads, this is by timestamp, so if you don't
specify an `ORDER BY` column, TimescaleDB defaults to using the time column. You
can also specify additional dimensions, such as location.

For each `ORDER BY` column, TimescaleDB automatically creates additional columns
that store the minimum and maximum value of that column. This way, the query
planner can look at the range of timestamps in the compressed column, without
having to do any decompression, and determine whether the row could possibly
match the query.
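As a minimal sketch, assuming a hypertable named `metrics` with a `timestamp`
column, you can set the ordering explicitly when enabling compression:

```sql
-- Order rows within each compressed batch by timestamp, newest first.
-- TimescaleDB also stores per-batch min/max timestamp metadata.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_orderby = 'timestamp DESC'
);
```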

When you compress your hypertable, you can also choose to specify a
`SEGMENT BY` column. This segments compressed rows by a specific column, so
that each compressed row corresponds to data about a single item, such as a
specific device ID. This further allows the query planner to determine whether
a row could match the query, without having to decompress the column first.
For example:

|Device ID|Timestamp|Status Code|Temperature|Min Timestamp|Max Timestamp|
|-|-|-|-|-|-|
|A|[12:00:01, 12:00:02, 12:00:03]|[0, 0, 0]|[70.11, 70.12, 70.14]|12:00:01|12:00:03|
|B|[12:00:01, 12:00:02, 12:00:03]|[0, 0, 4]|[69.70, 69.69, 69.70]|12:00:01|12:00:03|

With the data segmented in this way, a query for device A over a given time
interval becomes quite fast. The query planner can use an index to find the
rows for device A that contain at least some timestamps in the specified
interval, and even a sequential scan is quick, because evaluating device IDs
or timestamps does not require decompression. The query executor then
decompresses only the timestamp and temperature columns for the selected rows.
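As a sketch, these are settings that would produce a layout like the one
above, again assuming a hypertable named `metrics`:

```sql
-- Segment compressed rows by device, and order each batch by timestamp.
-- device_id stays uncompressed in each row, so it can be indexed and
-- filtered without any decompression.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby = 'timestamp'
);
```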

[compression-methods]: /use-timescale/:currentVersion:/compression/compression-methods/