Add alter retention example. Fixes #48

Edit regex claim. Fixes #51
SHOW MEASUREMENTS works with regexes. Fixes #46
Moves some other stuff around.
influxdata · Dec 17, 2015 · 67b9bc8
1 parent 5fddbd3

Showing 6 changed files with 97 additions and 15 deletions.
diff --git a/content/influxdb/v0.9/query_language/data_exploration.md b/content/influxdb/v0.9/query_language/data_exploration.md
@@ -169,7 +169,7 @@ Return data where the tag key `location` has the tag value `santa_monica`:
 ```
 * Always single quote tag values in queries - they are strings. Note that double quotes do not work when specifying tag values and can cause queries to silently fail.   
 
-> **Note:** Tags are indexed so queries on tag keys or tag values are highly performant.
+> **Note:** Tags are indexed so queries on tag keys or tag values are more performant than queries on fields.
 
 Return data where the tag key `location` has no tag value (more on regular expressions [later](/influxdb/v0.9/query_language/data_exploration/#regular-expressions-in-queries)):
 ```sql
@@ -198,11 +198,11 @@ Return data where the tag key `location` has the tag value `santa_monica` and th
 ```
 * Always single quote field values that are strings. Note that double quotes do not work when specifying string field values and can cause queries to silently fail.
 
-> **Note:** Fields are not indexed so queries on field keys or field values are not performant.
+> **Note:** Fields are not indexed; queries on fields are not as performant as those on tags.
 
 More on the `WHERE` clause in InfluxQL:
 
-* The `WHERE` clause supports comparisons against regular expressions, strings, booleans, floats, integers, and against the `time` of the timestamp.
+* The `WHERE` clause supports comparisons against strings, booleans, floats, integers, and against the `time` of the timestamp. It supports using regular expressions to match tags, but not to match fields.
 * Chain logic together using `AND`  and `OR`, and separate using `(` and `)`.
 * Acceptable comparators include:  
 `=` equal to  
@@ -392,7 +392,7 @@ time			               written
 ### Downsample data
 Combine the `INTO` clause with an InfluxQL [function](/influxdb/v0.9/query_language/functions/) and a `GROUP BY` clause to write the lower precision query results to a different measurement:
 ```sql
-SELECT <function>(<field_key>) INTO <different_measurement> FROM <current-measurement> WHERE <stuff> GROUP BY <stuff>
+SELECT <function>(<field_key>) INTO <different_measurement> FROM <current_measurement> WHERE <stuff> GROUP BY <stuff>
 ```
 
 > **Note:** The `INTO` queries in this section downsample old data, that is, data that have already been written to InfluxDB. If you want InfluxDB to automatically query and downsample all future data see [Continuous Queries](/influxdb/v0.9/query_language/continuous_queries/).
@@ -745,7 +745,7 @@ Return all points that occur after  `2014-01-01 00:00:00`:
 
 Regular expressions are surrounded by `/` characters and use [Golang's regular expression syntax](http://golang.org/pkg/regexp/syntax/). Use regular expressions when selecting measurements and tags.
 
->**Note:** You cannot use regular expressions to match databases, retention policies, or fields. You can only use regular expressions to match measurements and tags
+>**Note:** You cannot use regular expressions to match databases, retention policies, or fields. You can only use regular expressions to match measurements and tags.
 
 The [sample data](/influxdb/v0.9/query_language/data_exploration/#sample-data) need to be more intricate for the following sections. Assume that the database `NOAA_water_database` now holds several measurements: `h2o_feet`, `h2o_quality`, `h2o_pH`, `average_temperature`, and `h2o_temperature`. Please note that every measurement besides `h2o_feet` is fictional and contains fictional data.
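
A rough way to preview which of the measurements listed above a regular expression would select is sketched below. It is an illustration only: it uses Python's `re` module rather than Go's `regexp` package (the two agree for simple patterns like these, but not for every construct), it assumes matching is unanchored in the way Go's `regexp.MatchString` is, and the helper name `match_measurements` is hypothetical.

```python
import re

# Measurement names taken from the sample-data description above.
MEASUREMENTS = [
    "h2o_feet",
    "h2o_quality",
    "h2o_pH",
    "average_temperature",
    "h2o_temperature",
]


def match_measurements(pattern, names):
    """Return the sorted names that the pattern matches anywhere in the name.

    Assumption: matching is unanchored, so re.search is used instead of
    re.match. Anchor the pattern with ^ or $ to restrict where it matches.
    """
    rx = re.compile(pattern)
    return sorted(name for name in names if rx.search(name))


if __name__ == "__main__":
    print(match_measurements(r"h2o.*", MEASUREMENTS))
    print(match_measurements(r"temperature", MEASUREMENTS))
```

Under that assumption, an unanchored pattern such as `temperature` selects both `average_temperature` and `h2o_temperature`; writing `^h2o` instead of `h2o.*` makes the "starts with" intent explicit.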
 

diff --git a/content/influxdb/v0.9/query_language/schema_exploration.md b/content/influxdb/v0.9/query_language/schema_exploration.md
@@ -156,9 +156,9 @@ h2o_quality,location=coyote_creek,randtag=3	   coyote_creek	   3
 ```
 
 ## Explore measurements with `SHOW MEASUREMENTS`
-The `SHOW MEASUREMENTS` query returns all [measurements](/influxdb/v0.9/concepts/glossary/#measurement) in your database and it takes the following form, where the `WHERE` clause is optional:
+The `SHOW MEASUREMENTS` query returns the [measurements](/influxdb/v0.9/concepts/glossary/#measurement) in your database and it takes the following form:
 ```sql
-SHOW MEASUREMENTS [WHERE <tag_key>=<'tag_value'>]
+SHOW MEASUREMENTS [WITH MEASUREMENT <regular_expression>] [WHERE <tag_key>=<'tag_value'>]
 ```
 
 Return all measurements in the `NOAA_water_database` database:
@@ -208,6 +208,22 @@ name
 h2o_quality
 ```
 
+Use a regular expression with `WITH MEASUREMENT` to return all measurements that start with `h2o`:
+```sql
+> SHOW MEASUREMENTS WITH MEASUREMENT =~ /h2o.*/
+```
+
+CLI response:
+```sh
+name: measurements
+------------------
+name
+h2o_feet
+h2o_pH
+h2o_quality
+h2o_temperature
+```
+
 ## Explore tag keys with SHOW TAG KEYS
 `SHOW TAG KEYS` returns the [tag keys](/influxdb/v0.9/concepts/glossary/#tag-key) associated with each measurement and takes the following form, where the `FROM` clause is optional:
 ```sql

diff --git a/content/influxdb/v0.9/troubleshooting/frequently_encountered_issues.md b/content/influxdb/v0.9/troubleshooting/frequently_encountered_issues.md
@@ -24,7 +24,6 @@ This page addresses frequent sources of confusion and places where InfluxDB beha
 * [Getting the `expected identifier` error, unexpectedly](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#getting-the-expected-identifier-error-unexpectedly)
 * [Identifying write precision from returned timestamps](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#identifying-write-precision-from-returned-timestamps)  
 * [Single quoting and double quoting in queries](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#single-quoting-and-double-quoting-in-queries)  
-* [Writing more than one continuous query to a single series](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#writing-more-than-one-continuous-query-to-a-single-series)
 * [Missing data after creating a new `DEFAULT` retention policy](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#missing-data-after-creating-a-new-default-retention-policy)
 
 **Writing data**  
@@ -33,14 +32,16 @@ This page addresses frequent sources of confusion and places where InfluxDB beha
 * [Writing data with negative timestamps](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#writing-data-with-negative-timestamps)  
 * [Writing duplicate points](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#writing-duplicate-points)  
 * [Getting an unexpected error when sending data over the HTTP API](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#getting-an-unexpected-error-when-sending-data-over-the-http-api)
+* [Writing more than one continuous query to a single series](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#writing-more-than-one-continuous-query-to-a-single-series)
 * [Words and characters to avoid](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#words-and-characters-to-avoid)  
 * [Single quoting and double quoting when writing data](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#single-quoting-and-double-quoting-when-writing-data)  
 
 **Administration**  
 
 * [Single quoting the password string](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#single-quoting-the-password-string)
 * [Escaping the single quote in a password](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#escaping-the-single-quote-in-a-password)  
-* [Identifying your version of InfluxDB](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#identifying-your-version-of-influxdb)
+* [Identifying your version of InfluxDB](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#identifying-your-version-of-influxdb)  
+* [Data aren't dropped after altering a retention policy](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#data-aren-t-dropped-after-altering-a-retention-policy)
 
 
 # Querying data
@@ -172,14 +173,32 @@ No: `SELECT * from cr@zy where p^e='2'`
 
 See the [Query Syntax](/influxdb/v0.9/query_language/query_syntax/) page for more information.
 
-## Writing more than one continuous query to a single series
-Use a single continuous query to write several statistics to the same measurement and tag set. For example, tell InfluxDB to write to the `aggregated_stats` measurement the `MEAN` and `MIN` of the `value` field grouped by five-minute intervals and grouped by the `cpu` tag with:
-
-`CREATE CONTINUOUS QUERY mean_min_value ON telegraf BEGIN SELECT MEAN(value) AS mean, MIN(value) AS min INTO aggregated_stats FROM cpu_idle GROUP BY time(5m),cpu END`
+## Missing data after creating a new `DEFAULT` retention policy
+When you create a new `DEFAULT` retention policy (RP) on a database, the data written to the old `DEFAULT` RP remain in the old RP. Queries that do not specify an RP automatically query the new `DEFAULT` RP so the old data may appear to be missing. To query the old data you must fully qualify the relevant data in the query.
 
-If you create two separate continuous queries (one for calculating the `MEAN` and one for calculating the `MIN`), the `aggregated_stats` measurement will appear to be missing data. Separate continuous queries run at slightly different times and InfluxDB defines a unique point by its measurement, tag set, and timestamp (notice that field is missing from that list). So if two continuous queries write to different fields but also write to the same measurement and tag set, only one of the two fields will ever have data; the last continuous query to run will overwrite the results that were written by the first continuous query with the same timestamp.
+Example:
 
-For more on continuous queries, see the [continuous queries page](/influxdb/v0.9/query_language/continuous_queries/).
+All of the data in the measurement `fleeting` fall under the `DEFAULT` RP called `one_hour`:
+```sh
+> SELECT count(flounders) FROM fleeting
+name: fleeting
+--------------
+time			               count
+1970-01-01T00:00:00Z	 8
+```
+We [create](/influxdb/v0.9/query_language/database_management/#create-retention-policies-with-create-retention-policy) a new `DEFAULT` RP (`two_hour`) and perform the same query:
+```sh
+> SELECT count(flounders) FROM fleeting
+>
+```
+To query the old data, we must specify the old `DEFAULT` RP by fully qualifying `fleeting`:
+```sh
+> SELECT count(flounders) FROM fish.one_hour.fleeting
+name: fleeting
+--------------
+time			               count
+1970-01-01T00:00:00Z	 8
+```
 
 ## Missing data after creating a new `DEFAULT` retention policy
 When you create a new `DEFAULT` retention policy (RP) on a database, the data written to the old `DEFAULT` RP remain in the old RP. Queries that do not specify an RP automatically query the new `DEFAULT` RP so the old data may appear to be missing. To query the old data you must fully qualify the relevant data in the query.
@@ -244,6 +263,17 @@ First, double check your [line protocol](/influxdb/v0.9/write_protocols/line/) s
 
 > **Note:** If you generated your data file on a Windows machine, Windows uses carriage return and line feed (`\r\n`) as the newline character.
 
+## Writing more than one continuous query to a single series
+Use a single continuous query to write several statistics to the same measurement and tag set. For example, tell InfluxDB to write to the `aggregated_stats` measurement the `MEAN` and `MIN` of the `value` field grouped by five-minute intervals and grouped by the `cpu` tag with:
+
+```sql
+CREATE CONTINUOUS QUERY mean_min_value ON telegraf BEGIN SELECT MEAN(value) AS mean, MIN(value) AS min INTO aggregated_stats FROM cpu_idle GROUP BY time(5m),cpu END
+```
+
+If you create two separate continuous queries (one for calculating the `MEAN` and one for calculating the `MIN`), the `aggregated_stats` measurement will appear to be missing data. Separate continuous queries run at slightly different times and InfluxDB defines a unique point by its measurement, tag set, and timestamp (notice that field is missing from that list). So if two continuous queries write to different fields but also write to the same measurement and tag set, only one of the two fields will ever have data; the last continuous query to run will overwrite the results that were written by the first continuous query with the same timestamp.
+
+For more on continuous queries, see [Continuous Queries](/influxdb/v0.9/query_language/continuous_queries/).
+
 ## Words and characters to avoid
 If you use any of the [InfluxQL keywords](https://github.com/influxdb/influxdb/blob/master/influxql/INFLUXQL.md#keywords) as an identifier you will need to double quote that identifier in every query. This can lead to [non-intuitive errors](/influxdb/v0.9/troubleshooting/frequently_encountered_issues/#getting-the-expected-identifier-error-unexpectedly). Identifiers are database names, retention policy names, user names, measurement names, tag keys, and field keys.
 
@@ -312,3 +342,39 @@ There are a number of ways to identify the version of InfluxDB that you're using:
 * Check the HTTP response in your logs:  
 
 `[http] 2015/09/04 12:29:07 ::1 - - [04/Sep/2015:12:29:06 -0700] GET /query?db=&q=create+database+there_you_go HTTP/1.1 200 40 -` ✨`InfluxDBShell/0.9.3`✨ `357970a0-533b-11e5-8001-000000000000 6.07408ms`
+
+## Data aren't dropped after altering a retention policy
+After [shortening](/influxdb/v0.9/query_language/database_management/#modify-retention-policies-with-alter-retention-policy) the `DURATION` of a [retention policy](/influxdb/v0.9/concepts/glossary/#retention-policy-rp) (RP), you may notice that InfluxDB keeps some data that are older than the `DURATION` of the modified RP. This behavior is a result of the relationship between the time interval covered by a shard group and the `DURATION` of a retention policy.
+
+InfluxDB stores data in shard groups. A single shard group covers a specific time interval; InfluxDB determines that time interval by looking at the `DURATION` of the relevant RP. The table below outlines the relationship between the `DURATION` of an RP and the time interval of a shard group:
+
+| RP duration  | Shard group interval  |
+|---|---|
+| < 2 days  | 1 hour  |
+| >= 2 days and <= 6 months  | 1 day  |
+| > 6 months  | 7 days  |
+
+If you shorten the `DURATION` of an RP and the shard group interval also shrinks, InfluxDB may be forced to keep data that are older than the new `DURATION`. This happens because InfluxDB cannot divide the old, longer shard group into new, shorter shard groups; it must keep all of the data in the longer shard group even if only a small part of those data overlaps with the new `DURATION`.
+
+*Example: Moving from an infinite RP to a three day RP*
+
+Figure 1 shows the shard groups for our example database (`example_db`) after 11 days. The database uses the automatically generated `default` retention policy with an infinite (`INF`) `DURATION`, so each shard group interval is seven days. On day 11, InfluxDB is no longer writing to `Shard Group 1` and `Shard Group 2` has four days' worth of data:
+
+**Figure 1**
+![Retention policy duration infinite](/img/influxdb/fei/alter-rp-inf.png)
+
+On day 11, we notice that `example_db` is accruing data too fast; we want to delete, and keep deleting, all data older than three days. We do this by [altering](/influxdb/v0.9/query_language/database_management/#modify-retention-policies-with-alter-retention-policy) the retention policy:
+<br>
+<br>
+```sql
+> ALTER RETENTION POLICY default ON example_db DURATION 3d
+```
+
+At the next [retention policy enforcement check](/influxdb/v0.9/administration/config/#retention), InfluxDB immediately drops `Shard Group 1` because all of its data are older than 3 days. InfluxDB does not drop `Shard Group 2`. This is because InfluxDB cannot divide existing shard groups and some data in `Shard Group 2` still fall within the new three day retention policy.
+
+Figure 2 shows the shard groups for `example_db` five days after the retention policy change. Notice that the new shard groups span one day intervals. All of the data in `Shard Group 2` remain in the database because the shard group still has data within the retention policy's three day `DURATION`:
+
+**Figure 2**
+![Retention policy duration three days](/img/influxdb/fei/alter-rp-3d.png)
+
+After day 17, all data within the past 3 days will be in one-day shard groups. InfluxDB will then be able to drop `Shard Group 2` and `example_db` will have only 3 days' worth of data.
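
The arithmetic behind the new FAQ entry can be checked with a small sketch. It is not InfluxDB code: the function names are hypothetical, "6 months" is approximated as 180 days, and the end of `Shard Group 2` (day 14) is inferred from the seven-day interval described in the example rather than stated outright.

```python
from datetime import timedelta

# Thresholds copied from the RP duration table in the added section.
# Assumption: "6 months" is approximated as 180 days for this sketch.
TWO_DAYS = timedelta(days=2)
SIX_MONTHS = timedelta(days=180)


def shard_group_interval(rp_duration):
    """Return the shard group interval for an RP duration.

    Pass None to represent an infinite (INF) duration.
    """
    if rp_duration is None or rp_duration > SIX_MONTHS:
        return timedelta(days=7)
    if rp_duration < TWO_DAYS:
        return timedelta(hours=1)
    return timedelta(days=1)


def first_droppable_day(shard_group_end_day, rp_days):
    """Return the day on which all data in a shard group exceed the RP duration.

    A shard group cannot be split, so it only becomes droppable once its
    newest possible point is older than the RP duration.
    """
    return shard_group_end_day + rp_days


if __name__ == "__main__":
    print(shard_group_interval(None))               # INF retention policy
    print(shard_group_interval(timedelta(days=3)))  # after ALTER ... DURATION 3d
    print(first_droppable_day(14, 3))               # Shard Group 2 in the example
```

With these inputs the sketch reproduces the example: seven-day shard groups under `INF`, one-day shard groups under the three day policy, and day 17 as the point after which `Shard Group 2` can be dropped.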
diff --git a/static/img/influxdb/fei/alter-rp-3d.png b/static/img/influxdb/fei/alter-rp-3d.png
diff --git a/static/img/influxdb/fei/alter-rp-inf.png b/static/img/influxdb/fei/alter-rp-inf.png
diff --git a/static/img/influxdb/sketch.png b/static/img/influxdb/sketch.png