diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md index 44a0b67b4..359e2ea01 100644 --- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md +++ b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function.md @@ -161,6 +161,7 @@ SELECT LEAST(temperature,humidity) FROM table2; | COUNT | Counts the number of data points. | All types | INT64 | | COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression,(e.g. `count_if(temperature>20)`) | INT64 | | APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated, supports all data types.
`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | +| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function approximately computes the k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. | `x`: The column to be calculated, supporting all existing data types in IoTDB;
`k`: The number of most frequent values to return;
`capacity`: The number of buckets used for computation. A larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | | SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | MAX | Finds the maximum value. | All types | Same as input type | @@ -251,8 +252,28 @@ Total line number = 1 It costs 0.022s ``` +#### 2.3.5 Approx_most_frequent -#### 2.3.5 First +Query the top 2 most frequent values in the `temperature` column of `table1`. + +```sql +IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; +``` + +The execution result is as follows: + +```sql ++-------------------+ +| topk| ++-------------------+ +|{"85.0":6,"90.0":5}| ++-------------------+ +Total line number = 1 +It costs 0.064s +``` + + +#### 2.3.6 First Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -272,7 +293,7 @@ Total line number = 1 It costs 0.170s ``` -#### 2.3.6 Last +#### 2.3.7 Last Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -292,7 +313,7 @@ Total line number = 1 It costs 0.211s ``` -#### 2.3.7 First_by +#### 2.3.8 First_by Finds the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. @@ -312,7 +333,7 @@ Total line number = 1 It costs 0.269s ``` -#### 2.3.8 Last_by +#### 2.3.9 Last_by Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. 
@@ -332,7 +353,7 @@ Total line number = 1 It costs 0.070s ``` -#### 2.3.9 Max_by +#### 2.3.10 Max_by Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. @@ -352,7 +373,7 @@ Total line number = 1 It costs 0.172s ``` -#### 2.3.10 Min_by +#### 2.3.11 Min_by Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum. diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md index 44a0b67b4..359e2ea01 100644 --- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md +++ b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function.md @@ -161,6 +161,7 @@ SELECT LEAST(temperature,humidity) FROM table2; | COUNT | Counts the number of data points. | All types | INT64 | | COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression,(e.g. `count_if(temperature>20)`) | INT64 | | APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated, supports all data types.
`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | +| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function approximately computes the k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. | `x`: The column to be calculated, supporting all existing data types in IoTDB;
`k`: The number of most frequent values to return;
`capacity`: The number of buckets used for computation. A larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | | SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | MAX | Finds the maximum value. | All types | Same as input type | @@ -251,8 +252,28 @@ Total line number = 1 It costs 0.022s ``` +#### 2.3.5 Approx_most_frequent -#### 2.3.5 First +Query the top 2 most frequent values in the `temperature` column of `table1`. + +```sql +IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; +``` + +The execution result is as follows: + +```sql ++-------------------+ +| topk| ++-------------------+ +|{"85.0":6,"90.0":5}| ++-------------------+ +Total line number = 1 +It costs 0.064s +``` + + +#### 2.3.6 First Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -272,7 +293,7 @@ Total line number = 1 It costs 0.170s ``` -#### 2.3.6 Last +#### 2.3.7 Last Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -292,7 +313,7 @@ Total line number = 1 It costs 0.211s ``` -#### 2.3.7 First_by +#### 2.3.8 First_by Finds the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. @@ -312,7 +333,7 @@ Total line number = 1 It costs 0.269s ``` -#### 2.3.8 Last_by +#### 2.3.9 Last_by Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. 
@@ -332,7 +353,7 @@ Total line number = 1 It costs 0.070s ``` -#### 2.3.9 Max_by +#### 2.3.10 Max_by Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. @@ -352,7 +373,7 @@ Total line number = 1 It costs 0.172s ``` -#### 2.3.10 Min_by +#### 2.3.11 Min_by Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum. diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md index 528d738e4..55e10c4a1 100644 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md +++ b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function.md @@ -159,7 +159,8 @@ SELECT LEAST(temperature,humidity) FROM table2; |-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------| | COUNT | Counts the number of data points. | All types | INT64 | | COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy the specified boolean expression. | exp must be a boolean expression, for example count_if(temperature>20) | INT64 | -| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[,maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the approximate number of distinct input values. | x: The column to be calculated; supports all types;
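maxStandardError: Specifies the maximum standard error the function may produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 | +| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[,maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the approximate number of distinct input values. | `x`: The column to be calculated; supports all types;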
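`maxStandardError`: Specifies the maximum standard error the function may produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 | +| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function approximately computes the k most frequent elements in a dataset. It returns a JSON-formatted string in which each key is an element value and each value is that element's approximate frequency. | `x`: The column to be calculated; supports all existing data types in IoTDB;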
`k`: The number of most frequent values to return;
`capacity`: The number of buckets used for computation. A larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | | SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | MAX | Finds the maximum value. | All types | Same as input type | @@ -251,7 +252,28 @@ It costs 0.022s ``` -#### 2.3.5 First +#### 2.3.5 Approx_most_frequent + +Query the 2 most frequent values in the `temperature` column of `table1`. + +```sql +IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; +``` + +The execution result is as follows: + +```sql ++-------------------+ +| topk| ++-------------------+ +|{"85.0":6,"90.0":5}| ++-------------------+ +Total line number = 1 +It costs 0.064s +``` + + +#### 2.3.6 First Queries the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -271,7 +293,7 @@ Total line number = 1 It costs 0.170s ``` -#### 2.3.6 Last +#### 2.3.7 Last Queries the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -291,7 +313,7 @@ Total line number = 1 It costs 0.211s ``` -#### 2.3.7 First_by +#### 2.3.8 First_by Queries the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. @@ -311,7 +333,7 @@ Total line number = 1 It costs 0.269s ``` -#### 2.3.8 Last_by +#### 2.3.9 Last_by Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. @@ -331,7 +353,7 @@ Total line number = 1 It costs 0.070s ``` -#### 2.3.9 Max_by +#### 2.3.10 Max_by Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. @@ -351,7 +373,7 @@ Total line number = 1 It costs 0.172s ``` -#### 2.3.10 Min_by +#### 2.3.11 Min_by Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum. diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md index 528d738e4..9a66103b7 100644 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md +++ b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function.md @@ -159,7 +159,8 @@ SELECT LEAST(temperature,humidity) FROM table2; 
|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------------| | COUNT | Counts the number of data points. | All types | INT64 | | COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy the specified boolean expression. | exp must be a boolean expression, for example count_if(temperature>20) | INT64 | -| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[,maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the approximate number of distinct input values. | x: The column to be calculated; supports all types;
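maxStandardError: Specifies the maximum standard error the function may produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 | +| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[,maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the approximate number of distinct input values. | `x`: The column to be calculated; supports all types;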
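`maxStandardError`: Specifies the maximum standard error the function may produce. Valid range is [0.0040625, 0.26]; defaults to 0.023 if not specified. | INT64 | +| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function approximately computes the k most frequent elements in a dataset. It returns a JSON-formatted string in which each key is an element value and each value is that element's approximate frequency. | `x`: The column to be calculated; supports all existing data types in IoTDB;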
`k`: The number of most frequent values to return;
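`capacity`: The number of buckets used for computation. A larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | | SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | | MAX | Finds the maximum value. | All types | Same as input type | @@ -250,8 +251,28 @@ Total line number = 1 It costs 0.022s ``` +#### 2.3.5 Approx_most_frequent -#### 2.3.5 First +Query the 2 most frequent values in the `temperature` column of `table1`. + +```sql +IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; +``` + +The execution result is as follows: + +```sql ++-------------------+ +| topk| ++-------------------+ +|{"85.0":6,"90.0":5}| ++-------------------+ +Total line number = 1 +It costs 0.064s +``` + + +#### 2.3.6 First Queries the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -271,7 +292,7 @@ Total line number = 1 It costs 0.170s ``` -#### 2.3.6 Last +#### 2.3.7 Last Queries the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. @@ -291,7 +312,7 @@ Total line number = 1 It costs 0.211s ``` -#### 2.3.7 First_by +#### 2.3.8 First_by Queries the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. @@ -311,7 +332,7 @@ Total line number = 1 It costs 0.269s ``` -#### 2.3.8 Last_by +#### 2.3.9 Last_by Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. @@ -331,7 +352,7 @@ Total line number = 1 It costs 0.070s ``` -#### 2.3.9 Max_by +#### 2.3.10 Max_by Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. @@ -351,7 +372,7 @@ Total line number = 1 It costs 0.172s ``` -#### 2.3.10 Min_by +#### 2.3.11 Min_by Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum.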
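
Reviewer note on the `capacity` trade-off described for APPROX_MOST_FREQUENT: the sketch below illustrates how a bounded number of counters can yield approximate top-k frequencies. It uses the well-known Space-Saving scheme as a stand-in; whether IoTDB implements the function this way is an assumption, and the helper name, the skipping of NULL inputs, and the sample readings are all hypothetical, chosen only to mirror the documented output format.

```python
import json


def approx_most_frequent(values, k, capacity):
    """Illustrative Space-Saving top-k; NOT IoTDB's actual implementation.

    Keeps at most `capacity` counters, so memory is bounded by `capacity`
    while counts may be overestimated once counters start being evicted.
    """
    counters = {}
    for value in values:
        if value is None:  # assumption: NULL inputs are ignored
            continue
        key = str(value)
        if key in counters:
            counters[key] += 1
        elif len(counters) < capacity:
            counters[key] = 1
        else:
            # All buckets are in use: evict the smallest counter and let the
            # newcomer inherit its count plus one (source of overestimation).
            victim = min(counters, key=counters.get)
            counters[key] = counters.pop(victim) + 1
    # Highest count first; ties broken by key for a deterministic result.
    top = sorted(counters.items(), key=lambda item: (-item[1], item[0]))[:k]
    return json.dumps(dict(top), separators=(",", ":"))


# Made-up readings shaped like the documented example.
readings = [85.0] * 6 + [90.0] * 5 + [88.0] * 3
print(approx_most_frequent(readings, 2, 100))  # {"85.0":6,"90.0":5}

# With too few buckets, "c" inherits the evicted count and is overestimated.
print(approx_most_frequent(["a", "a", "b", "c"], 2, 2))  # {"a":2,"c":2}
```

In the second call the true count of `"c"` is 1, yet it is reported as 2, which is the kind of error a small `capacity` introduces; with enough buckets to hold every distinct value the counts are exact.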