feat(query): support CHANGES clause #15163

Merged
merged 9 commits into from
Apr 7, 2024

Member

@zhyass zhyass commented Apr 3, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  1. Remove the SINCE clause in feat(query): new implementation of analyze table #14725:
    SELECT ...
    FROM <fuse_table>
    [ AT ( { SNAPSHOT => <snapshot_id> | TIMESTAMP => <timestamp> } ) ]
    [ SINCE ( { SNAPSHOT => <snapshot_id> | TIMESTAMP => <timestamp> } ) ];

  2. Support the CHANGES clause:

SELECT ...
FROM ...
   CHANGES ( INFORMATION => { DEFAULT | APPEND_ONLY } )
   AT ( { TIMESTAMP => <timestamp> | SNAPSHOT => <sid> | OFFSET => <time_difference> | STREAM => '<name>' } )
   [ END( { TIMESTAMP => <timestamp> | OFFSET => <time_difference> | SNAPSHOT => <sid> } ) ]
[ ... ]

INFORMATION => { DEFAULT | APPEND_ONLY }: Specifies the type of change tracking data to return.

DEFAULT: Behaves like a standard stream and returns all DML changes, including inserts, updates, and deletes.

APPEND_ONLY: Behaves like an append-only stream and returns appended rows only.

The AT clause is required and sets the offset at which the change interval starts.

The optional END clause sets the end snapshot for the change interval. If no END value is specified, the latest snapshot is used as the end of the change interval.

mysql> create table t(a int, b int)change_tracking=true;
Query OK, 0 rows affected (0.10 sec)

mysql> insert into t values(1,1),(2,1);
Query OK, 2 rows affected (0.10 sec)

mysql> create stream s on table t append_only = true;
Query OK, 0 rows affected (0.10 sec)

mysql> create stream s1 on table t append_only = false;
Query OK, 0 rows affected (0.13 sec)

mysql> update t set b = 2 where a = 2;
Query OK, 1 row affected (0.39 sec)

mysql> insert into t values(3, 3);
Query OK, 1 row affected (0.18 sec)

mysql> select * from t order by a;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|    2 |    2 |
|    3 |    3 |
+------+------+
3 rows in set (0.17 sec)
Read 3 rows, 28.00 B in 0.085 sec., 35.49 rows/sec., 331.28 B/sec.

mysql> select * from s;
+------+------+---------------+------------------+----------------------------------------+
| a    | b    | change$action | change$is_update | change$row_id                          |
+------+------+---------------+------------------+----------------------------------------+
|    3 |    3 | INSERT        |                0 | 8ba7cb6acc7e4dd5aab03f9d32e0ccb8000000 |
+------+------+---------------+------------------+----------------------------------------+
1 row in set (0.19 sec)
Read 3 rows, 287.00 B in 0.021 sec., 144.64 rows/sec., 13.51 KiB/sec.

mysql> select * from s1 order by a,b;
+------+------+---------------+----------------------------------------+------------------+
| a    | b    | change$action | change$row_id                          | change$is_update |
+------+------+---------------+----------------------------------------+------------------+
|    2 |    1 | DELETE        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
|    2 |    2 | INSERT        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
|    3 |    3 | INSERT        | 8ba7cb6acc7e4dd5aab03f9d32e0ccb8000000 |                0 |
+------+------+---------------+----------------------------------------+------------------+
3 rows in set (0.40 sec)
Read 5 rows, 496.00 B in 0.111 sec., 45.16 rows/sec., 4.38 KiB/sec.

mysql> select * from t changes(information => default) at (stream => s) order by a, b;
+------+------+---------------+----------------------------------------+------------------+
| a    | b    | change$action | change$row_id                          | change$is_update |
+------+------+---------------+----------------------------------------+------------------+
|    2 |    1 | DELETE        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
|    2 |    2 | INSERT        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
|    3 |    3 | INSERT        | 8ba7cb6acc7e4dd5aab03f9d32e0ccb8000000 |                0 |
+------+------+---------------+----------------------------------------+------------------+
3 rows in set (0.39 sec)
Read 5 rows, 496.00 B in 0.101 sec., 49.6 rows/sec., 4.81 KiB/sec.

mysql> select * from t changes(information => append_only) at (stream => s) order by a, b;
+------+------+---------------+------------------+----------------------------------------+
| a    | b    | change$action | change$is_update | change$row_id                          |
+------+------+---------------+------------------+----------------------------------------+
|    3 |    3 | INSERT        |                0 | 8ba7cb6acc7e4dd5aab03f9d32e0ccb8000000 |
+------+------+---------------+------------------+----------------------------------------+
1 row in set (0.21 sec)
Read 3 rows, 287.00 B in 0.021 sec., 142.1 rows/sec., 13.28 KiB/sec.

mysql> select * from fuse_snapshot('default', 't');
+----------------------------------+----------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+------------+----------------------------+
| snapshot_id                      | snapshot_location                                  | format_version | previous_snapshot_id             | segment_count | block_count | row_count | bytes_uncompressed | bytes_compressed | index_size | timestamp                  |
+----------------------------------+----------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+------------+----------------------------+
| 95a2f43e12514a389fe8d0ff2e04e20b | 1/2414/_ss/95a2f43e12514a389fe8d0ff2e04e20b_v4.mpk |              4 | 459a0bcba385421d83fe5c2b361e2d4c |             2 |           2 |         3 |                111 |             2202 |       1050 | 2024-04-03 14:34:37.024354 |
| 459a0bcba385421d83fe5c2b361e2d4c | 1/2414/_ss/459a0bcba385421d83fe5c2b361e2d4c_v4.mpk |              4 | 1293db921f8d4042b396aa51855c988b |             1 |           1 |         2 |                101 |             1648 |        525 | 2024-04-03 14:34:34.307494 |
| 1293db921f8d4042b396aa51855c988b | 1/2414/_ss/1293db921f8d4042b396aa51855c988b_v4.mpk |              4 | NULL                             |             1 |           1 |         2 |                 18 |              564 |        525 | 2024-04-03 14:34:06.266322 |
+----------------------------------+----------------------------------------------------+----------------+----------------------------------+---------------+-------------+-----------+--------------------+------------------+------------+----------------------------+
3 rows in set (0.10 sec)
Read 3 rows, 600.00 B in 0.069 sec., 43.71 rows/sec., 8.54 KiB/sec.

mysql> select * from t changes(information => default) at (snapshot => '1293db921f8d4042b396aa51855c988b') end(timestamp => '2024-04-03 14:34:34.307494'::TIMESTAMP) order by a, b;
+------+------+---------------+----------------------------------------+------------------+
| a    | b    | change$action | change$row_id                          | change$is_update |
+------+------+---------------+----------------------------------------+------------------+
|    2 |    1 | DELETE        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
|    2 |    2 | INSERT        | d60dab8545424894ba9ea2cb0baad51e000001 |                1 |
+------+------+---------------+----------------------------------------+------------------+
2 rows in set (0.30 sec)
Read 4 rows, 294.00 B in 0.055 sec., 72.95 rows/sec., 5.24 KiB/sec.
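
The session above exercises STREAM in AT, and SNAPSHOT in AT combined with TIMESTAMP in END. As a rough sketch only, the grammar in the summary also appears to permit the forms below. They reuse table t and the snapshot IDs and timestamps from the fuse_snapshot output above, but they were not run as part of this description, so no output is shown and the exact results are not confirmed.

```sql
-- END omitted: per the description above, the latest snapshot is used
-- as the end of the change interval.
select * from t changes(information => append_only) at (snapshot => '459a0bcba385421d83fe5c2b361e2d4c') order by a, b;

-- Both ends of the interval given as snapshot IDs.
select * from t changes(information => default) at (snapshot => '1293db921f8d4042b396aa51855c988b') end(snapshot => '459a0bcba385421d83fe5c2b361e2d4c') order by a, b;

-- Start of the interval given as a timestamp instead of a snapshot ID.
select * from t changes(information => default) at (timestamp => '2024-04-03 14:34:34.307494'::TIMESTAMP) order by a, b;
```

OFFSET => <time_difference> is left out here because the summary does not state its unit or sign convention.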

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@zhyass zhyass marked this pull request as draft April 3, 2024 08:51
@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Apr 3, 2024
@zhyass zhyass force-pushed the feature_fix branch 2 times, most recently from 9c444c6 to 239b804 Compare April 3, 2024 14:43
@zhyass zhyass marked this pull request as ready for review April 3, 2024 16:03
@zhyass zhyass requested review from b41sh, sundy-li and dantengsky April 3, 2024 16:03
@zhyass zhyass force-pushed the feature_fix branch 2 times, most recently from 26adf7a to de5612d Compare April 3, 2024 16:19
@zhyass zhyass added this pull request to the merge queue Apr 7, 2024
Merged via the queue into databendlabs:main with commit 94788f8 Apr 7, 2024
75 checks passed
@zhyass zhyass deleted the feature_fix branch April 7, 2024 04:02
Labels
pr-feature this PR introduces a new feature to the codebase
Development

Successfully merging this pull request may close these issues.

Feature: support CHANGES clause
3 participants