Commit
104528: changefeedccl: add full support for the parquet format r=miretskiy a=jayshrivastava

### changefeedccl: support key_in_value with parquet format

Previously, the option `key_in_value` was disallowed with `format=parquet`. This change allows these settings to be used together. Note that `key_in_value` is enabled by default with `cloudstorage` sinks and `format=parquet` is only allowed with `cloudstorage` sinks, so `key_in_value` is enabled for parquet by default.

Informs: #103129
Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None

---

### changefeedccl: add test coverage for parquet event types

When using `format=parquet`, an additional column is produced to indicate the type of operation corresponding to the row: create, update, or delete. This change adds unit test coverage for that column. Additionally, the test modified in this change is simplified by reducing the number of rows and distinct types, because all types are already tested within the `util/parquet` package.

Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None
Epic: None

---

### util/parquet: support tuple labels in util/parquet testutils

Previously, the test utilities in `util/parquet` did not reconstruct tuples read from files with their labels. This change updates the package to do so. This is required for testing by users of this package, such as CDC.

Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None

---

### changefeedccl: support diff option with parquet format

This change adds support for the `diff` changefeed option when using `format=parquet`. Enabling `diff` also adds support for CDC Transformations with parquet.

Informs: #103129
Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None

---

### changefeedccl: support end_time option with parquet format

This change adds support for the `end_time` changefeed option when using `format=parquet`. No significant code changes were needed to enable this feature.

Closes: #103129
Closes: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note (enterprise change): Changefeeds now officially support the parquet format at specification version 2.6. It is only usable with the cloudstorage sink. The syntax to use parquet is as follows: ``CREATE CHANGEFEED FOR foo INTO `...` WITH format=parquet``. It supports all standard changefeed options and features, including CDC transformations, except for the `topic_in_value` option.

---

### changefeedccl: use parquet with 50% probability in nemeses test

Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None

---

### do not merge: force parquet cloud storage tests

This change forces all tests, including tests for `diff` and `end_time`, to run with the `cloudstorage` sink and `format=parquet` where possible.

Informs: #103129
Informs: #99028
Epic: [CRDB-27372](https://cockroachlabs.atlassian.net/browse/CRDB-27372)
Release note: None

Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com>