
Commit

ticdc: add binary_encoding_method to openapi v2 (#14596) (#15616)
ti-chi-bot authored Dec 7, 2023
1 parent d000f96 commit a6d5b58
Showing 3 changed files with 48 additions and 1 deletion.
31 changes: 30 additions & 1 deletion ticdc/ticdc-changefeed-config.md
@@ -180,9 +180,18 @@ flush-interval = 2000
# The storage URI of the redo log.
# The default value is empty.
storage = ""
# Specifies whether to store the redo log in a file.
# Specifies whether to store the redo log in a local file.
# The default value is false.
use-file-backend = false
# The number of encoding and decoding workers in the redo module.
# The default value is 16.
encoding-worker-num = 16
# The number of flushing workers in the redo module.
# The default value is 8.
flush-worker-num = 8
# The behavior to compress redo log files.
# Available options are "" and "lz4". The default value is "", which means no compression.
compression = ""
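# A minimal illustrative redo setup (the bucket name and endpoint below are placeholders, not defaults):
# storage = "s3://redo-backup/test-changefeed?endpoint=http://127.0.0.1:9000"
# compression = "lz4"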

# The following configuration items only take effect when the downstream is Kafka. Supported starting from v6.5.3.
[sink.kafka-config]
@@ -200,4 +209,24 @@ sasl-oauth-scopes = ["producer.kafka", "consumer.kafka"]
sasl-oauth-grant-type = "client_credentials"
# The audience in the Kafka SASL OAUTHBEARER authentication. The default value is empty. This parameter is optional when the OAUTHBEARER authentication is used.
sasl-oauth-audience = "kafka"

[sink.cloud-storage-config]
# The concurrency for saving data changes to the downstream cloud storage.
# The default value is 16.
worker-count = 16
# The interval for saving data changes to the downstream cloud storage.
# The default value is "2s".
flush-interval = "2s"
# A data change file is saved to the cloud storage when the number of bytes in this file exceeds `file-size`.
# The default value is 67108864 (that is, 64 MiB).
file-size = 67108864
# The duration to retain files, which takes effect only when `date-separator` is configured as `day`. Assume that `file-expiration-days = 1` and `file-cleanup-cron-spec = "0 0 0 * * *"`. Then TiCDC performs a cleanup at 00:00:00 every day on files that have been stored for more than 24 hours. For example, at 00:00:00 on 2023/12/02, TiCDC cleans up files generated before 2023/12/01, while files generated on 2023/12/01 remain unaffected.
# The default value is 0, which means file cleanup is disabled.
file-expiration-days = 0
# The running cycle of the scheduled cleanup task, compatible with the crontab configuration, with a format of `<Second> <Minute> <Hour> <Day of the month> <Month> <Day of the week (Optional)>`.
# The default value is "0 0 2 * * *", which means that the cleanup task is executed every day at 2 AM.
file-cleanup-cron-spec = "0 0 2 * * *"
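# For example (illustrative, not a default), to run the cleanup at midnight every day instead:
# file-cleanup-cron-spec = "0 0 0 * * *"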
# The concurrency for uploading a single file.
# The default value is 1, which means concurrency is disabled.
flush-concurrency = 1
```
16 changes: 16 additions & 0 deletions ticdc/ticdc-open-api-v2.md
@@ -289,6 +289,9 @@ The `consistent` parameters are described as follows:
| `level` | `STRING` type. The consistency level of the replicated data. (Optional) |
| `max_log_size` | `UINT64` type. The maximum size of the redo log. (Optional) |
| `storage` | `STRING` type. The destination address of the storage. (Optional) |
| `use_file_backend` | `BOOL` type. Specifies whether to store the redo log in a local file. (Optional) |
| `encoding_worker_num` | `INT` type. The number of encoding and decoding workers in the redo module. (Optional) |
| `flush_worker_num` | `INT` type. The number of flushing workers in the redo module. (Optional) |
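
A minimal sketch of the `consistent` object in an Open API v2 request body, using only the parameters listed above; the values shown (including the S3 URI) are illustrative assumptions, not defaults:

```json
{
  "consistent": {
    "level": "eventual",
    "max_log_size": 64,
    "storage": "s3://redo-backup/test-changefeed?endpoint=http://127.0.0.1:9000",
    "use_file_backend": false,
    "encoding_worker_num": 16,
    "flush_worker_num": 8
  }
}
```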

The `filter` parameters are described as follows:

@@ -333,6 +336,7 @@ The `sink` parameters are described as follows:
| `schema_registry` | `STRING` type. The schema registry address. (Optional) |
| `terminator` | `STRING` type. The terminator is used to separate two data change events. The default value is null, which means `"\r\n"` is used as the terminator. (Optional) |
| `transaction_atomicity` | `STRING` type. The atomicity level of the transaction. (Optional) |
| `cloud_storage_config` | The storage sink configuration. (Optional) |

`sink.column_selectors` is an array. The parameters are described as follows:

@@ -349,6 +353,7 @@ The `sink.csv` parameters are described as follows:
| `include_commit_ts` | `BOOLEAN` type. Whether to include commit-ts in CSV rows. The default value is `false`. |
| `null` | `STRING` type. The character that is displayed when a CSV column is null. The default value is `\N`. |
| `quote` | `STRING` type. The quotation character used to surround fields in the CSV file. If the value is empty, no quotation is used. The default value is `"`. |
| `binary_encoding_method` | `STRING` type. The encoding method of binary data, which can be `"base64"` or `"hex"`. The default value is `"base64"`. |
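
As a sketch, a `sink.csv` fragment that sets the newly added `binary_encoding_method` field to `hex` (the other values simply restate the defaults from the table above):

```json
{
  "sink": {
    "csv": {
      "include_commit_ts": false,
      "null": "\\N",
      "quote": "\"",
      "binary_encoding_method": "hex"
    }
  }
}
```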

`sink.dispatchers`: for the sink of MQ type, you can use this parameter to configure the event dispatcher. The following dispatchers are supported: `default`, `ts`, `rowid`, and `table`. The dispatcher rules are as follows:

@@ -365,6 +370,17 @@ The `sink.csv` parameters are described as follows:
| `partition` | `STRING` type. The target partition for dispatching events. |
| `topic` | `STRING` type. The target topic for dispatching events. |
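
A sketch of a `sink.dispatchers` entry that routes matched tables through the `ts` dispatcher; the `matcher` field and all values here are illustrative assumptions:

```json
{
  "sink": {
    "dispatchers": [
      {
        "matcher": ["test.*"],
        "partition": "ts",
        "topic": "cdc_test"
      }
    ]
  }
}
```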

`sink.cloud_storage_config` parameters are described as follows:

| Parameter name | Description |
|:-----------------|:---------------------------------------|
| `worker_count` | `INT` type. The concurrency for saving data changes to the downstream cloud storage. |
| `flush_interval` | `STRING` type. The interval for saving data changes to the downstream cloud storage. |
| `file_size` | `INT` type. A data change file is saved to the cloud storage when the number of bytes in this file exceeds the value of this parameter. |
| `file_expiration_days` | `INT` type. The duration to retain files, which takes effect only when `date-separator` is configured as `day`. |
| `file_cleanup_cron_spec` | `STRING` type. The running cycle of the scheduled cleanup task, compatible with the crontab configuration, with a format of `<Second> <Minute> <Hour> <Day of the month> <Month> <Day of the week (Optional)>`. |
| `flush_concurrency` | `INT` type. The concurrency for uploading a single file. |
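
A sketch of a `sink.cloud_storage_config` object that mirrors the TOML example earlier in this commit; the values are illustrative:

```json
{
  "sink": {
    "cloud_storage_config": {
      "worker_count": 16,
      "flush_interval": "2s",
      "file_size": 67108864,
      "file_expiration_days": 1,
      "file_cleanup_cron_spec": "0 0 2 * * *",
      "flush_concurrency": 1
    }
  }
}
```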

### Example

The following request creates a replication task with an ID of `test5` and a `sink_uri` of `blackhole://`.
2 changes: 2 additions & 0 deletions ticdc/ticdc-server-config.md
@@ -40,6 +40,8 @@ data-dir = ""
gc-ttl = 86400 # 24 h
tz = "System"
cluster-id = "default"
# The maximum memory threshold (in bytes) for GOGC tuning. A smaller threshold increases the GC frequency; a larger threshold reduces the GC frequency but lets the TiCDC process consume more memory. Once memory usage exceeds this threshold, GOGC Tuner stops working. The default value is 0, which means GOGC Tuner is disabled.
gc-tuner-memory-threshold = 0
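# For example (an illustrative value, not a recommendation), to keep GOGC Tuner active until TiCDC uses about 8 GiB of memory:
# gc-tuner-memory-threshold = 8589934592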

[security]
ca-path = ""
