From 1768fc9b9bf6f4461e0180985cb8234563904180 Mon Sep 17 00:00:00 2001 From: Anonymitaet Date: Wed, 18 Sep 2019 16:48:35 +0800 Subject: [PATCH 1/3] Add *Kinesis source connector guide* --- site2/docs/io-kinesis-source.md | 62 +++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) create mode 100644 site2/docs/io-kinesis-source.md diff --git a/site2/docs/io-kinesis-source.md b/site2/docs/io-kinesis-source.md new file mode 100644 index 0000000000000..a5bf27d140c3e --- /dev/null +++ b/site2/docs/io-kinesis-source.md @@ -0,0 +1,62 @@ +--- +id: io-kinesis-source +title: Kinesis source connector +sidebar_label: Kinesis source connector +--- + +The Kinesis source connector pulls data from Amazon Kinesis and persists data into Pulsar. + +## Configuration + +The configuration of the Kinesis source connector has the following parameters. + +### Parameter + +| Name | Type|Required | Default | Description +|------|----------|----------|---------|-------------| +`initialPositionInStream`|InitialPositionInStream|false|LATEST|The position where the connector starts from.

Below are the available options:

  • `AT_TIMESTAMP`: start from the record at or after the specified timestamp.

  • `LATEST`: start after the most recent data record.

  • `TRIM_HORIZON`: start from the oldest available data record. +`startAtTime`|Date|false|" " (empty string)|If set to `AT_TIMESTAMP`, it specifies the point in time to start consumption. +`applicationName`|String|false|Pulsar IO connector|The name of the Amazon Kinesis application.

    By default, the application name is included in the user agent string used to make AWS requests. This can assist with troubleshooting, for example, distinguish requests made by separate connector instances. +`checkpointInterval`|long|false|60000|The frequency of the Kinesis stream checkpoint in milliseconds. +`backoffTime`|long|false|3000|The amount of time to delay between requests when the connector encounters a throttling exception from AWS Kinesis in milliseconds. +`numRetries`|int|false|3|The number of re-attempts when the connector encounters an exception while trying to set a checkpoint. +`receiveQueueSize`|int|false|1000|The maximum number of AWS records that can be buffered inside the connector.

    Once the `receiveQueueSize` is reached, the connector does not consume any messages from Kinesis until some messages in the queue are successfully consumed. + +### Example + +Before using the Kinesis source connector, you need to create a configuration file through one of the following methods. + +* JSON + + ```json + { + "awsEndpoint": "https://some.endpoint.aws", + "awsRegion": "us-east-1", + "awsKinesisStreamName": "my-stream", + "awsCredentialPluginParam": "{\"accessKey\":\"myKey\",\"secretKey\":\"my-Secret\"}", + "applicationName": "My test application", + "checkpointInterval": "30000", + "backoffTime": "4000", + "numRetries": "3", + "receiveQueueSize": 2000, + "initialPositionInStream": "TRIM_HORIZON", + "startAtTime": "2019-03-05T19:28:58.000Z" + } + ``` + +* YAML + + ```yaml + configs: + awsEndpoint: "https://some.endpoint.aws" + awsRegion: "us-east-1" + awsKinesisStreamName: "my-stream" + awsCredentialPluginParam: "{\"accessKey\":\"myKey\",\"secretKey\":\"my-Secret\"}" + applicationName: "My test application" + checkpointInterval: "30000" + backoffTime: "4000" + numRetries: "3" + receiveQueueSize: 2000 + initialPositionInStream: "TRIM_HORIZON" + startAtTime: "2019-03-05T19:28:58.000Z" + ``` \ No newline at end of file From d8715e2407a7bb0ed54456b21728b9d53ba73cc7 Mon Sep 17 00:00:00 2001 From: Anonymitaet Date: Wed, 18 Sep 2019 17:37:03 +0800 Subject: [PATCH 2/3] update --- site2/docs/io-kinesis-source.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/site2/docs/io-kinesis-source.md b/site2/docs/io-kinesis-source.md index a5bf27d140c3e..b26c3c84d0270 100644 --- a/site2/docs/io-kinesis-source.md +++ b/site2/docs/io-kinesis-source.md @@ -21,6 +21,11 @@ The configuration of the Kinesis source connector has the following parameters. `backoffTime`|long|false|3000|The amount of time to delay between requests when the connector encounters a throttling exception from AWS Kinesis in milliseconds. `numRetries`|int|false|3|The number of re-attempts when the connector encounters an exception while trying to set a checkpoint. `receiveQueueSize`|int|false|1000|The maximum number of AWS records that can be buffered inside the connector.

    Once the `receiveQueueSize` is reached, the connector does not consume any messages from Kinesis until some messages in the queue are successfully consumed. +`awsEndpoint`|String|false|" " (empty string)|The Kinesis end-point URL, which can be found at [here](https://docs.aws.amazon.com/general/latest/gr/rande.html). +`awsRegion`|String|false|" " (empty string)|The AWS region.

    **Example**
    us-west-1, us-west-2 +`awsKinesisStreamName`|String|true|" " (empty string)|The Kinesis stream name. +`awsCredentialPluginName`|String|false|" " (empty string)|The fully-qualified class name of implementation of {@inject: github:`AwsCredentialProviderPlugin`:/pulsar-io/kinesis/src/main/java/org/apache/pulsar/io/kinesis/AwsCredentialProviderPlugin.java}.

    It is a factory class which creates an AWSCredentialsProvider that is used by Kinesis sink.

    If it is empty, the Kinesis sink creates a default AWSCredentialsProvider which accepts json-map of credentials in `awsCredentialPluginParam`. +`awsCredentialPluginParam`|String |false|" " (empty string)|The JSON parameter to initialize `awsCredentialsProviderPlugin`. ### Example From 5d773d068767631283349e5a57b265c0bbc661a6 Mon Sep 17 00:00:00 2001 From: Anonymitaet <50226895+Anonymitaet@users.noreply.github.com> Date: Mon, 23 Sep 2019 17:33:04 +0800 Subject: [PATCH 3/3] Update --- site2/docs/io-kinesis-source.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/site2/docs/io-kinesis-source.md b/site2/docs/io-kinesis-source.md index b26c3c84d0270..e938367f53058 100644 --- a/site2/docs/io-kinesis-source.md +++ b/site2/docs/io-kinesis-source.md @@ -58,10 +58,10 @@ Before using the Kinesis source connector, you need to create a configuration fi awsKinesisStreamName: "my-stream" awsCredentialPluginParam: "{\"accessKey\":\"myKey\",\"secretKey\":\"my-Secret\"}" applicationName: "My test application" - checkpointInterval: "30000" - backoffTime: "4000" - numRetries: "3" + checkpointInterval: 30000 + backoffTime: 4000 + numRetries: 3 receiveQueueSize: 2000 initialPositionInStream: "TRIM_HORIZON" startAtTime: "2019-03-05T19:28:58.000Z" - ``` \ No newline at end of file + ```