avro: Add name strategy option for schema registry #3936

Closed
wants to merge 10 commits into from

Commits on Dec 17, 2021

  1. avro: Add name strategy option for schema registry

    This is for issue pingcap#1147
    
    When a changefeed is created with `--opts nameStrategy=topic`, the
    topic name is now used as the basis for the subject names in the
    schema registry. Before this PR the basis was always
    `{schema}_{table}`.
    
    This solves the case where KSQL expects the topic name to be used as
    the basis for the schema registry subjects.
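    
    The difference between the two strategies can be sketched roughly as
    follows. This is only an illustration, not the code in this PR:
    `subjectBase` is a hypothetical helper, the `-key`/`-value` suffixes are
    taken from the registry output in the example, and the schema name
    `test` is an assumption about the playground's default database.
    
    ```go
    package main
    
    import "fmt"
    
    // Suffixes as they appear in the schema registry subjects
    // ("cdctest-key", "cdctest-value").
    const (
    	keySuffix   = "-key"
    	valueSuffix = "-value"
    )
    
    // subjectBase is a hypothetical helper, not the actual TiCDC function.
    // With nameStrategy "topic" the Kafka topic name is the basis; any
    // other value falls back to the previous `{schema}_{table}` basis.
    func subjectBase(nameStrategy, topic, schema, table string) string {
    	if nameStrategy == "topic" {
    		return topic
    	}
    	return schema + "_" + table
    }
    
    func main() {
    	// Schema "test" is assumed here; topic and table come from the example.
    	for _, strategy := range []string{"", "topic"} {
    		base := subjectBase(strategy, "cdctest", "test", "t1")
    		fmt.Printf("nameStrategy=%q: %s, %s\n",
    			strategy, base+keySuffix, base+valueSuffix)
    	}
    }
    ```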
    
    == Example
    
    Set up a test environment and create the changefeed:
    
    ```
    tiup playground --without-monitor --tiflash 0 v5.3.0
    confluent local services start
    ./bin/cdc server
    ./bin/cdc cli changefeed create --no-confirm \
    --changefeed-id="simple-replication-task" \
    --sort-engine="unified" \
    --sink-uri="kafka://127.0.0.1:9092/cdctest?protocol=avro&kafka-version=2.8.0" \
    --opts registry="http://127.0.0.1:8081" \
    --opts nameStrategy=topic
    ```
    
    To verify the `nameStrategy`:
    ```
    $ ./bin/cdc cli changefeed query -c simple-replication-task | jq '.info.opts.nameStrategy'
    "topic"
    ```
    
    Now let's create a table that uses this changefeed:
    ```
    CREATE TABLE t1 (id INT PRIMARY KEY, n VARCHAR(200) NOT NULL, ts TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6));
    INSERT INTO t1(id, n) VALUES (1, 'test');
    ```
    
    This now shows up in KSQL like this:
    ```
    ksql> SHOW TOPICS;
    
     Kafka Topic                 | Partitions | Partition Replicas
    ---------------------------------------------------------------
     cdctest                     | 3          | 1
     default_ksql_processing_log | 1          | 1
    ---------------------------------------------------------------
    ksql> PRINT cdctest FROM BEGINNING;
    Key format: AVRO or KAFKA_STRING
    Value format: AVRO
    rowtime: 2021/12/17 07:06:07.227 Z, key: {"id": 1}, value: {"id": 1, "n": "test", "ts": 1639724765242}, partition: 0
    ^CTopic printing ceased
    ```
    
    And in the schema registry:
    ```
    $ curl -s http://127.0.0.1:8081/subjects | jq
    [
      "cdctest-key",
      "cdctest-value"
    ]
    ```
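    
    The same check could be automated. The sketch below takes the JSON body
    returned by `/subjects` and reports which of the expected
    `<topic>-key` and `<topic>-value` subjects are missing.
    `missingSubjects` is a hypothetical helper written for this example, and
    the response body is hard-coded rather than fetched over HTTP.
    
    ```go
    package main
    
    import (
    	"encoding/json"
    	"fmt"
    )
    
    // missingSubjects parses a /subjects response body (a JSON array of
    // strings) and returns the expected subjects for topic that are absent.
    func missingSubjects(body []byte, topic string) ([]string, error) {
    	var subjects []string
    	if err := json.Unmarshal(body, &subjects); err != nil {
    		return nil, err
    	}
    	present := make(map[string]bool, len(subjects))
    	for _, s := range subjects {
    		present[s] = true
    	}
    	var missing []string
    	for _, want := range []string{topic + "-key", topic + "-value"} {
    		if !present[want] {
    			missing = append(missing, want)
    		}
    	}
    	return missing, nil
    }
    
    func main() {
    	// Body copied from the curl output above.
    	body := []byte(`["cdctest-key", "cdctest-value"]`)
    	missing, err := missingSubjects(body, "cdctest")
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println("missing subjects:", missing)
    }
    ```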
    
    Because the subjects are named after the topic, KSQL can now create a stream like this:
    ```
    ksql> CREATE STREAM t1str WITH (KAFKA_TOPIC='cdctest', VALUE_FORMAT='avro');
    
     Message
    ----------------
     Stream created
    ----------------
    
    ksql> DESCRIBE t1str;
    
    Name                 : T1STR
     Field | Type
    -------------------------
     ID    | INTEGER
     N     | VARCHAR(STRING)
     TS    | TIMESTAMP
    -------------------------
    For runtime statistics and query details run: DESCRIBE <Stream,Table> EXTENDED;
    ```
    
    Note that we didn't have to specify the columns in the `CREATE STREAM`
    command, as these were fetched from the schema registry.
    
    See also:
    - https://docs.ksqldb.io/en/latest/operate-and-deploy/schema-registry-integration/#schema-inference
    dveeden committed Dec 17, 2021
    e56fbaf
  2. Updated based on review

    dveeden committed Dec 17, 2021
    15290e0

Commits on Dec 20, 2021

  1. ac39db1
  2. Fix tests

    dveeden committed Dec 20, 2021
    b488245
  3. 2186a29
  4. e222a61
  5. Fix lint issue

    dveeden committed Dec 20, 2021
    037e837

Commits on Dec 30, 2021

  1. ce3cae0

Commits on Jan 5, 2022

  1. 20e9978

Commits on Jan 11, 2022

  1. 6292b80