Circus Train should handle Avro Table replication out of the box #131

Closed
abhimanyugupta07 opened this issue May 10, 2019 · 0 comments · Fixed by #141

abhimanyugupta07 commented May 10, 2019

Circus Train should detect when a table is an Avro table, possibly with an external schema, and automatically trigger the ct-avro transform so that the external schema is copied to the replica data lake.

Context

At the moment, the circus-train-avro transform is triggered only when the following configuration is provided in the CT config file:

transform-options:
    avro-serde-options:
      base-url: s3://shunting-yard-target/bdp/abhi_avro_test

If this configuration is not provided, CT performs a regular replication. As a result, the replica table keeps an avro.schema.url parameter that still points to the source table's schema location, which is incorrect.
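To make the problem concrete, here is a minimal Python sketch of what goes wrong. It is not Circus Train code: the function names, bucket names, and paths are hypothetical, and a plain replication is modelled simply as copying the table parameters unchanged. Only the `avro.schema.url` parameter name comes from the issue.

```python
AVRO_SCHEMA_URL = "avro.schema.url"  # Hive table parameter named in this issue


def copy_parameters_verbatim(source_params):
    """Model a plain replication: table parameters are copied unchanged."""
    return dict(source_params)


def schema_url_points_outside(params, replica_location):
    """Return True if avro.schema.url is set and is not under replica_location."""
    url = params.get(AVRO_SCHEMA_URL)
    if url is None:
        return False  # no external schema, nothing can be stale
    base = replica_location.rstrip("/") + "/"
    return not url.startswith(base)


# Hypothetical source table with an external schema file.
source_params = {AVRO_SCHEMA_URL: "s3://source-bucket/schemas/my_table.avsc"}
replica_params = copy_parameters_verbatim(source_params)

# The replica still references the schema in the source data lake.
stale = schema_url_points_outside(replica_params, "s3://replica-bucket/my_table")
```

Here `stale` ends up true because the replica's parameter still names a path in the source bucket.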

Proposed solution:

CT should detect that the table being replicated is an Avro table, trigger the ct-avro transform, and use the table's location as the default location for the schema.
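A rough sketch of the proposed behaviour, in Python for brevity rather than as a Circus Train implementation. Assumptions: a table counts as Avro if its SerDe is Hive's `org.apache.hadoop.hive.serde2.avro.AvroSerDe` or it carries an `avro.schema.url` parameter; "the table's location" is read as the replica table's location; and the schema file keeps its file name under the chosen base URL. All function names and the example paths are hypothetical.

```python
AVRO_SERDE = "org.apache.hadoop.hive.serde2.avro.AvroSerDe"  # Hive's Avro SerDe class
AVRO_SCHEMA_URL = "avro.schema.url"


def is_avro_table(serde_lib, table_params):
    """Assumed detection rule: Avro SerDe, or an external schema URL is present."""
    return serde_lib == AVRO_SERDE or AVRO_SCHEMA_URL in table_params


def resolve_schema_base_url(configured_base_url, replica_location):
    """Prefer an explicit base-url; otherwise default to the replica location."""
    return configured_base_url or replica_location


def plan_schema_destination(serde_lib, table_params, replica_location,
                            configured_base_url=None):
    """Return the replica-side schema URL, or None if no schema copy is needed."""
    if not is_avro_table(serde_lib, table_params):
        return None
    source_url = table_params.get(AVRO_SCHEMA_URL)
    if source_url is None:
        return None  # no external schema file to copy
    base = resolve_schema_base_url(configured_base_url, replica_location).rstrip("/")
    file_name = source_url.rstrip("/").rsplit("/", 1)[-1]
    return f"{base}/{file_name}"


# Hypothetical table parameters for illustration.
params = {AVRO_SCHEMA_URL: "s3://source-bucket/schemas/my_table.avsc"}
default_dest = plan_schema_destination(AVRO_SERDE, params, "s3://replica-bucket/my_table/")
explicit_dest = plan_schema_destination(
    AVRO_SERDE, params, "s3://replica-bucket/my_table/",
    configured_base_url="s3://shunting-yard-target/bdp/abhi_avro_test")
```

With no `base-url` configured, the destination falls back to the replica table's location; with the configuration from the Context section, the explicit `base-url` wins, so existing setups would keep their current behaviour.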
