Refactor Catalog API #1934

Merged
cgardens merged 5 commits into master from chris/api_catalog
Feb 15, 2021

Conversation

ChristopheDuong
Contributor

@ChristopheDuong ChristopheDuong commented Feb 3, 2021

What

Closes #1929

How

Implements option 1 of #1924:

The intermediate StandardDataSchema layer in the back-end is now removed. We can directly convert an AirbyteCatalog object from the Airbyte Protocol into a new AirbyteCatalog object generated from the API, which reproduces almost the same fields and structures as the Airbyte Protocol one.

The main difference is that the API doesn't make a distinction between an AirbyteCatalog and a ConfiguredAirbyteCatalog: The API AirbyteCatalog has a default AirbyteStreamConfiguration for each stream included.
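
As a rough illustration of "a default AirbyteStreamConfiguration for each stream", here is a minimal sketch. The classes below are simplified, hypothetical stand-ins (not the generated API model or the protocol classes), and `cleanName` only approximates what `Names.toAlphanumericAndUnderscore` is assumed to do:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CatalogConversionSketch {

  // Hypothetical, simplified stand-in for a stream in the Airbyte Protocol catalog.
  static final class ProtocolStream {
    final String name;
    final List<String> defaultCursorField;

    ProtocolStream(String name, List<String> defaultCursorField) {
      this.name = name;
      this.defaultCursorField = defaultCursorField;
    }
  }

  // Hypothetical stand-in for the API's per-stream configuration.
  static final class StreamConfiguration {
    final String cleanedName;
    final List<String> cursorField;
    final boolean selected;

    StreamConfiguration(String cleanedName, List<String> cursorField, boolean selected) {
      this.cleanedName = cleanedName;
      this.cursorField = cursorField;
      this.selected = selected;
    }
  }

  // One API catalog entry: the stream plus its configuration.
  static final class StreamAndConfiguration {
    final ProtocolStream stream;
    final StreamConfiguration config;

    StreamAndConfiguration(ProtocolStream stream, StreamConfiguration config) {
      this.stream = stream;
      this.config = config;
    }
  }

  // Assumption: replaces every character that is not a letter, digit or underscore.
  // The real Names.toAlphanumericAndUnderscore may behave differently.
  static String cleanName(String name) {
    return name.replaceAll("[^A-Za-z0-9_]", "_");
  }

  // Every protocol stream gets a default configuration: cleaned name,
  // default cursor field, and selected = true.
  static List<StreamAndConfiguration> toApiCatalog(List<ProtocolStream> protocolStreams) {
    List<StreamAndConfiguration> result = new ArrayList<>();
    for (ProtocolStream s : protocolStreams) {
      StreamConfiguration defaults =
          new StreamConfiguration(cleanName(s.name), s.defaultCursorField, true);
      result.add(new StreamAndConfiguration(s, defaults));
    }
    return result;
  }

  public static void main(String[] args) {
    List<StreamAndConfiguration> api = toApiCatalog(
        Arrays.asList(new ProtocolStream("user events", Arrays.asList("updated_at"))));
    for (StreamAndConfiguration e : api) {
      System.out.println(e.stream.name + " -> " + e.config.cleanedName
          + ", selected=" + e.config.selected);
    }
  }
}
```

The point of the sketch is only that the API never hands out a "bare" stream: each entry always carries a configuration, so a separate configured-catalog type is not needed on the API side.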

The other big change is that, instead of storing an array of Fields, we now manipulate the JSON Schema objects directly, so the catalog can represent complex structures such as nested schemas and arrays.

Note that:

  • by doing so, we also lose the dataType, cleanedName, and selected attributes of each field (could they be added back inside the JSON Schema itself?)
  • we still keep the cleanedName and selected attributes at the stream level
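
To make the "nested schema and arrays" point concrete, here is a small sketch that models a JSON Schema as nested maps and walks it. Everything in it (the helper names, the example stream, the dotted path notation) is hypothetical and only for illustration; the actual code presumably works on a JSON tree type rather than on `Map`:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JsonSchemaSketch {

  static Map<String, Object> primitive(String type) {
    Map<String, Object> schema = new LinkedHashMap<>();
    schema.put("type", type);
    return schema;
  }

  static Map<String, Object> object(Map<String, Object> properties) {
    Map<String, Object> schema = primitive("object");
    schema.put("properties", properties);
    return schema;
  }

  static Map<String, Object> array(Map<String, Object> items) {
    Map<String, Object> schema = primitive("array");
    schema.put("items", items);
    return schema;
  }

  // Hypothetical stream schema: a scalar, a nested object, and an array of objects.
  static Map<String, Object> exampleStreamSchema() {
    Map<String, Object> address = new LinkedHashMap<>();
    address.put("city", primitive("string"));

    Map<String, Object> order = new LinkedHashMap<>();
    order.put("total", primitive("number"));

    Map<String, Object> root = new LinkedHashMap<>();
    root.put("id", primitive("integer"));
    root.put("address", object(address));
    root.put("orders", array(object(order)));
    return object(root);
  }

  // Lists every property path in the schema, descending into objects and array items.
  static List<String> propertyPaths(Map<String, Object> schema) {
    List<String> paths = new ArrayList<>();
    collect(schema, "", paths);
    return paths;
  }

  @SuppressWarnings("unchecked")
  private static void collect(Map<String, Object> schema, String prefix, List<String> paths) {
    Object properties = schema.get("properties");
    if (properties instanceof Map) {
      for (Map.Entry<String, Object> entry : ((Map<String, Object>) properties).entrySet()) {
        String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
        paths.add(path);
        collect((Map<String, Object>) entry.getValue(), path, paths);
      }
    }
    Object items = schema.get("items");
    if (items instanceof Map) {
      // "[]" marks "each element of this array" in the illustrative path notation.
      collect((Map<String, Object>) items, prefix + "[]", paths);
    }
  }

  public static void main(String[] args) {
    System.out.println(propertyPaths(exampleStreamSchema()));
  }
}
```

A flat array of Fields could describe `id` but has no natural place for `address.city` or `orders[].total`, which is the kind of structure the JSON Schema keeps intact.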

Pre-merge Checklist

  • Run integration/Acceptance tests
  • Adapt FE to new API

Recommended reading order

  1. airbyte-api/src/main/openapi/config.yaml
  2. airbyte-server/src/main/java/io/airbyte/server/converters/SchemaConverter.java
  3. the rest

@ChristopheDuong ChristopheDuong marked this pull request as draft February 4, 2021 15:46
Contributor

@cgardens cgardens left a comment

A few comments but I think they are mostly superficial. Will approve now, but we cannot merge this until corresponding FE changes are made.

io.airbyte.api.model.AirbyteStreamConfiguration result = new AirbyteStreamConfiguration()
    .cleanedName(Names.toAlphanumericAndUnderscore(s.getName()))
    .cursorField(s.getDefaultCursorField())
    .selected(true);
Contributor

i don't like that these converters still aren't symmetrical. this is a fault in the BE data model (not in your implementation). we still lose information as the configuration comes back from the API. can you create an issue to make this symmetrical? i think ideally the backend model should look more like the FE. keeping all of the streams in the list but having a flag as to whether they are going to be synced. it's a bigger change though and we should not try to do it in this PR, but want to make sure we follow up.
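
A minimal sketch of the asymmetry described above, assuming the information loss comes from the backend keeping only the streams that will be synced. All class and method names here are hypothetical and do not correspond to the actual backend model:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SymmetricCatalogSketch {

  // Hypothetical API-side stream: every stream is listed, with a selected flag.
  static final class ApiStream {
    final String name;
    final boolean selected;

    ApiStream(String name, boolean selected) {
      this.name = name;
      this.selected = selected;
    }
  }

  // Hypothetical symmetric backend entry: keeps the stream and a sync flag.
  static final class BackendStream {
    final String name;
    final boolean syncEnabled;

    BackendStream(String name, boolean syncEnabled) {
      this.name = name;
      this.syncEnabled = syncEnabled;
    }
  }

  // Lossy direction: only selected streams survive the conversion.
  static List<String> toSelectedOnly(List<ApiStream> apiStreams) {
    List<String> names = new ArrayList<>();
    for (ApiStream s : apiStreams) {
      if (s.selected) {
        names.add(s.name);
      }
    }
    return names;
  }

  // Coming back, unselected streams are gone, so the round trip is not symmetric.
  static List<ApiStream> fromSelectedOnly(List<String> names) {
    List<ApiStream> result = new ArrayList<>();
    for (String name : names) {
      result.add(new ApiStream(name, true));
    }
    return result;
  }

  // Symmetric direction: keep every stream and carry the flag along.
  static List<BackendStream> toFlagged(List<ApiStream> apiStreams) {
    List<BackendStream> result = new ArrayList<>();
    for (ApiStream s : apiStreams) {
      result.add(new BackendStream(s.name, s.selected));
    }
    return result;
  }

  static List<ApiStream> fromFlagged(List<BackendStream> backendStreams) {
    List<ApiStream> result = new ArrayList<>();
    for (BackendStream s : backendStreams) {
      result.add(new ApiStream(s.name, s.syncEnabled));
    }
    return result;
  }

  public static void main(String[] args) {
    List<ApiStream> api = Arrays.asList(new ApiStream("users", true), new ApiStream("logs", false));
    System.out.println("lossy round trip keeps " + fromSelectedOnly(toSelectedOnly(api)).size() + " streams");
    System.out.println("flagged round trip keeps " + fromFlagged(toFlagged(api)).size() + " streams");
  }
}
```

With the flag stored in the backend, converting API to backend and back returns the same list, which is what "symmetrical" would mean for the follow-up issue.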


  final StandardSync standardSync = new StandardSync()
      .withConnectionId(connectionId)
      .withDestinationId(destinationId)
      .withSourceId(sourceId)
      .withStatus(StandardSync.Status.ACTIVE)
      .withName(CONNECTION_NAME)
-     .withSchema(schema);
+     .withSchema(catalog);
Contributor

this should probably be called catalog instead of schema. i'm sympathetic that changing these can be a bit of a pain, so if this is really awful to fix don't worry about it. but since we're tearing everything up now, this is as good a time as any.

Contributor Author

@ChristopheDuong ChristopheDuong Feb 5, 2021

Don't worry, it's not as painful as completely changing the object underneath haha

(at least on my side, I don't know if this affects @jamakase too much or not)

Contributor

I don't think it should affect him, since this model is only used behind the API. The only thing that should affect him is a change to the API interface (in config.yaml).

- $ref: "#/components/schemas/SourceSchemaStream"
- SourceSchemaStream:
+ $ref: "#/components/schemas/AirbyteStreamAndConfiguration"
+ AirbyteStreamAndConfiguration:
Contributor

What's the point of prefixing entities with Airbyte? Most of the other fields don't have this prefix. What is more, I don't think prepending Airbyte to every name adds any meaning.

Contributor

Also, @ChristopheDuong do you think it's a good idea to combine different parts of this object with And? What if we add one more field here at some point in the future? The current name would then be obsolete.

Contributor

Stream is a reserved word in Java, so that's why we tend to stick the word Airbyte in front of it. We will need to come up with a better name for this thing. The And is not ideal, but we just don't have a better name yet.

@ChristopheDuong ChristopheDuong marked this pull request as ready for review February 15, 2021 17:50
@cgardens cgardens merged commit f216f0b into master Feb 15, 2021
@cgardens cgardens deleted the chris/api_catalog branch February 15, 2021 21:40
Successfully merging this pull request may close these issues.

FE support describing schemas using JsonSchema