update incremental docs #1364
docs/architecture/incremental.md
Outdated
In this flavor of incremental records in the warehouse will never be deleted or mutated. A new copy of any new or updated records is appended to the data in the warehouse. This means you can find multiple copies of the same record twice in the warehouse and will need to de-duplicate them yourself. We provided an "at least once" guarantee of replicating each record that is present when the sync runs.
In this flavor of incremental, records in the warehouse will never be deleted or mutated. A copy of each new or updated record is appended to the data in the warehouse. This means you can find multiple copies of the same record in the warehouse and will need to de-duplicate them yourself. We provide an "at least once" guarantee of replicating each record that is present when the sync runs.
In DBT, we could de-duplicate rows in the future.
Or show the user how to do it in the DBT models we generate...
Yeah we should write this tutorial! @cgardens do we already have an issue for that?
no we don't have one yet. please go ahead and make it!
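Until that tutorial exists, the de-duplication discussed in this thread could be sketched roughly as follows. This is a minimal illustration in plain Python rather than DBT, and it is not what Airbyte generates: the `deduplicate` helper, the `name` primary key, the `updated_at` cursor column, and the sample rows are all hypothetical. It keeps, for each primary key, the appended copy with the highest cursor value.

```python
from typing import Any, Dict, List


def deduplicate(
    records: List[Dict[str, Any]], primary_key: str, cursor_field: str
) -> List[Dict[str, Any]]:
    """Keep only the most recent copy of each record.

    Assumes every record carries both `primary_key` and `cursor_field`,
    and that a larger cursor value means a newer copy.
    """
    latest: Dict[Any, Dict[str, Any]] = {}
    for record in records:
        key = record[primary_key]
        current = latest.get(key)
        # ">=" lets a later-appended copy win when cursor values tie.
        if current is None or record[cursor_field] >= current[cursor_field]:
            latest[key] = record
    return list(latest.values())


# Hypothetical append-only data: two syncs, with one record replicated twice
# ("at least once" means exact duplicates are possible).
appended = [
    {"name": "king", "deceased": False, "updated_at": 1},
    {"name": "queen", "deceased": False, "updated_at": 1},
    {"name": "king", "deceased": True, "updated_at": 2},
    {"name": "queen", "deceased": True, "updated_at": 2},
    {"name": "king", "deceased": True, "updated_at": 2},
]

current_view = deduplicate(appended, primary_key="name", cursor_field="updated_at")
```

The same idea in a warehouse would typically be a window function partitioned by the primary key and ordered by the cursor column; the Python version only shows the logic.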
@@ -51,7 +45,7 @@ At the end of this incremental sync the data warehouse would now contain:

### Updating a Record

Let's assume that our warehouse contains all of the data that it did at the end of the previous section. Now unfortunately the king and queen lose their heads. Let's see that delta:
Let's assume that our warehouse contains all the data that it did at the end of the previous section. Now unfortunately the king and queen lose their heads. Let's see that delta: |
💀
docs/architecture/incremental.md
Outdated
@@ -72,6 +66,16 @@ The output we expect to see in the warehouse is as follows. | |||
] | |||
``` | |||
## Source-Defined Cursor
Some sources are able to determine the cursor that the use without any user input. For example, in the exchange rates api source, the source determines that date field should be used to determine the last record that was synced. In these cases, the source will set the `source_defined_cursor` attribute in the `AirbyteStream` (You can find a more detailed description of the configuration data model [here](catalog.md)). |
Some sources are able to determine the cursor that the use without any user input. For example, in the exchange rates api source, the source determines that date field should be used to determine the last record that was synced. In these cases, the source will set the `source_defined_cursor` attribute in the `AirbyteStream` (You can find a more detailed description of the configuration data model [here](catalog.md)).
Some sources are able to determine the cursor to use without any user input. For example, in the exchange rates api source, the source determines that date field should be used to determine the last record that was synced. In these cases, the source will set the `source_defined_cursor` attribute in the `AirbyteStream` (You can find a more detailed description of the configuration data model [here](catalog.md)). |
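To make the role of the cursor concrete, here is a rough sketch of how a source might use a date cursor to pick out only the records it has not synced yet. It is an illustration under assumptions, not the exchange rates connector's actual code: `read_incremental`, the `date` field name, and the sample rates are hypothetical, and dates are ISO strings so that string comparison matches chronological order.

```python
from typing import Any, Dict, List, Optional, Tuple


def read_incremental(
    records: List[Dict[str, Any]], cursor_field: str, last_cursor: Optional[str]
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Return records newer than `last_cursor`, plus the updated cursor value.

    `last_cursor` is None on the first sync, in which case everything is emitted.
    """
    new_records = [
        r for r in records if last_cursor is None or r[cursor_field] > last_cursor
    ]
    if new_records:
        # Remember the largest cursor value seen so the next sync can resume there.
        last_cursor = max(r[cursor_field] for r in new_records)
    return new_records, last_cursor


# Hypothetical data for a source whose cursor is the "date" field.
rates = [
    {"date": "2020-12-01", "currency": "EUR", "rate": 0.83},
    {"date": "2020-12-02", "currency": "EUR", "rate": 0.82},
    {"date": "2020-12-03", "currency": "EUR", "rate": 0.82},
]

# First sync sees two days of data; the second sync sees all three.
first_batch, cursor = read_incremental(rates[:2], "date", None)
second_batch, cursor = read_incremental(rates, "date", cursor)
```

The strict `>` comparison is a choice made for this sketch. A source could instead compare inclusively, which would re-emit records at the boundary and produce duplicates in the warehouse.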
docs/architecture/incremental.md
Outdated
Some sources cannot define the cursor without user input. For example, in the Postgres source, the user needs to choose which column in a database table they want to use as the `cursor field`. The author of the source cannot predict this. In these cases the user sets the `cursor_field` in the `ConfiguredAirbyteStream`. (You can find a more detailed description of the configuration data model [here](catalog.md)).

In some cases, the source may propose a `default_cursor_field` in the `AirbyteStream`. When it does, if the user does not specify a `cursor_field` in the `ConfiguredAirbyteStream`, Airbyte will fall back on the default provided by the source. The user is allowed to override the source's `default_cursor_field` by setting the `cursor_field` value in the `ConfiguredAirbyteStream`, but they CANNOT override the `cursor_field` specified in an `AirbyteStream`.
What info do I get from the last part of the sentence? Is it something controversial?
i'm removing. not relevant to someone just using the UI.
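For reference, the precedence described in the paragraph under review can be sketched as below. This is a simplified reading of the doc text, not Airbyte's implementation: `resolve_cursor` is a hypothetical helper, the streams are plain dicts, `source_defined_cursor` is assumed to be a boolean flag, and cursor fields are assumed to be plain strings.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_cursor(
    stream: Dict[str, Any], configured_stream: Dict[str, Any]
) -> Tuple[str, Optional[str]]:
    """Return (who decided, cursor field) for one stream.

    `stream` stands in for an AirbyteStream and `configured_stream` for a
    ConfiguredAirbyteStream; both are simplified to dicts for this sketch.
    """
    if stream.get("source_defined_cursor"):
        # The source picks the cursor itself; a user-supplied value is ignored.
        return ("source", None)

    user_choice = configured_stream.get("cursor_field")
    if user_choice:
        # The user's choice overrides any default proposed by the source.
        return ("user", user_choice)

    default = stream.get("default_cursor_field")
    if default:
        return ("default", default)

    raise ValueError("no cursor field available for this stream")


# Hypothetical streams illustrating each branch.
source_defined = resolve_cursor({"source_defined_cursor": True}, {"cursor_field": "id"})
user_override = resolve_cursor(
    {"default_cursor_field": "updated_at"}, {"cursor_field": "created_at"}
)
fallback = resolve_cursor({"default_cursor_field": "updated_at"}, {})
```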