[App Update] Add logic to ensure tap-postgres always returns 1 row #8

Open: 7 commits to be merged into base: master

quocnguyendinh

https://kaligo.atlassian.net/browse/LOYAL-12052

Background

Recently, we have been facing a replication issue on UAT RC-US (this).
Most of the checks are related to the table activity_trackings.

This is because sdc_batched_at of this table was not updated during the elt-run (see this log).
In that log, the replication query did not return any rows, although logically it should return at least one row, because the replication query has the form:

SELECT ... FROM { table } WHERE updated_at >= { bookmark_timestamp } - lookback-window

The hypothesis is:

While the replication was running, the source team deleted the recent rows whose updated_at >= { bookmark_timestamp } - lookback-window.
As a result, the replication query returned an empty result, so target-redshift did not update sdc_batched_at, and the freshness-check failed.
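The hypothesis can be reproduced in miniature. The sketch below is illustrative only: it uses an in-memory SQLite table instead of PostgreSQL so that it runs anywhere, and the sample rows, the bookmark value, and the 12-hour lookback window are all assumptions made up for the demonstration.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical stand-in for the source table (SQLite instead of PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_trackings (id INTEGER PRIMARY KEY, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO activity_trackings (id, updated_at) VALUES (?, ?)",
    [
        (1, "2024-01-01T00:00:00"),
        (2, "2024-01-10T00:00:00"),
        (3, "2024-01-10T06:00:00"),
    ],
)

# Assumed bookmark and lookback window; the real values come from the tap state.
bookmark = datetime(2024, 1, 10, 6, 0, 0)
lookback = timedelta(hours=12)
lower_bound = (bookmark - lookback).isoformat()

QUERY = (
    "SELECT id FROM activity_trackings "
    "WHERE updated_at >= ? ORDER BY updated_at"
)

# Normal case: the rows inside the lookback window are returned.
rows_before = conn.execute(QUERY, (lower_bound,)).fetchall()

# The source team deletes the recent rows before the next replication run.
conn.execute("DELETE FROM activity_trackings WHERE updated_at >= ?", (lower_bound,))

# The same query now returns nothing, even though the table is not empty.
rows_after = conn.execute(QUERY, (lower_bound,)).fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM activity_trackings").fetchone()[0]

print(rows_before, rows_after, remaining)
```

With no rows emitted, the target has nothing to load, which matches the stale sdc_batched_at seen in the log.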

Design

Add logic to the replication process:

  • If the replication query returns no rows, select the latest row of the table, so that at least one row is always returned when the table is not empty.
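A minimal sketch of this fallback follows. It is not the code in this PR: the function name `fetch_rows_with_fallback` and its parameters are hypothetical, SQLite stands in for PostgreSQL so the example is runnable, and "latest row" is assumed to mean the row with the greatest replication key.

```python
import sqlite3


def fetch_rows_with_fallback(conn, table, replication_key, lower_bound):
    """Return rows at or after lower_bound; if there are none, the latest row.

    Hypothetical helper. Identifiers are interpolated directly for brevity;
    real code should quote them properly.
    """
    incremental = conn.execute(
        f"SELECT * FROM {table} WHERE {replication_key} >= ? "
        f"ORDER BY {replication_key}",
        (lower_bound,),
    ).fetchall()
    if incremental:
        return incremental
    # Fallback: the incremental query was empty, so emit the latest row.
    # This is still empty when the table itself has no rows.
    return conn.execute(
        f"SELECT * FROM {table} ORDER BY {replication_key} DESC LIMIT 1"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_trackings (id INTEGER PRIMARY KEY, updated_at TEXT)"
)

# Empty table: nothing can be returned.
empty_rows = fetch_rows_with_fallback(
    conn, "activity_trackings", "updated_at", "2024-01-09T18:00:00"
)

conn.executemany(
    "INSERT INTO activity_trackings (id, updated_at) VALUES (?, ?)",
    [(1, "2024-01-01T00:00:00"), (2, "2024-01-05T00:00:00")],
)

# No row is recent enough, so the fallback returns the latest row.
fallback_rows = fetch_rows_with_fallback(
    conn, "activity_trackings", "updated_at", "2024-01-09T18:00:00"
)

# Rows inside the window are returned as usual, without the fallback.
normal_rows = fetch_rows_with_fallback(
    conn, "activity_trackings", "updated_at", "2024-01-05T00:00:00"
)

print(empty_rows, fallback_rows, normal_rows)
```

One consequence of this design is that the fallback row is re-emitted on every run until newer rows appear, so the target is assumed to upsert on the primary key rather than append.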

Impact

With this implementation, every replication run returns at least one row as long as the table is not empty.

Caveats

None

Testing

None

Docs

None
