fix temporal type default value bug postgres #15877

Merged: 8 commits, Aug 23, 2022

@subodh1810 subodh1810 commented Aug 23, 2022

Issue: #15840
The bug occurs for date, timestamp (without time zone), and time without time zone columns that have default values. While building the internal schema of these columns, Debezium fails with an error.

@subodh1810 subodh1810 marked this pull request as ready for review August 23, 2022 09:41
@subodh1810 subodh1810 requested a review from a team as a code owner August 23, 2022 09:41

subodh1810 commented Aug 23, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2910500256
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2910500256
No Python unittests run

Build Passed

Test summary info:

All Passed

@edgao edgao left a comment

couple minor/clarifying comments, lgtm otherwise


public static String convertToTimeWithTimezone(final Object time) {
  if (time instanceof final java.time.OffsetTime timetz) {
    return timetz.format(TIMETZ_FORMATTER);
  } else {
    if (!loggedUnknownTimeWithTimeZoneClass) {
Contributor

I like this! Maybe it's worth asking infra to set an alert if this shows up in cloud (or maybe push oss telemetry)

"'infinity'",
"'-infinity'")
.addExpectedValues(
"+294247-01-10T04:00:25.200000",
Contributor

thanks, debezium

what type is debezium giving to us in this case? I think it would be nice to have a comment explaining why this is different from the snapshot test
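
For context on the expected value above, the string +294247-01-10T04:00:25.200000 is what you get when 9223372036825200000 is read as microseconds since the Unix epoch and formatted in UTC. The sketch below works through that arithmetic. It is only an illustration: that this number matches the PostgreSQL JDBC driver's positive-infinity sentinel (PGStatement.DATE_POSITIVE_INFINITY) and that Debezium emits it for 'infinity' are assumptions not confirmed in this thread, and the class, method, and formatter names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not code from this PR.
final class InfinityTimestampSketch {

  // Assumption: sentinel used for 'infinity', read here as microseconds since the epoch.
  static final long POSITIVE_INFINITY_MICROS = 9_223_372_036_825_200_000L;

  // Assumed output pattern with six fractional digits, matching the expected value's shape.
  static final DateTimeFormatter FORMATTER =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSSSS");

  static String formatEpochMicros(final long epochMicros) {
    // Split into whole seconds and the leftover microseconds (converted to nanoseconds).
    final long seconds = Math.floorDiv(epochMicros, 1_000_000L);
    final long nanos = Math.floorMod(epochMicros, 1_000_000L) * 1_000L;
    final Instant instant = Instant.ofEpochSecond(seconds, nanos);
    // Years with more than four digits are printed with a leading '+'.
    return LocalDateTime.ofInstant(instant, ZoneOffset.UTC).format(FORMATTER);
  }

  public static void main(final String[] args) {
    // prints +294247-01-10T04:00:25.200000
    System.out.println(formatEpochMicros(POSITIVE_INFINITY_MICROS));
  }
}
```

If the assumption holds, it would explain why the CDC expectation differs from the snapshot test: the value arrives as a very large finite timestamp rather than as the literal 'infinity'.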

} else {
  if (!loggedUnknownTimestampClass) {
Contributor

Correct me if I'm wrong, but this logic seems to say that reaching the else branch indicates an unknown timestamp class, and the negation is there so that the unknown class is logged only once, right?

If so, what's the reason for limiting the logging to a single instance? It wasn't initially clear to me why the loggedUnknownTime... variables start as false and are then set to true, but it seems that once a loggedUnknownTime... variable is true, an unknown time class has been seen, right?

Contributor Author

The reason I only want to log it once is to avoid the scenario where we have a million of these logs! That could happen if a table has a million records and Debezium starts sending a different class type for those records.
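
The log-once behaviour discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only convertToTimeWithTimezone, TIMETZ_FORMATTER, and loggedUnknownTimeWithTimeZoneClass appear in the quoted diff; the formatter pattern, the null handling, the String.valueOf fallback, and the in-memory WARNINGS list (standing in for a real logger) are all assumptions.

```java
import java.time.OffsetTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a "log only the first occurrence" guard.
final class LogOnceSketch {

  // Assumed pattern; the connector's actual formatter is not shown in this thread.
  static final DateTimeFormatter TIMETZ_FORMATTER =
      DateTimeFormatter.ofPattern("HH:mm:ss.SSSSSSXXX");

  // Starts false; set to true once the first unknown class has been reported.
  static boolean loggedUnknownTimeWithTimeZoneClass = false;

  // Stand-in for a logger so the example is self-contained.
  static final List<String> WARNINGS = new ArrayList<>();

  static String convertToTimeWithTimezone(final Object time) {
    if (time == null) {
      return null; // assumed null handling
    }
    if (time instanceof final OffsetTime timetz) {
      return timetz.format(TIMETZ_FORMATTER);
    }
    // Unknown class: report it once, then stay quiet for every later record.
    if (!loggedUnknownTimeWithTimeZoneClass) {
      WARNINGS.add("Unknown class for time with time zone: " + time.getClass().getName());
      loggedUnknownTimeWithTimeZoneClass = true;
    }
    return String.valueOf(time); // assumed fallback
  }
}
```

With this guard, converting a million records of an unexpected class adds a single entry to WARNINGS instead of a million. A plain static boolean is not thread-safe; if converters ran concurrently, an AtomicBoolean with compareAndSet would be one way to keep the single-log guarantee.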

@subodh1810 subodh1810 temporarily deployed to more-secrets August 23, 2022 16:37 Inactive

subodh1810 commented Aug 23, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2913266296
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2913266296
No Python unittests run

Build Passed

Test summary info:

All Passed

"13:00:00.000000")
.build());
}

// time without time zone
Contributor

Minor nit: this comment is the same as the one on the test above, but this test checks for null.

Contributor Author

Hah, I just copied the above test and replaced the values with null. The idea is to test whether columns with the time without time zone data type and null values sync properly.


subodh1810 commented Aug 23, 2022

/publish connector=connectors/source-postgres

🕑 Publishing the following connectors:
connectors/source-postgres
https://github.com/airbytehq/airbyte/actions/runs/2913582495


Connector Did it publish? Were definitions generated?
connectors/source-postgres

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️


subodh1810 commented Aug 23, 2022

/publish connector=connectors/source-postgres-strict-encrypt

🕑 Publishing the following connectors:
connectors/source-postgres-strict-encrypt
https://github.com/airbytehq/airbyte/actions/runs/2913583322


Connector Did it publish? Were definitions generated?
connectors/source-postgres-strict-encrypt

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@subodh1810 subodh1810 temporarily deployed to more-secrets August 23, 2022 17:39 Inactive
@ryankfu ryankfu left a comment

Looks good to me. Overall, I like the ability to log these unknown types as well.

As Ed has noted, it may be useful to have these log messages surfaced somewhere they can be tracked for cloud users.

@subodh1810 subodh1810 merged commit 828285d into master Aug 23, 2022
@subodh1810 subodh1810 deleted the fix-temporal-type-default-value-bug-postgres- branch August 23, 2022 18:30
rodireich pushed a commit that referenced this pull request Aug 25, 2022
* fix temporal datatype bug for columns with default in postgres cdc

* fix test

* add test for date and time as well

* add more logs for unknown classes

* review comments

* bump version

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
4 participants