Add previously missing cursor types to JDBC utils. #2600

davinchia · 2021-03-24T07:34:10Z

What

Incremental syncs that use NVARCHAR as the key were previously failing because we did not recognise the type.

This was reported by a user.

Also take the chance to fix the wrong casting of JDBC integers to the JSON Schema Number type, which results in the data being transformed into a float after normalisation.
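To illustrate why a float conversion is a problem for integer columns: a 64-bit double cannot represent every 64-bit integer, so a large bigint value that passes through a floating-point type can change silently. The sketch below is only an illustration of that effect; the class and method names are hypothetical and are not part of this PR.

```java
// Hypothetical helper, not part of the PR: shows the precision loss that
// occurs when a 64-bit integer is carried through a double.
final class FloatPrecisionSketch {

  // Simulates an integer value being stored as a float and read back.
  static long roundTripThroughDouble(long value) {
    return (long) (double) value;
  }
}
```

With this helper, `roundTripThroughDouble(42L)` returns `42L`, but `roundTripThroughDouble(9007199254740993L)` (2^53 + 1) returns `9007199254740992L`, because doubles have only 53 bits of mantissa.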

How

Add the missing types in.
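A minimal sketch of what "adding the missing types" could look like, assuming the cursor handling is a switch over `java.sql.JDBCType` that throws for unknown types (the error message matches the one visible in the JdbcUtils.java hunk). The `Binding` enum, the class name and the exact grouping of types are assumptions made for illustration, not the actual JdbcUtils code.

```java
import java.sql.JDBCType;

// Hypothetical sketch, not the real JdbcUtils: groups cursor types by how
// the cursor value would be bound to a query parameter.
final class CursorTypeSketch {

  // Illustrative categories only; the real class may organise this differently.
  enum Binding { STRING, LONG, DOUBLE, TIMESTAMP }

  static Binding bindingFor(JDBCType cursorFieldType) {
    return switch (cursorFieldType) {
      // NCHAR, NVARCHAR and LONGNVARCHAR are the kind of types that are easy to miss.
      case CHAR, VARCHAR, LONGVARCHAR, NCHAR, NVARCHAR, LONGNVARCHAR -> Binding.STRING;
      case TINYINT, SMALLINT, INTEGER, BIGINT -> Binding.LONG;
      case FLOAT, REAL, DOUBLE, NUMERIC, DECIMAL -> Binding.DOUBLE;
      case DATE, TIME, TIMESTAMP -> Binding.TIMESTAMP;
      default -> throw new IllegalArgumentException(
          String.format("%s is not supported.", cursorFieldType));
    };
  }
}
```

If `NVARCHAR` were absent from the string case, it would fall through to `default` and throw, which is consistent with the failure described above.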

Bump all connector versions that use the class:

source-mssql
source-mysql
source-redshift
source-postgres
destination-postgres
destination-redshift
destination-snowflake

Pre-merge Checklist

Run integration tests
Publish Docker images

Recommended reading order

JdbcUtils.java

davinchia · 2021-03-24T08:55:03Z

/test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/682454352
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/682454352

davinchia · 2021-03-24T08:57:10Z

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/682459951
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/682459951

davinchia · 2021-03-24T08:57:20Z

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/682460279
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/682460279

davinchia · 2021-03-24T08:57:26Z

/test connector=connectors/source-redshift

🕑 connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/682460999
✅ connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/682460999

ChristopheDuong · 2021-03-24T09:08:23Z

Oh great, since you are tackling this, could you maybe add support for integer as well?

Thanks!

#1006 (comment)

ChristopheDuong · 2021-03-24T09:12:25Z

...c/main/resources/config/STANDARD_SOURCE_DEFINITION/decd338e-5647-4c0b-adf4-da0e75f5a750.json

@@ -2,6 +2,6 @@
  "sourceDefinitionId": "decd338e-5647-4c0b-adf4-da0e75f5a750",
  "name": "Postgres",
  "dockerRepository": "airbyte/source-postgres",
-  "dockerImageTag": "0.2.1",
+  "dockerImageTag": "0.2.2",


I think something we don't handle very well at the moment is conflicting connector versions being bumped in multiple branches at the same time... I'm not sure how we should resolve this though.

So when you try to bump the versions of those connectors, it's going to fail because 0.2.2 has already been published but not merged yet:

https://hub.docker.com/r/airbyte/source-postgres/tags?page=1&ordering=name

coming from #2460

that makes sense. will coordinate with artem!

jrhizor · 2021-03-24T17:59:25Z

airbyte-db/src/test/java/io/airbyte/db/jdbc/TestJdbcUtils.java

-        .put("bigint", JsonSchemaPrimitive.NUMBER)
+        .put("smallint", JsonSchemaPrimitive.INTEGER)
+        .put("int", JsonSchemaPrimitive.INTEGER)
+        .put("bigint", JsonSchemaPrimitive.INTEGER)


Does this mean we expect destinations to handle all INTEGER as longs?

Do we need to add integer in this PR?

How good do we feel that everything still works after adding a new type? Just running the tests is insufficient in this case because none of the tests are testing for integer type. So it seems like we need to add tests for sources, destinations, etc to make sure that this gets supported sanely. We can do this, but that's a bigger PR (which goes back to question 1).

My feeling is as is, this is yoloing a kinda big change. So we should know why it is valuable right now and understand why we are yoloing it versus doing it more carefully with adequate testing.

normalization is converting a JsonSchemaPrimitive.INTEGER into dbt_utils.type_bigint (the largest integer available as we don't have info on the size of the integer anymore)

{% macro default__type_bigint() %}
    bigint
{% endmacro %}

{% macro bigquery__type_bigint() %}
    int64
{% endmacro %}

from https://github.com/fishtown-analytics/dbt-utils/blob/ceb28497769c642cae7e3d5d18f1fe6bb253ef59/macros/cross_db_utils/datatypes.sql#L71
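As a rough illustration of the mapping change discussed in this thread: integer JDBC types would map to an integer JSON type instead of a number, and normalization would then widen every integer to a bigint because the original size is no longer known. The `JsonType` enum below is a local stand-in for `JsonSchemaPrimitive`, and the class name and the exact set of cases are assumptions, not the code under review.

```java
import java.sql.JDBCType;

// Hypothetical sketch of a JDBC-to-JSON type mapping; not the reviewed code.
final class JsonTypeSketch {

  // Local stand-in for the JsonSchemaPrimitive enum named in the diff.
  enum JsonType { STRING, NUMBER, INTEGER, BOOLEAN }

  static JsonType jsonTypeFor(JDBCType type) {
    return switch (type) {
      case BIT, BOOLEAN -> JsonType.BOOLEAN;
      // Proposed change: integer types no longer map to NUMBER.
      case TINYINT, SMALLINT, INTEGER, BIGINT -> JsonType.INTEGER;
      case FLOAT, REAL, DOUBLE, NUMERIC, DECIMAL -> JsonType.NUMBER;
      // Assumption for this sketch: everything else is emitted as a string.
      default -> JsonType.STRING;
    };
  }
}
```

Under this sketch a smallint and a bigint column both end up as `INTEGER`, which is why a destination has to be able to hold the largest integer type.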

jrhizor · 2021-03-24T18:01:30Z

airbyte-db/src/main/java/io/airbyte/db/jdbc/JdbcUtils.java

      default -> throw new IllegalArgumentException(String.format("%s is not supported.", cursorFieldType));
    }
  }

  // the switch statement intentionally has duplicates so that its structure matches the type switch
  // statement above.
-
+  // these json type fields are eventually consumed by the normalization process (if configured).


Does something need to change in normalization to support the new primitive type INTEGER?

No, other sources were already producing integer columns as json primitives, this change is only for JDBC based source in java

i think more broadly the question is just what needs to change downstream to support the integer type?

this may be a little puritanical of me, but it seems a little odd to have a comment about normalization in this utils class. feel free to keep it if you feel it's helpful though. (2 of 10 on the scale)

In normalization, it is already handling integer fields.

For example, the facebook catalog already produces fields such as this one:

airbyte/airbyte-integrations/bases/base-normalization/unit_tests/resources/facebook_catalog.json

Line 104 in f5094b5

"age_max": { "type": ["null", "integer"] },

cgardens

the nvarchar parts look good to me. see comments below that I am wary about doing the integer piece. curious to understand if we need to do that now.

cgardens · 2021-03-24T18:11:17Z

airbyte-db/src/test/java/io/airbyte/db/jdbc/TestJdbcUtils.java

-        .put("bigint", JsonSchemaPrimitive.NUMBER)
+        .put("smallint", JsonSchemaPrimitive.INTEGER)
+        .put("int", JsonSchemaPrimitive.INTEGER)
+        .put("bigint", JsonSchemaPrimitive.INTEGER)


Do we need to add integer in this PR?

How good do we feel that everything still works after adding a new type? Just running the tests is insufficient in this case because none of the tests are testing for integer type. So it seems like we need to add tests for sources, destinations, etc to make sure that this gets supported sanely. We can do this, but that's a bigger PR (which goes back to question 1).

My feeling is as is, this is yoloing a kinda big change. So we should know why it is valuable right now and understand why we are yoloing it versus doing it more carefully with adequate testing.

cgardens · 2021-03-24T18:14:43Z

airbyte-db/src/main/java/io/airbyte/db/jdbc/JdbcUtils.java

      default -> throw new IllegalArgumentException(String.format("%s is not supported.", cursorFieldType));
    }
  }

  // the switch statement intentionally has duplicates so that its structure matches the type switch
  // statement above.
-
+  // these json type fields are eventually consumed by the normalization process (if configured).


i think more broadly the question is just what needs to change downstream to support the integer type?

cgardens · 2021-03-24T18:14:57Z

airbyte-db/src/main/java/io/airbyte/db/jdbc/JdbcUtils.java

      default -> throw new IllegalArgumentException(String.format("%s is not supported.", cursorFieldType));
    }
  }

  // the switch statement intentionally has duplicates so that its structure matches the type switch
  // statement above.
-
+  // these json type fields are eventually consumed by the normalization process (if configured).


this may be a little puritanical of me, but it seems a little odd to have a comment about normalization in this utils class. feel free to keep it if you feel it's helpful though. (2 of 10 on the scale)

cgardens

Talked offline. Agreed to separate out the integer stuff into a separate project. Approving because once that's done this can be merged. Don't want to block due to time zone.

davinchia · 2021-03-25T03:49:09Z

@chris I'm going to separate this out for now to make this change cleaner. Let's talk later today about how to approach this from a test perspective. Looks like we might have other types to add as well.

I will redo the docker versions after #2460 is merged in.

davinchia · 2021-03-28T13:32:59Z

/test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695152731
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695152731

davinchia · 2021-03-28T13:33:11Z

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/695153373
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/695153373

davinchia · 2021-03-28T13:33:24Z

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/695153562
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/695153562

davinchia · 2021-03-28T13:33:29Z

/test connector=connectors/source-redshift

🕑 connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/695153616
✅ connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/695153616

davinchia · 2021-03-28T13:43:23Z

/publish connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/695169703
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/695169703

davinchia · 2021-03-28T13:43:36Z

/publish connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/695170052
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/695170052

davinchia · 2021-03-28T13:43:56Z

/test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695170592
❌ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695170592

davinchia · 2021-03-28T13:48:32Z

/publish connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695178337
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/695178337

davinchia · 2021-03-28T13:55:13Z

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/695188118
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/695188118

davinchia · 2021-03-28T13:55:34Z

/publish connector=connectors/destination-postgres

🕑 connectors/destination-postgres https://github.com/airbytehq/airbyte/actions/runs/695188386
✅ connectors/destination-postgres https://github.com/airbytehq/airbyte/actions/runs/695188386

davinchia · 2021-03-28T13:55:43Z

/publish connector=connectors/destination-snowflake

🕑 connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/695188496
✅ connectors/destination-snowflake https://github.com/airbytehq/airbyte/actions/runs/695188496

davinchia · 2021-03-28T14:17:54Z

/publish connector=connectors/source-redshift

🕑 connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/695223589
✅ connectors/source-redshift https://github.com/airbytehq/airbyte/actions/runs/695223589

Add previously missing cursor types to JDBC utils.

bda0f3e

davinchia requested a review from ChristopheDuong March 24, 2021 07:34

davinchia marked this pull request as ready for review March 24, 2021 07:38

auto-assign bot requested review from jrhizor and michel-tricot March 24, 2021 07:38

davinchia removed the request for review from michel-tricot March 24, 2021 07:38

Davin Chia added 3 commits March 24, 2021 16:41

Update source dockerfiles.

20baaee

Update the config and seed files.

b026063

Bump destination version for Postgres, Redshift and Snowflake,

dcbbc7b

ChristopheDuong reviewed Mar 24, 2021

ChristopheDuong approved these changes Mar 24, 2021

jrhizor reviewed Mar 24, 2021

cgardens requested changes Mar 24, 2021

cgardens approved these changes Mar 25, 2021

davinchia force-pushed the davinchia/fix-missing-cursor-type branch from 794796c to dcbbc7b Compare March 25, 2021 04:01

Davin Chia added 2 commits March 28, 2021 21:27

Merge remote-tracking branch 'origin' into davinchia/fix-missing-cursor-type

d22cc34

Update all related source versions.

b02ee5f

Update all related destination versions.

9f9b125

davinchia merged commit e8190ff into master Mar 29, 2021

davinchia deleted the davinchia/fix-missing-cursor-type branch March 29, 2021 00:09

karinakuz added connectors/destination/bigquery connectors/destinations-api connectors/destinations-warehouse connectors/destination/snowflake and removed connectors/destinations-api labels Jan 12, 2022
