Add previously missing cursor types to JDBC utils. #2600
Conversation
/test connector=connectors/source-mssql
/test connector=connectors/source-mysql
/test connector=connectors/source-postgres
/test connector=connectors/source-redshift
Oh great, since you are tackling this, could you maybe add support for Thanks!
@@ -2,6 +2,6 @@
  "sourceDefinitionId": "decd338e-5647-4c0b-adf4-da0e75f5a750",
  "name": "Postgres",
  "dockerRepository": "airbyte/source-postgres",
- "dockerImageTag": "0.2.1",
+ "dockerImageTag": "0.2.2",
I think there's something we don't handle very well at the moment: conflicts when a connector's version is bumped in multiple branches at the same time... I'm not sure how we should resolve this, though.
So when you try to bump the versions of those connectors, it's going to fail, because 0.2.2 has already been published but not merged yet:
https://hub.docker.com/r/airbyte/source-postgres/tags?page=1&ordering=name
coming from #2460
that makes sense. will coordinate with artem!
- .put("bigint", JsonSchemaPrimitive.NUMBER)
  .put("smallint", JsonSchemaPrimitive.INTEGER)
  .put("int", JsonSchemaPrimitive.INTEGER)
+ .put("bigint", JsonSchemaPrimitive.INTEGER)
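The diff above is easiest to read as a plain lookup table from JDBC column type names to JSON schema primitives. Here is a minimal, self-contained sketch of that idea; the class and map names are hypothetical stand-ins, not the actual JdbcUtils code:

```java
import java.util.Map;

public class JdbcTypeMapSketch {

  // Local stand-in for Airbyte's JsonSchemaPrimitive enum, declared here so
  // the sketch compiles on its own.
  enum JsonSchemaPrimitive { STRING, NUMBER, INTEGER, BOOLEAN }

  // Hypothetical version of the column-type map touched by this diff:
  // bigint now maps to INTEGER instead of NUMBER, and nvarchar is recognised.
  static final Map<String, JsonSchemaPrimitive> COLUMN_TYPE_MAP = Map.of(
      "smallint", JsonSchemaPrimitive.INTEGER,
      "int", JsonSchemaPrimitive.INTEGER,
      "bigint", JsonSchemaPrimitive.INTEGER, // was NUMBER before this PR
      "float", JsonSchemaPrimitive.NUMBER,
      "nvarchar", JsonSchemaPrimitive.STRING);

  public static void main(String[] args) {
    System.out.println(COLUMN_TYPE_MAP.get("bigint"));
    System.out.println(COLUMN_TYPE_MAP.get("nvarchar"));
  }
}
```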
Does this mean we expect destinations to handle all INTEGER as longs?
- Do we need to add integer in this PR?
- How good do we feel that everything still works after adding a new type? Just running the tests is insufficient in this case, because none of the tests cover the integer type. So it seems like we need to add tests for sources, destinations, etc. to make sure that this gets supported sanely. We can do this, but that's a bigger PR (which goes back to question 1).

My feeling is that, as is, this is yoloing a kinda big change. So we should know why it is valuable right now and understand why we are yoloing it versus doing it more carefully with adequate testing.
normalization converts a JsonSchemaPrimitive.INTEGER into dbt_utils.type_bigint (the largest integer available, since we no longer have info on the size of the integer):
{% macro default__type_bigint() %}
bigint
{% endmacro %}
{% macro bigquery__type_bigint() %}
int64
{% endmacro %}
default -> throw new IllegalArgumentException(String.format("%s is not supported.", cursorFieldType));
  }
}
// the switch statement intentionally has duplicates so that its structure matches the type switch
// statement above.
// these json type fields are eventually consumed by the normalization process (if configured).
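To make the snippet above concrete, here is a hedged, self-contained sketch of the kind of switch it comes from: binding a cursor value to a PreparedStatement by its JSON schema type, with the same `default -> throw` shape for unsupported types. The method and class names are hypothetical; the real JdbcUtils code differs:

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class CursorBindSketch {

  // Local stand-in for Airbyte's JsonSchemaPrimitive enum.
  enum JsonSchemaPrimitive { STRING, NUMBER, INTEGER, BOOLEAN }

  // Hypothetical cursor-binding switch: each supported JSON schema type maps
  // to the matching PreparedStatement setter; anything else is rejected via
  // the same default branch shown in the snippet above.
  static void bindCursor(PreparedStatement stmt,
                         int index,
                         JsonSchemaPrimitive cursorFieldType,
                         String value) throws SQLException {
    switch (cursorFieldType) {
      case STRING -> stmt.setString(index, value);
      case INTEGER -> stmt.setLong(index, Long.parseLong(value));
      case NUMBER -> stmt.setDouble(index, Double.parseDouble(value));
      default -> throw new IllegalArgumentException(
          String.format("%s is not supported.", cursorFieldType));
    }
  }
}
```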
Does something need to change in normalization to support the new primitive type INTEGER?
No, other sources were already producing integer columns as JSON primitives; this change is only for JDBC-based sources in Java.
i think more broadly the question is just what needs to change downstream to support the integer type?
this may be a little puritanical of me, but it seems a little odd to have a comment about normalization in this utils class. feel free to keep it if you feel it's helpful though. (2 of 10 on the scale)
In normalization, integer fields are already handled.
For example, the Facebook catalog produces fields such as this one (airbyte/airbyte-integrations/bases/base-normalization/unit_tests/resources/facebook_catalog.json, line 104 in f5094b5):
"age_max": { "type": ["null", "integer"] },
the nvarchar parts look good to me. see comments below; I am wary about doing the integer piece. curious to understand whether we need to do that now.
Talked offline. Agreed to separate out the integer stuff into a separate project. Approving because once that's done this can be merged. Don't want to block due to time zone.
Force-pushed 794796c to dcbbc7b
/test connector=connectors/source-mssql
/test connector=connectors/source-mysql
/test connector=connectors/source-postgres
/test connector=connectors/source-redshift
/publish connector=connectors/source-mysql
/publish connector=connectors/source-postgres
/test connector=connectors/source-mssql
/publish connector=connectors/source-mssql
/publish connector=connectors/destination-redshift
/publish connector=connectors/destination-postgres
/publish connector=connectors/destination-snowflake
/publish connector=connectors/source-redshift
What
Incremental syncs using NVARCHAR as the cursor key were previously failing because we did not recognise the type.
This was reported by a user.
We also take the chance to fix the incorrect casting of JDBC integers to the JSON Schema Number type, which resulted in the data being transformed into floats after normalisation.
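Why the Number-vs-Integer distinction matters: once a bigint is routed through a floating-point "number" path, large values can silently lose precision, not just gain a trailing `.0`. A minimal, self-contained illustration in plain Java (not Airbyte code):

```java
public class BigintAsFloatDemo {
  public static void main(String[] args) {
    // 2^53 + 1: the first positive long that a double cannot represent exactly.
    long id = 9007199254740993L;

    // What effectively happens when a generic JSON "number" path treats the
    // value as floating point:
    double asNumber = (double) id;

    System.out.println(id);              // 9007199254740993
    System.out.println((long) asNumber); // 9007199254740992 -- precision lost
  }
}
```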
How
Add the missing types in.
Bump all connector versions that use the class:
Pre-merge Checklist
Recommended reading order