-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve error message for tables with invalid columns as cursor #15317
Conversation
/test connector=connectors/source-postgres
Build FailedTest summary info:
|
/test connector=connectors/source-mysql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, added a few minor comments.
@Override | ||
public boolean isValidCursorType(final MysqlType cursorType) { | ||
return switch (cursorType) { | ||
case BIT, BOOLEAN, TINYINT, TINYINT_UNSIGNED, SMALLINT, SMALLINT_UNSIGNED, MEDIUMINT, MEDIUMINT_UNSIGNED, INT, INT_UNSIGNED, BIGINT, BIGINT_UNSIGNED, FLOAT, FLOAT_UNSIGNED, DOUBLE, DOUBLE_UNSIGNED, DECIMAL, DECIMAL_UNSIGNED, DATE, DATETIME, TIMESTAMP, TIME, YEAR, CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, ENUM, SET, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, BINARY, VARBINARY -> true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think of this style?
return switch (cursorType) {
case <mysql-specific types> -> true;
default -> super.isValidCursorType(cursorType);
};
I.e. only defining the types that aren't handled by JdbcSourceOperations already. (not a huge win in this case, but I think it would make e.g. getJsonType much nicer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach as well, reduces replication of cursor types but it does make it harder to know the full list of supported cursor types without looking into the super class. For extensibility though this seems better
...rs/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/PostgresSource.java
Outdated
Show resolved
Hide resolved
public record InvalidCursorInfo(String tableName, String cursorColumnName, String cursorSqlType) { | ||
|
||
@Override | ||
public String toString() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny nitpick: I'd prefer to define a new method prettyString
instead of overriding toString (mostly in case someone actually wants to print one of these out for debugging or something)
public class InvalidCursorException extends RuntimeException { | ||
|
||
public InvalidCursorException(final List<InvalidCursorInfo> tablesWithInvalidCursor) { | ||
super("The following tables have invalid columns selected as cursor, please select a valid column as a cursor. " + tablesWithInvalidCursor.stream().map(InvalidCursorInfo::toString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: maybe replace "a valid column" with "a column with a well-defined ordering"? So that it's clear why the cursor isn't valid
stream.getName()); | ||
final boolean hasSourceDefinedCursor = | ||
!Objects.isNull(airbyteStream.getStream().getSourceDefinedCursor()) && airbyteStream.getStream().getSourceDefinedCursor(); | ||
if (!tableNameToTable.containsKey(fullyQualifiedTableName) || airbyteStream.getSyncMode() != SyncMode.INCREMENTAL || hasSourceDefinedCursor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my understanding: is there any situation where tableNameToTable.containsKey(fullyQualifiedTableName)
isn't true? (totally fair to have this condition either way, I'm just curious how tableNameToTable gets populated)
.findFirst() | ||
.orElseThrow(); | ||
|
||
if (isValidCursorType(cursorType)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: this seems a little bit nicer (avoids using continue
)
if (!isValidCursorType(cursorType)) {
tablesWithInvalidCursor.add(new InvalidCursorInfo(fullyQualifiedTableName, cursorField, cursorType.toString()));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on this comment
@@ -142,6 +146,41 @@ public AutoCloseableIterator<AirbyteMessage> read(final JsonNode config, | |||
}); | |||
} | |||
|
|||
private void validateCursorFieldForIncrementalTables(final Map<String, TableInfo<CommonField<DataType>>> tableNameToTable, final ConfiguredAirbyteCatalog catalog) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm mistaken, shouldn't this method have throws InvalidCursorException
at the end to indicate that it could throw an Exception?
Another note is can we have a short javadoc comment that says something along the lines of
/**
* Creates a list of incremental tables with invalid cursor columns (e.g. non-numeric types). Will also throw
* `InvalidCursorException` if at least one table includes an invalid cursor type
*/
EDIT: After chatting with Ed on this, since InvalidCursorException
extends RuntimeException
you don't need to define this in the method, preference would still be to have this defined either as a javadoc comment or within the method to just know in case the future someone wants to catch this Exception
@Override | ||
public boolean isValidCursorType(final MysqlType cursorType) { | ||
return switch (cursorType) { | ||
case BIT, BOOLEAN, TINYINT, TINYINT_UNSIGNED, SMALLINT, SMALLINT_UNSIGNED, MEDIUMINT, MEDIUMINT_UNSIGNED, INT, INT_UNSIGNED, BIGINT, BIGINT_UNSIGNED, FLOAT, FLOAT_UNSIGNED, DOUBLE, DOUBLE_UNSIGNED, DECIMAL, DECIMAL_UNSIGNED, DATE, DATETIME, TIMESTAMP, TIME, YEAR, CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, ENUM, SET, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, BINARY, VARBINARY -> true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a large case
, would like to see if this can simply be reordered lexicographically to more easily parse out the options
EDIT: chatting with Ed on this I'm not hard set on lexicographically sorting since it can also be easier for people that look at the data as grouped sets of values, may like to still consider how "readable" this would be for someone looking to understand which types are supported though
@Override | ||
public boolean isValidCursorType(final JDBCType cursorType) { | ||
return switch (cursorType) { | ||
case TIMESTAMP, TIME, DATE, BIT, BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, REAL, NUMERIC, DECIMAL, CHAR, NCHAR, NVARCHAR, VARCHAR, LONGVARCHAR, BINARY, BLOB -> true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as previous comment, would like to see if this can be reordered lexicographically to more easily know if a cursor type is supported
@Override | ||
public boolean isValidCursorType(final MysqlType cursorType) { | ||
return switch (cursorType) { | ||
case BIT, BOOLEAN, TINYINT, TINYINT_UNSIGNED, SMALLINT, SMALLINT_UNSIGNED, MEDIUMINT, MEDIUMINT_UNSIGNED, INT, INT_UNSIGNED, BIGINT, BIGINT_UNSIGNED, FLOAT, FLOAT_UNSIGNED, DOUBLE, DOUBLE_UNSIGNED, DECIMAL, DECIMAL_UNSIGNED, DATE, DATETIME, TIMESTAMP, TIME, YEAR, CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, ENUM, SET, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, BINARY, VARBINARY -> true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: reorder values lexicographically for easier parsing and also to have the
// since cursor are expected to be comparable, ...
comment in the abstractDbSource
method since that's the top-level where these methods override from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question regarding whether validateCursorFieldForIncrmentalTables
should include throws InvalidCursorException
since the last part of the logic will throw one when an incremental table with invalid cursor exists
Minor nitpick items on the case values being reordered to be more easily parsable and move the comment for supported cursor types to the top-level abstractDbSource
class and JdbcCompatibleSourceOperations
class
# Conflicts: # airbyte-db/db-lib/src/main/java/io/airbyte/db/JdbcCompatibleSourceOperations.java
NOTE
|
NOTE
|
/test connector=connectors/source-postgres
Build FailedTest summary info:
|
/test connector=connectors/source-mysql
Build PassedTest summary info:
|
/test connector=connectors/source-postgres |
NOTE
|
NOTE
|
/test connector=connectors/source-postgres
Build PassedTest summary info:
|
/test connector=connectors/source-mysql
Build PassedTest summary info:
|
/test connector=connectors/source-postgres
Build PassedTest summary info:
|
/test connector=connectors/source-mysql
Build PassedTest summary info:
|
NOTE
|
…ytehq#15317) * implement validation for cursor type before reading data * rename class * add test * fix merge conflicts * address review comments * fix test
…ytehq#15317) * implement validation for cursor type before reading data * rename class * add test * fix merge conflicts * address review comments * fix test
What
How
Describe the solution
Recommended reading order
x.java
y.python
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereUpdating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changesTests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.