-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persistence Component - Big query support #1904
Conversation
|
…hemes. Added tests for Big Query
a2ae98d
to
d5ca20a
Compare
…hemes. Added tests for Big Query
/easycla |
…y_support # Conflicts: # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/pom.xml # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/BigQuerySink.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/executor/BigQueryExecutor.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/executor/BigQueryHelper.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/executor/BigQueryTransactionManager.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/optmizer/StringCaseOptimizer.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/sql/visitor/DeleteVisitor.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/sql/visitor/FieldVisitor.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/main/java/org/finos/legend/engine/persistence/components/relational/bigquery/sqldom/schemaops/statements/AlterTable.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/BaseTestUtils.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/e2e/BigQueryEndToEndTest.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/ingestmode/BigQueryTestArtifacts.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/ingestmode/BitemporalDeltaSourceSpecifiesFromTest.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/ingestmode/NontemporalDeltaTest.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/java/org/finos/legend/engine/persistence/components/logicalplan/operations/AlterTest.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/resources/expected/append/data_pass2.csv # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-bigquery/src/test/resources/expected/nontemporal_delta/data_pass2.csv # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-core/src/main/java/org/finos/legend/engine/persistence/components/relational/RelationalSink.java # legend-engine-xt-persistence-component/legend-engine-xt-persistence-component-relational-core/src/main/java/org/finos/legend/engine/persistence/components/relational/sqldom/common/Clause.java
|
||
|
||
@Override | ||
public void close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this empty block intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, intentional, JDBC Helper actually has a JDBC connection that needs to be closed explicitly, but Big Query doesn't need to close anything, that's why this is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wont be better to have closeTransactionManager
as private, and inside this close call that method? That way, you have a single close that cleans all resources, and in the future other resources might be added too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested, But keeping closeTransactionManager as public as it's inherited from RelationalExecutionHelper.
implicitDataTypeMapping.put(DataType.INTEGER, new HashSet<>(Arrays.asList(DataType.INT, DataType.BIGINT, DataType.TINYINT, DataType.SMALLINT, DataType.INT64))); | ||
implicitDataTypeMapping.put(DataType.NUMERIC, new HashSet<>(Arrays.asList(DataType.NUMERIC, DataType.NUMBER, DataType.DECIMAL, DataType.INT, DataType.INTEGER, DataType.BIGINT, DataType.TINYINT, DataType.SMALLINT, DataType.INT64))); | ||
implicitDataTypeMapping.put(DataType.FLOAT, new HashSet<>(Arrays.asList(DataType.REAL, DataType.DOUBLE, DataType.FLOAT64, DataType.INT, DataType.INTEGER, DataType.BIGINT, DataType.TINYINT, DataType.SMALLINT, DataType.INT64, DataType.NUMBER, DataType.NUMERIC, DataType.DECIMAL))); | ||
implicitDataTypeMapping.put(DataType.STRING, new HashSet<>(Arrays.asList(DataType.STRING, DataType.CHAR, DataType.CHARACTER, DataType.VARCHAR, DataType.LONGNVARCHAR, DataType.LONGTEXT, DataType.TEXT))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These HashSet should be wrappd on unmodifiable to actually make the IMPLICIT_DATA_TYPE_MAPPING fully unmodifiable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Changed as suggested
explicitDataTypeMapping.put(DataType.NUMBER, new HashSet<>(Arrays.asList(DataType.REAL, DataType.FLOAT, DataType.DOUBLE, DataType.FLOAT64))); | ||
explicitDataTypeMapping.put(DataType.NUMERIC, new HashSet<>(Arrays.asList(DataType.REAL, DataType.FLOAT, DataType.DOUBLE, DataType.FLOAT64))); | ||
explicitDataTypeMapping.put(DataType.DECIMAL, new HashSet<>(Arrays.asList(DataType.REAL, DataType.FLOAT, DataType.DOUBLE, DataType.FLOAT64))); | ||
EXPLICIT_DATA_TYPE_MAPPING = Collections.unmodifiableMap(explicitDataTypeMapping); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - These HashSet should be wrappd on unmodifiable to actually make the IMPLICIT_DATA_TYPE_MAPPING fully unmodifiable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Changed as suggested
|
||
|
||
@Override | ||
public void close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wont be better to have closeTransactionManager
as private, and inside this close call that method? That way, you have a single close that cleans all resources, and in the future other resources might be added too...
<dependency> | ||
<groupId>com.google.cloud</groupId> | ||
<artifactId>google-cloud-bigquery</artifactId> | ||
<scope>provided</scope> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How these provided dependencies will work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's a library, the user of the library is supposed to provide the dependencies.
...os/legend/engine/persistence/components/relational/bigquery/optmizer/LowerCaseOptimizer.java
Outdated
Show resolved
Hide resolved
...os/legend/engine/persistence/components/relational/bigquery/optmizer/UpperCaseOptimizer.java
Outdated
Show resolved
Hide resolved
case TINYINT: | ||
case SMALLINT: | ||
case INT64: | ||
dataType = new Int64(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the rational to fit everything on 64 bit? Seems as something that will lead to incorrect data.
Im not sure how these mappings are used, but if we allow 64 bit on some places that then we try to insert into the database into TinyInt, we might end with bit overflows, and we might not know the value on the DB will be wrong.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mapping is for mapping logical data types to BigQuery data types.
In Big Query, INT, SMALLINT, INTEGER, BIGINT, TINYINT, and BYTEINT are aliases for INT64.
So we have mapped all of them to INT64.
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_types
return FieldType.builder().dataType(DataType.NUMERIC).length(columnSize).scale(decimalDigits).build(); | ||
case FLOAT64: | ||
case FLOAT: | ||
return FieldType.builder().dataType(DataType.FLOAT).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the same line as prev comment on INT64, FLOAT and FLAOT 64 are two different types that have different ranges of values they can store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BigQuery only supports FLOAT64, so we have mapped both these logical types to FLOAT64
What type of PR is this?
What does this PR do / why is it needed ?
Which issue(s) this PR fixes:
Fixes #
Other notes for reviewers:
Does this PR introduce a user-facing change?
NO