Data type macros #380

Merged
merged 4 commits into from
Jul 6, 2022


jtcohen6
Contributor

@cla-bot cla-bot bot added the cla:yes label Jun 30, 2022

jtcohen6 commented Jun 30, 2022

TestTypeInt and TestTypeFloat were failing because, when loading seeds, we infer numeric columns with decimals as double and numeric columns without decimals as bigint:

@classmethod
def convert_number_type(cls, agate_table, col_idx):
    decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
    return "double" if decimals else "bigint"

The type macros, by contrast, return float and int, respectively. All the more reason to reconcile the convert_{X}_type methods (used for agate) with the values returned by Column.translate_type (dbt-labs/dbt-core#5317). We could accomplish that within this adapter by setting up type translation for SparkColumn: int → bigint and float → double.
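To make the mismatch concrete, here is a minimal, self-contained sketch of the two sides being reconciled. It does not use agate or dbt: `max_decimal_places` is only a rough stand-in for `agate.MaxPrecision`, and `infer_number_type`, `SPARK_TYPE_TRANSLATION`, and `translate_type` are hypothetical names chosen for illustration, not the adapter's actual API.

```python
from decimal import Decimal

# Hypothetical translation table: generic type-macro names on the left,
# the Spark types that seed inference currently produces on the right.
SPARK_TYPE_TRANSLATION = {"int": "bigint", "float": "double"}


def max_decimal_places(values):
    """Rough stand-in for agate.MaxPrecision: the largest number of
    digits after the decimal point among the column's finite values."""
    places = 0
    for value in values:
        if value is None:
            continue  # treat missing cells as having no decimals
        exponent = Decimal(str(value)).normalize().as_tuple().exponent
        if exponent < 0:
            places = max(places, -exponent)
    return places


def infer_number_type(values):
    """Mirrors the convert_number_type logic quoted above."""
    return "double" if max_decimal_places(values) else "bigint"


def translate_type(dtype):
    """Sketch of the proposed translation; unknown types pass through."""
    return SPARK_TYPE_TRANSLATION.get(dtype.lower(), dtype)


# With translation in place, the macro's type and the inferred type agree.
assert translate_type("int") == infer_number_type(["1", "2", "3"])
assert translate_type("float") == infer_number_type(["1.5", "2"])
```

Without the translation step, the comparison would be `"int"` against `"bigint"` and `"float"` against `"double"`, which is the kind of disagreement described above.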

For now, skip inference and just set the types we want in each seed.
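As an illustration of what setting the types per seed could look like, the sketch below spells out seed properties as a Python dictionary, assuming dbt's seed `column_types` config is the mechanism used. The seed name `expected`, the column names, and the `declared_type` helper are hypothetical; the actual fixtures in this PR may differ.

```python
# Hypothetical seed properties: the Python equivalent of a seeds YAML file.
# Declaring column_types means the loader does not have to infer them.
SEED_PROPERTIES = {
    "version": 2,
    "seeds": [
        {
            "name": "expected",  # hypothetical seed name
            "config": {
                "column_types": {
                    "int_col": "int",      # would otherwise load as bigint
                    "float_col": "float",  # would otherwise load as double
                }
            },
        }
    ],
}


def declared_type(properties, seed_name, column):
    """Return the explicitly declared type for a seed column, or None
    if the column's type is left to inference."""
    for seed in properties.get("seeds", []):
        if seed.get("name") == seed_name:
            column_types = seed.get("config", {}).get("column_types", {})
            return column_types.get(column)
    return None
```

A column that is absent from `column_types` returns `None` here, meaning its type would still be inferred at load time.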

@jtcohen6 jtcohen6 marked this pull request as ready for review June 30, 2022 18:04
@jtcohen6 jtcohen6 requested a review from dbeatty10 June 30, 2022 18:04
Contributor

@dbeatty10 dbeatty10 left a comment

👍

Comment on lines +18 to +19
# need to explicitly cast this to avoid it being inferred/loaded as a DOUBLE on Spark
# in SparkSQL, the two are equivalent for `=` comparison, but distinct for EXCEPT comparison
Contributor

That is unfortunate behavior.

Contributor Author

Agreed — the distinction between = and EXCEPT doesn't make much sense to me, but I'm sure there is a reason (or was, at some point)

@dbeatty10 dbeatty10 merged commit f284cde into main Jul 6, 2022
@dbeatty10 dbeatty10 deleted the jerco/data-type-macros branch July 6, 2022 11:42
ueshin added a commit to databricks/dbt-databricks that referenced this pull request Jul 18, 2022
### Description

Ports tests in the upstream: Data type macros (dbt-labs/dbt-spark#380)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
francescomucio pushed a commit to francescomucio/dbt-spark that referenced this pull request Jul 26, 2022
* Run tests for data type macros. Fine tune numeric_type

* Hard code seed loading types for float + int

* Repoint, fixup, changelog entry