Data misplaced when reading a table that does not have the same field positions as the spark schema #367
It seems that the data has been written into the table correctly. What happens when you read the table without specifying the schema? Also notice that the order of the …
I purposely changed the order of the attributes to reproduce the problem I had on our project: the connector does not use the field names to assign values, only their positions. Without specifying a schema, we get this:
I updated the question to be more precise.
If the schema here is a nested struct, then I think we have a gap in the conversion logic: it only rearranges top-level columns. I'm not sure whether Spark actually passes down the right metadata to correct this problem, though.
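A minimal Python model of that gap may help (the connector itself is written in Scala/Java; all names and data here are invented for illustration). Rows arrive as positional tuples in BigQuery schema order, and Spark then interprets each tuple positionally against its own schema, so reordering only the top level leaves nested struct values in BigQuery order:

```python
# Hypothetical schemas: each field is (name, child_schema_or_None).
bq_schema = [("nested", [("int1", None), ("int2", None), ("int3", None)])]
spark_schema = [("nested", [("int3", None), ("int1", None), ("int2", None)])]

# One row as delivered by BigQuery: nested struct in table order
# (int1=1, int2=2, int3=3), represented as a positional tuple.
bq_row = ((1, 2, 3),)

def convert(value, bq_fields, spark_fields, recurse):
    """Reorder positional values from bq_fields order into spark_fields order.

    With recurse=False this mimics the pre-fix behaviour: top-level columns
    are matched by name, but nested tuples are copied through unchanged.
    """
    pos = {name: i for i, (name, _) in enumerate(bq_fields)}
    children = dict(bq_fields)
    out = []
    for name, child in spark_fields:
        v = value[pos[name]]
        if recurse and child is not None:
            v = convert(v, children[name], child, recurse)
        out.append(v)
    return tuple(out)

print(convert(bq_row, bq_schema, spark_schema, recurse=False))
# ((1, 2, 3),) -> Spark reads this struct as int3=1, int1=2, int2=3 (misplaced)
print(convert(bq_row, bq_schema, spark_schema, recurse=True))
# ((3, 1, 2),) -> Spark reads int3=3, int1=1, int2=2 (correct)
```

The fix is the `recurse=True` path: the name-based reordering must be applied at every nesting level, not just at depth zero.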
We indeed have an issue with nested structs.
… the column order of struct variables need not be the same as that of the BQ schema
Fixed by PR #391.
I am trying to create a Dataset from a BigQuery table. The table has the same fields as the case class, but not in the same order. When creating the Dataset, columns get mapped to the wrong fields.
Given this table:
When loading the dataset:
NB: notice that NestedClass has the same fields as the table, but in a different order: (int3, int1, int2) instead of (int1, int2, int3).
We got this:
We expected to get this:
=> The connector does not use field names to assign values; it correlates a field's position in the case class with the field at the same position in the table.
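The difference can be sketched in a few lines of Python (an illustrative model, not the connector's real code, which is Scala/Java), using the `int1`/`int2`/`int3` fields from the example above:

```python
# Values arrive from BigQuery in table order: int1=1, int2=2, int3=3.
bq_values = [1, 2, 3]
bq_fields = ["int1", "int2", "int3"]          # table column order
case_class_fields = ["int3", "int1", "int2"]  # NestedClass declaration order

def assign_by_position(values, target_fields):
    """Buggy behaviour: the i-th value goes to the i-th field, names ignored."""
    return dict(zip(target_fields, values))

def assign_by_name(values, source_fields, target_fields):
    """Correct behaviour: each target field looks its value up by name."""
    by_name = dict(zip(source_fields, values))
    return {f: by_name[f] for f in target_fields}

print(assign_by_position(bq_values, case_class_fields))
# {'int3': 1, 'int1': 2, 'int2': 3} -> int3 wrongly holds int1's value
print(assign_by_name(bq_values, bq_fields, case_class_fields))
# {'int3': 3, 'int1': 1, 'int2': 2} -> every field gets its own value
```

As long as the mapping is positional, any table whose column order differs from the case-class declaration order will silently produce misplaced data.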