[SPARK-10186][SQL][WIP] Array types using JDBCRDD and postgres #9137
Conversation
Can one of the admins verify this patch?
Force-pushed 617b1cb to 4ebe7d6.
[SPARK-10186][SQL][WIP] Array types using JDBCRDD and postgres: This change allows reading from JDBC array column types for the PostgreSQL dialect. This also opens up some of the implementation for array types using other JDBC backends.
Force-pushed 617b1cb to 72beea6.
Still need to add some additional types from
missed some underscores here?
Very cool! Nice refactoring too! Thanks for picking this up, I haven't had time to revisit. Would you mind adding 'uuid' as
Added array type writes
Sure. Had to refactor a little to work around type erasure warnings.
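The type-erasure issue mentioned here can be shown with a short, self-contained sketch (the function below is hypothetical illustration, not code from this PR): on the JVM the element type of a generic collection is erased at runtime, so a pattern like `case xs: Seq[Int]` triggers an "unchecked" warning; dispatching on the runtime class of an element sidesteps that.

```scala
// Hypothetical illustration, not code from this PR. Matching on
// `case xs: Seq[Int]` would compile with an "unchecked" warning because
// the element type is erased at runtime; inspecting the runtime class
// of a single element avoids the warning.
def describe(xs: Seq[Any]): String = xs.headOption match {
  case Some(_: Int)    => "int array"
  case Some(_: String) => "string array"
  case _               => "unknown or empty"
}
```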
I'll add tests once [SPARK-9818] #8101 is merged in.
ISTM we need to check whether the input types are valid for the target database in advance, e.g., in JdbcUtils#saveTable. JdbcUtils#savePartition should then simply write the input data with its given types.
If the particular dialect does not support these types, saveTable should throw an exception when building the nullTypes array.
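A minimal sketch of that fail-fast idea (all names here are illustrative stand-ins, not Spark's actual internals — `ColumnType` plays the role of a Catalyst DataType and `jdbcNullType` the dialect lookup): resolving every column's JDBC type before writing any rows means an unsupported type aborts the whole save up front.

```scala
// Illustrative sketch only; ColumnType stands in for Spark's Catalyst
// DataType, and jdbcNullType for the dialect's type lookup in saveTable.
sealed trait ColumnType
case object IntColumn extends ColumnType
case class ArrayColumn(element: ColumnType) extends ColumnType

def jdbcNullType(supportsArrays: Boolean)(t: ColumnType): Int = t match {
  case IntColumn => java.sql.Types.INTEGER
  case ArrayColumn(_) if supportsArrays => java.sql.Types.ARRAY
  case unsupported =>
    // Fail before any rows are written, as suggested above.
    throw new IllegalArgumentException(s"Can't translate $unsupported to a JDBC type")
}

// Building the nullTypes array eagerly surfaces the error up front.
def nullTypes(supportsArrays: Boolean, schema: Seq[ColumnType]): Array[Int] =
  schema.map(jdbcNullType(supportsArrays)).toArray
```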
Great work! I left some review comments.
I also need to rebase this thing against master again, it seems.
Is the best approach to rebase, or just merge master into this and resolve conflicts?
@JoshRosen Guess it's refactor time due to SPARK-11541. That makes it rather hard if we ever want to add support in 1.5.x.
Conflicts: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
We just merged #9503, so it should now be possible to add integration tests which run against a real Postgres database. Take a look at my PR for examples of where to add these tests.
Add ARRAY support to `PostgresDialect`. Nested ARRAY is not allowed for now because it's hard to get the array dimension info. See http://stackoverflow.com/questions/16619113/how-to-get-array-base-type-in-postgres-via-jdbc
Thanks for the initial work from mariusvniekerk! Close #9137.
Author: Wenchen Fan <wenchen@databricks.com>
Closes #9662 from cloud-fan/postgre.
(cherry picked from commit d925149)
Signed-off-by: Michael Armbrust <michael@databricks.com>
This change allows reading from JDBC array column types for the PostgreSQL dialect. This also opens up some of the implementation for array types using other JDBC backends.
@JoshRosen @marmbrus
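As a rough, self-contained sketch of the mapping this PR is after (the small ADT below stands in for Spark's Catalyst DataType, and the object is not the real `PostgresDialect`): Postgres reports an array column's JDBC type name as the element type prefixed with an underscore, e.g. `"_int4"` for an `int4[]` column, so the dialect can recover the element type by stripping that prefix when the JDBC type is `ARRAY`.

```scala
// Self-contained sketch; SqlType stands in for Spark's Catalyst
// DataType, and the underscore convention is Postgres-specific.
sealed trait SqlType
case object IntegerT extends SqlType
case object StringT extends SqlType
case class ArrayT(element: SqlType) extends SqlType

object PostgresDialectSketch {
  // Map a bare Postgres type name to a sketch type (partial on purpose).
  private def toSqlType(name: String): Option[SqlType] = name match {
    case "int4" => Some(IntegerT)
    case "text" | "varchar" => Some(StringT)
    case _ => None
  }

  // For ARRAY columns, Postgres reports the element type name with a
  // leading underscore (e.g. "_int4" for int4[]); strip it to map the
  // element type, and fall back to the scalar mapping otherwise.
  def getCatalystType(sqlType: Int, typeName: String): Option[SqlType] =
    if (sqlType == java.sql.Types.ARRAY && typeName.startsWith("_"))
      toSqlType(typeName.drop(1)).map(ArrayT)
    else
      toSqlType(typeName)
}
```

Returning `None` for an unrecognized element type lets the caller fall back to its default mapping, mirroring how unsupported types are handled elsewhere in the discussion above.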