@mariusvniekerk
Member

This change allows reading JDBC array column types with the PostgreSQL dialect.

It also opens the way to implementing array types for other JDBC backends.

@JoshRosen @marmbrus

@AmplabJenkins

Can one of the admins verify this patch?

@mariusvniekerk mariusvniekerk force-pushed the SPARK-10186 branch 2 times, most recently from 617b1cb to 72beea6 Compare October 15, 2015 18:12
@mariusvniekerk
Member Author

missed some underscores here?

@lepfhty

lepfhty commented Oct 21, 2015

very cool! nice refactoring too! thanks for picking this up, i haven't had time to revisit.

would you mind adding 'uuid' as Some(StringType) to the PostgresDialect#getCatalystType method? it's probably a one-liner.
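A minimal sketch of the kind of mapping being discussed, not the PR's actual code. It assumes stand-in types rather than Spark's real classes: the `DataType` hierarchy, `PostgresTypeSketch`, `scalarTypeFor`, and `catalystTypeFor` are all hypothetical names. It illustrates two points from the thread: `uuid` surfaced as a string, and PostgreSQL array type names carrying a leading underscore (for example `_int4` for `int4[]`).

```scala
// Stand-in types for illustration only; Spark's real classes
// live in org.apache.spark.sql.types and are not used here.
sealed trait DataType
case object IntegerType extends DataType
case object LongType extends DataType
case object BooleanType extends DataType
case object StringType extends DataType
final case class ArrayType(elementType: DataType) extends DataType

object PostgresTypeSketch {
  // Hypothetical helper: map a PostgreSQL scalar type name to a stand-in type.
  def scalarTypeFor(typeName: String): Option[DataType] = typeName match {
    case "int4"                      => Some(IntegerType)
    case "int8"                      => Some(LongType)
    case "bool"                      => Some(BooleanType)
    case "text" | "varchar" | "uuid" => Some(StringType) // uuid read as a string
    case _                           => None             // unknown: let the caller decide
  }

  // Array type names carry a leading underscore, e.g. "_int4" for int4[].
  // Strip it, resolve the element type, and wrap the result in ArrayType.
  def catalystTypeFor(typeName: String): Option[DataType] =
    if (typeName.startsWith("_")) scalarTypeFor(typeName.drop(1)).map(ArrayType(_))
    else scalarTypeFor(typeName)
}
```

Returning `None` for unrecognised names leaves room for a default mapping elsewhere, which is the convention an `Option`-returning method such as `getCatalystType` suggests.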

@mariusvniekerk
Member Author

Sure. Had to refactor a little to work around type erasure warnings.

@mariusvniekerk
Member Author

I'll add tests once [SPARK-9818] #8101 is merged in

Member

ISTM we need to check in advance whether the input types are valid for the target database, e.g., in JdbcUtils#saveTable.
JdbcUtils#savePartition should then simply write the input data using the types it is given.

Member Author

If the particular dialect does not support these types, saveTable should throw an exception when building the nullTypes array.

@maropu
Member

maropu commented Oct 29, 2015

Great work! I left some review comments.
Since ARRAY is part of the SQL99 standard, it'd be better to put this type mapping in JdbcUtils#schemaString.
Yes, I know that some databases, e.g., MySQL, have no support for ARRAY, so those dialects need to throw an exception for unsupported types in JdbcDialect#getJDBCType.
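The suggestion above can be sketched roughly as follows. This is an illustration under stated assumptions, not Spark's implementation: `ColumnType`, `JdbcTypeSketch`, `DialectSketch`, `PostgresSketch`, `MySqlSketch`, and `SaveTableSketch.resolveAll` are hypothetical names, and the `supportsArrays` flag is invented for the example. The idea is that every column type is resolved through the dialect before any row is written, so an unsupported ARRAY fails early.

```scala
import java.sql.Types

// Stand-in column types for illustration only (not Spark's DataType classes).
sealed trait ColumnType
case object IntColumn extends ColumnType
case object StringColumn extends ColumnType
final case class ArrayColumn(element: ColumnType) extends ColumnType

// Hypothetical pair: the database type definition and its java.sql.Types code.
final case class JdbcTypeSketch(databaseTypeDefinition: String, jdbcTypeCode: Int)

trait DialectSketch {
  def name: String
  def supportsArrays: Boolean // assumption made for this sketch

  // Resolve a column type, throwing for anything the dialect cannot store.
  def jdbcTypeFor(t: ColumnType): JdbcTypeSketch = t match {
    case IntColumn    => JdbcTypeSketch("INTEGER", Types.INTEGER)
    case StringColumn => JdbcTypeSketch("TEXT", Types.VARCHAR)
    case ArrayColumn(element) if supportsArrays =>
      JdbcTypeSketch(jdbcTypeFor(element).databaseTypeDefinition + "[]", Types.ARRAY)
    case ArrayColumn(_) =>
      throw new IllegalArgumentException(s"$name does not support ARRAY columns")
  }
}

object PostgresSketch extends DialectSketch {
  val name = "postgresql"
  val supportsArrays = true
}

object MySqlSketch extends DialectSketch {
  val name = "mysql"
  val supportsArrays = false
}

object SaveTableSketch {
  // Resolve the whole schema up front; an unsupported type aborts here,
  // before any partition starts writing rows.
  def resolveAll(dialect: DialectSketch, schema: Seq[ColumnType]): Seq[JdbcTypeSketch] =
    schema.map(dialect.jdbcTypeFor)
}
```

With this shape, the per-partition write path never has to validate types itself; it can rely on the list resolved once at the start.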

@mariusvniekerk
Member Author

It seems I also need to rebase this against master again.

@mariusvniekerk
Member Author

Is the best approach to rebase or just merge master into this and resolve conflicts?

@mariusvniekerk
Member Author

@JoshRosen Guess it's refactor time due to SPARK-11541. That makes it rather hard if we ever want to add support in 1.5.x.

@mariusvniekerk mariusvniekerk changed the title from "[SPARK-10186][SQL] Array types using JDBCRDD and postgres" to "[SPARK-10186][SQL][WIP] Array types using JDBCRDD and postgres" Nov 6, 2015
Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
@JoshRosen
Contributor

We just merged #9503, so it should now be possible to add integration tests which run against a real Postgres database. Take a look at my PR for examples of where to add these tests.

asfgit pushed a commit that referenced this pull request Nov 17, 2015
Add ARRAY support to `PostgresDialect`.

Nested ARRAY is not allowed for now because it's hard to get the array dimension info. See http://stackoverflow.com/questions/16619113/how-to-get-array-base-type-in-postgres-via-jdbc

Thanks for the initial work from mariusvniekerk !

Close #9137

Author: Wenchen Fan <wenchen@databricks.com>

Closes #9662 from cloud-fan/postgre.

(cherry picked from commit d925149)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in d925149 Nov 17, 2015
kiszk pushed a commit to kiszk/spark-gpu that referenced this pull request Dec 26, 2015
@mariusvniekerk mariusvniekerk deleted the SPARK-10186 branch December 10, 2016 18:24