Add REST Catalog tests to Spark 3.5 integration test #11093
...5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java
@@ -521,7 +524,7 @@ public void testFilesTableTimeTravelWithSchemaEvolution() throws Exception {
            optional(3, "category", Types.StringType.get())));

    spark.createDataFrame(newRecords, newSparkSchema).coalesce(1).writeTo(tableName).append();

    table.refresh();
is this refresh only needed for REST?
Yes. RESTTableOperations and the other TableOperations implementations have different mechanisms for refreshing metadata.
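For readers following along, here is a minimal sketch of why an explicit refresh can matter when a table handle caches the metadata it last loaded. Everything in it (`FakeBackend`, `CachedTableHandle`, the integer version) is a hypothetical stand-in, not Iceberg's actual RESTTableOperations logic; it only illustrates the general stale-handle pattern.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RefreshSketch {
  // Hypothetical backend that tracks the latest committed metadata version.
  static final class FakeBackend {
    private final AtomicInteger version = new AtomicInteger(0);

    int currentVersion() {
      return version.get();
    }

    // Simulates a write made through some other path (e.g. an append from the engine).
    void commit() {
      version.incrementAndGet();
    }
  }

  // Hypothetical table handle that only sees new metadata after refresh() is called.
  static final class CachedTableHandle {
    private final FakeBackend backend;
    private int loadedVersion;

    CachedTableHandle(FakeBackend backend) {
      this.backend = backend;
      this.loadedVersion = backend.currentVersion();
    }

    int version() {
      return loadedVersion;
    }

    void refresh() {
      loadedVersion = backend.currentVersion();
    }
  }

  public static void main(String[] args) {
    FakeBackend backend = new FakeBackend();
    CachedTableHandle table = new CachedTableHandle(backend);

    backend.commit(); // the write happens outside the handle
    System.out.println("before refresh: " + table.version());

    table.refresh(); // the handle reloads the latest version
    System.out.println("after refresh: " + table.version());
  }
}
```

Whether a given catalog implementation needs the explicit call depends on how its table operations cache metadata, which is the difference the reply above points to.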
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java
// JdbcCatalog, then different jdbc connections could provide different views of table
// status even belonging to the same catalog. Reference:
// https://www.sqlite.org/inmemorydb.html
System.setProperty(CatalogProperties.CLIENT_POOL_SIZE, "1");
is this strictly needed to make tests pass? I don't think we set this in any other tests for that specific purpose
Yes, this is needed for the tests to pass. I ran the tests without this line, and here is the result: https://github.com/apache/iceberg/actions/runs/11297060489/job/31423193928?pr=11093
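A rough illustration of the reasoning in the code comment above: if every connection to an in-memory database owns its own private state, a pool with more than one connection can hand out a connection that has never seen an earlier write. The classes below (`PrivateConnection`, `RoundRobinPool`) are hypothetical stand-ins and use no JDBC or SQLite code; this is a sketch of the failure mode, not of the JdbcCatalog client pool.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PoolSizeSketch {
  // Hypothetical connection to an in-memory database: each one owns private state.
  static final class PrivateConnection {
    private final Set<String> tables = new HashSet<>();

    void createTable(String name) {
      tables.add(name);
    }

    boolean tableExists(String name) {
      return tables.contains(name);
    }
  }

  // Hypothetical pool that hands out its connections in round-robin order.
  static final class RoundRobinPool {
    private final List<PrivateConnection> connections = new ArrayList<>();
    private int next = 0;

    RoundRobinPool(int size) {
      for (int i = 0; i < size; i++) {
        connections.add(new PrivateConnection());
      }
    }

    PrivateConnection borrow() {
      PrivateConnection connection = connections.get(next);
      next = (next + 1) % connections.size();
      return connection;
    }
  }

  // Creates a table through one borrowed connection, then checks it through the next one.
  static boolean createThenCheck(int poolSize) {
    RoundRobinPool pool = new RoundRobinPool(poolSize);
    pool.borrow().createTable("t");
    return pool.borrow().tableExists("t");
  }

  public static void main(String[] args) {
    System.out.println("pool size 2 sees the table: " + createThenCheck(2)); // false
    System.out.println("pool size 1 sees the table: " + createThenCheck(1)); // true
  }
}
```

With a single connection, every operation goes through the same state, which is the effect the pool size of 1 is meant to achieve in the test setup.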
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java
System.setProperty(CatalogProperties.CLIENT_POOL_SIZE, "1");
restServer.start(false);
restCatalog = RCKUtils.initCatalogClient();
System.clearProperty("rest.port");
maybe we should pass a Map<String, String> to the REST server rather than having to set system properties (which then also need to be cleared again). @danielcweeks thoughts on this?
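To make the suggestion concrete, here is a small sketch of a server that takes its configuration as a `Map<String, String>` instead of reading JVM-wide system properties. `SketchServer` is a hypothetical class, not the real `RESTCatalogServer` API, and the default port of 8181 is an assumption made only for this example; the `rest.port` key is taken from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

class ServerConfigSketch {
  // Hypothetical server: all settings arrive through the constructor.
  static final class SketchServer {
    private final Map<String, String> config;

    SketchServer(Map<String, String> config) {
      // Defensive copy so later changes to the caller's map do not leak in.
      this.config = Map.copyOf(config);
    }

    int port() {
      // The default value is an assumption for this sketch.
      return Integer.parseInt(config.getOrDefault("rest.port", "8181"));
    }
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put("rest.port", "9999");

    SketchServer server = new SketchServer(config);
    System.out.println("configured port: " + server.port());

    // No System.setProperty / System.clearProperty pair is needed,
    // so nothing has to be cleaned up after the test.
  }
}
```

The design point is that configuration scoped to one object cannot leak into other tests running in the same JVM, whereas a system property stays set until someone remembers to clear it.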
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java
@BeforeEach
public void before() {
  this.validationCatalog =
what happens if we don't do any changes to how the validation catalog is configured?
You would end up with something like creating a table in the REST catalog while validating against the Hive catalog.
my point is that I think all tests should be passing when we don't do any changes to the validation catalog
If I understand correctly, the original validationCatalog is set to a HadoopCatalog if the catalogName of the catalog being tested is testhadoop; otherwise, validationCatalog is the same as catalog (which is strictly a HiveCatalog, as defined by the TestBase class).
That worked in the past, when only two types of catalog, Hadoop and Hive, were being tested: if the test subject was not a Hadoop catalog, setting the validation catalog to the Hive catalog was sufficient for validation. However, with the introduction of a third type, the REST catalog, this breaks down: when the test subject is a REST catalog and the initialization of validationCatalog is left unchanged, validationCatalog is set to a Hive catalog. In that case, running the test operations against the REST catalog while validating the resulting state against the Hive catalog won't work.
Unless you are suggesting that we change the TestBase class so that the catalog being tested does not strictly need to be a HiveCatalog and can be any type of catalog.
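The selection logic described in this reply can be sketched as follows. This is a simplified illustration with hypothetical names (`Kind`, `originalRule`, `adjustedRule`); it does not construct real catalogs, and keying the adjusted rule on a catalog type string is an assumption about how the catalog under test could be identified, not necessarily what the PR does.

```java
class ValidationCatalogSketch {
  // Which kind of catalog the validation catalog would be.
  enum Kind {
    HADOOP,
    HIVE,
    REST
  }

  // Original rule as described above: Hadoop for "testhadoop", Hive for everything else.
  static Kind originalRule(String catalogName) {
    return "testhadoop".equals(catalogName) ? Kind.HADOOP : Kind.HIVE;
  }

  // Adjusted rule: validate against the same kind of catalog as the one under test.
  // The type strings are assumptions for this sketch.
  static Kind adjustedRule(String catalogType) {
    switch (catalogType) {
      case "hadoop":
        return Kind.HADOOP;
      case "rest":
        return Kind.REST;
      default:
        return Kind.HIVE;
    }
  }

  public static void main(String[] args) {
    // Under the original rule, a REST catalog under test is validated against Hive.
    System.out.println("original rule for a REST test catalog: " + originalRule("testrest"));
    // Under the adjusted rule, it is validated against REST.
    System.out.println("adjusted rule for a REST test catalog: " + adjustedRule("rest"));
  }
}
```

The mismatch in the first case is the problem the reply describes: the table is created in one catalog and looked up in another.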
Map<String, String> catalogProperties = RCKUtils.environmentCatalogConfig();
Map<String, String> catalogProperties = Maps.newHashMap();
catalogProperties.putAll(RCKUtils.environmentCatalogConfig());
catalogProperties.putAll(Maps.fromProperties(System.getProperties()));
as mentioned in a comment further above, maybe we should consider passing a Map<String, String> to RESTCatalogServer rather than relying on system properties (which have to be cleared after they were configured)
Done.
that means we can revert this line here: catalogProperties.putAll(Maps.fromProperties(System.getProperties()));
Reverted.
// JdbcCatalog, then different jdbc connections could provide different views of table
// status even belonging to the same catalog. Reference:
// https://www.sqlite.org/inmemorydb.html
System.setProperty(CatalogProperties.CLIENT_POOL_SIZE, "1");
This conversation got lost (not sure why): #11093 (comment)
System.setProperty(CatalogProperties.CLIENT_POOL_SIZE, "1");
restServer.start(false);
restCatalog = RCKUtils.initCatalogClient();
System.clearProperty("rest.port");
This one was also lost: #11093 (comment)
As I mentioned in #11093 (comment) I think we should do this differently rather than passing/re-setting system properties
Done
…alogServer.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
…alogServer.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
…seWithCatalog.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
…seWithCatalog.java
@nastra, @danielcweeks is there something stopping us from merging this?
Thanks @haizhou-zhao for getting this done!
* Add REST Catalog tests to Spark 3.5 integration test Add REST Catalog tests to Spark 3.4 integration test tmp save Fix integ tests Revert "Add REST Catalog tests to Spark 3.4 integration test" This reverts commit d052416. unneeded changes fix test retrigger checks Fix integ test Fix port already in use Fix unmatched validation catalog spotless Fix sqlite related test failures
* Rebase & spotless
* code format
* unneeded change
* unneeded change
* Revert "unneeded change" This reverts commit ae29c41.
* code format
* Use in-mem config to configure RCK
* Update open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java
* Use RESTServerExtension
* check style and test failure
* test failure
* fix test
* fix test
* spotless
* Update open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
* Update open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
* Update spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
* Spotless and fix test
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java
* Package protected RCKUtils
* spotless
* unintentional change
* remove warehouse specification from rest
* spotless
* move find free port to rest server extension
* fix typo
* checkstyle
* fix unit test
---------
Co-authored-by: Haizhou Zhao <haizhouzhao@Haizhous-MacBook-Pro.local>
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
For issue: #11079