Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hive: Refactor TestHiveCatalog tests to use the core CatalogTests #8918

Merged
merged 1 commit into from
Dec 14, 2023

Conversation

nk1506
Copy link
Contributor

@nk1506 nk1506 commented Oct 25, 2023

Currently HiveTests are not using the Core catalog tests. Ideally unlike other catalog Nessie , JDBC, Hive should also extent the core CatalogTests.
Depends upon feedback other duplicate tests will be removed and remaining will be moved to this new class.

@@ -261,6 +261,12 @@ public void renameTable(TableIdentifier from, TableIdentifier originalTo) {
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RuntimeException("Interrupted in call to rename", e);
} catch (RuntimeException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you paste the complete callstack? I am wondering why it is not coming as AlreadyExistsException in line 254

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java.lang.AssertionError: Expecting actual throwable to be an instance of: org.apache.iceberg.exceptions.AlreadyExistsException but was: java.lang.RuntimeException: InvalidOperationException(message:new table newdb.table_renamed already exists) at org.apache.iceberg.relocated.com.google.common.base.Throwables.propagate(Throwables.java:231) at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:75) at org.apache.iceberg.hive.MetastoreUtil.alterTable(MetastoreUtil.java:75)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this change is needed now because CatalogTests is expecting that particular type of message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a quick look, I do not like this change.

Why is this RuntimeException?

This code tries to capture this case:

    } catch (AlreadyExistsException e) {
      throw new org.apache.iceberg.exceptions.AlreadyExistsException(
          "Table already exists: %s", to);
    }

Do we have checks for the other cases as well?
My first guess would be that MetastoreUtil.alterTable messes up the Hive exceptions, and maybe all of them are off?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also maybe the exceptions thrown are different with different Hive versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvary @nk1506 what's the outcome of this? Are we ok with the changes in HiveCatalog or not?
This PR is required for #8907

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my side:

  • I am not sure Remove redundant error propagation check.  #9143 is generally ok - maybe there were some reason behind the RuntimeException, which I am not aware of.
  • I prefer not to have to unwrap the RuntimeException in exception handling - We need to understand why it is needed, and remove the root cause

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvary @nastra is it a good idea for now to override the corresponding test cases at TestHiveCatalog. and remove this specific RuntimeException handling ?
Currently there is only one test cases which is failing testRenameTableDestinationTableAlreadyExists.
It is failing here.

With a backlog issue we can fix this RuntimeException issue.
WDYT ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok ignoring testRenameTableDestinationTableAlreadyExists() for TestHiveCatalog for now and revert the changes here and then open an issue to follow-up on improving this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created an issue to track the same. @nastra , Please assign this issue to me. Meantime, I will look into this.

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from 34be422 to e9668f8 Compare October 26, 2023 08:51
@nk1506 nk1506 changed the title Hive: Refactor HiveCatalog tests to use the core CatalogTests Hive: Refactor TestHiveCatalog tests to use the core CatalogTests Oct 26, 2023
@nk1506 nk1506 marked this pull request as ready for review October 26, 2023 08:52
@@ -261,6 +261,12 @@ public void renameTable(TableIdentifier from, TableIdentifier originalTo) {
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RuntimeException("Interrupted in call to rename", e);
} catch (RuntimeException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this change is needed now because CatalogTests is expecting that particular type of message?

/**
* Run all the tests from abstract of {@link CatalogTests}. Also, a few specific tests for HIVE too.
* There could be some duplicated tests that are already being covered with {@link CatalogTests}
* //TODO: remove duplicate tests with {@link CatalogTests}.Also use the DB/TABLE/SCHEMA from {@link
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please check and remove in this PR only? I don't think it should be a separate change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another patch in progress which will do these cleanup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

independently from the other PR we can remove the entire comment block here as I don't think we need it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nk1506 can you please go ahead and remove this entire comment block?

String location = temp.resolve("tbl").toString();

try {
Table table =
catalog
catalog()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you keep a local variable HiveCatalog catalog and initialize it in before(), you don't have to change to a method call everywhere;

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from e9668f8 to d7456e9 Compare October 29, 2023 05:30
@@ -859,7 +862,7 @@ private void removeNamespaceOwnershipAndVerify(
}

@Test
public void testDropNamespace() throws TException {
public void testDropNamespace() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's overriding with Base class testDropNamespace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to rename the method rather than override it, since it's testing custom logic here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per previous comment , I reverted the renaming.

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from d7456e9 to c9b0078 Compare October 29, 2023 05:45
@nk1506 nk1506 force-pushed the hive_tests_refactor branch 4 times, most recently from 530933f to 0c54560 Compare October 31, 2023 07:25
// in case of table already exists,
// Hive rename operation throws exception as
// java.lang.RuntimeException:InvalidOperationException(message:new table <> already exists)
if (e.getMessage().contains(String.format("new table %s already exists)", to))) {
Copy link
Contributor

@nastra nastra Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we should examine the exact cause rather than just looking at the exception message.

/*
* This meta-setup has been deprecated; use {@link HiveMetastoreExtension} instead.
* */
@Deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this could you please create a PR to remove this class?
Or in this one, if it is trivial...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pvary for reviewing this . Yes we have another change is in progress with all the tests related cleanup.
WIP patch

Copy link
Contributor

@nastra nastra Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nk1506 can you please open a proper PR against the Iceberg repo with those changes where subclasses of HiveMetastoreTest are moved to JUnit5?
That being said, I don't think the deprecation here is necessary, since HiveMetastoreTest could also be seen as a Junit4-thing, whereas the HiveMetastoreExtension is what you would use for JUnit5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a PR with jUnit5.

@nk1506 nk1506 force-pushed the hive_tests_refactor branch 2 times, most recently from 5fdb2f5 to 13e4a4f Compare November 25, 2023 02:42
@nk1506 nk1506 force-pushed the hive_tests_refactor branch 2 times, most recently from 59d89a4 to 768d7c6 Compare December 12, 2023 18:15
Copy link
Member

@ajantha-bhat ajantha-bhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have a less disruptive change by having a new constructor for HiveMetastoreExtension that doesn't take DB_NAME and skips DB creation if it is not set.

Only use the new constructor for TestHiveCatalog and manually recreate DB in beforeEach()

@nk1506
Copy link
Contributor Author

nk1506 commented Dec 13, 2023

We can have a less disruptive change by having a new constructor for HiveMetastoreExtension that doesn't take DB_NAME and skips DB creation if it is not set.

Only use the new constructor for TestHiveCatalog and manually recreate DB in beforeEach()

I was in the impression to make HiveMetastoreExtension specific to only metastore setup and teardown. All the clients of HiveMetastoreExtension should have responsibility to manage databases and tables.

With overloaded constructor it will have hybrid flavour.

Please share your thoughts.

@nastra
Copy link
Contributor

nastra commented Dec 13, 2023

I agree that creating the database inside the extension should be made configurable

@nastra
Copy link
Contributor

nastra commented Dec 13, 2023

@nk1506 @ajantha-bhat I've opened #9288 to make the extension configurable.


@ParameterizedTest
@ValueSource(ints = {1, 2})
public void testReplaceTxnBuilder(int formatVersion) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testReplaceTxnBuilder -> testReplaceTransaction

@@ -494,21 +356,6 @@ private void createNamespaceAndVerifyOwnership(
assertThat(db.getOwnerType()).isEqualTo(expectedOwnerType);
}

@Test
public void testListNamespace() throws TException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testListNamespace -> testListNamespaces .

@@ -535,30 +382,6 @@ public void testNamespaceExists() throws TException {
.isFalse();
}

@Test
public void testSetNamespaceProperties() throws TException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testSetNamespaceProperties -> testSetNamespaceProperties

@@ -740,27 +563,6 @@ private void setNamespaceOwnershipAndVerify(
assertThat(database.getOwnerType()).isEqualTo(expectedOwnerTypePostSet);
}

@Test
public void testRemoveNamespaceProperties() throws TException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testRemoveNamespaceProperties -> testRemoveNamespaceProperties

@@ -1210,35 +1012,4 @@ public void testDatabaseLocationWithSlashInWarehouseDir() {

assertThat(database.getLocationUri()).isEqualTo("s3://bucket/database.db");
}

@Test
public void testRegisterTable() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testRegisterTable -> testRegisterTable

}

@Test
public void testRegisterExistingTable() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testRegisterExistingTable -> testRegisterExistingTable

@nk1506
Copy link
Contributor Author

nk1506 commented Dec 13, 2023

Thanks @ajantha-bhat and @nastra for recommending configurable approach for HiveMetastoreExtension and fixing this. Although I had to make a small fix to extend CatalogTests.

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from 51a616b to 3f69333 Compare December 13, 2023 09:47
Copy link
Member

@ajantha-bhat ajantha-bhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just have to conclude #8918 (comment)

May be ask in #dev slack channel or in a mailing list.
We can work on it even as a follow up IMO.

rest of the change LGTM.

catalog.createNamespace(ns);
}

Assertions.assertThat(catalog.tableExists(TABLE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a TODO and mention the issue that addresses this. Also it's not clear what exactly is different from the test in CatalogTests, so please add a comment

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from e95be19 to 7174c7f Compare December 13, 2023 15:08
Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @pvary can you also take a final look on this please?

Assertions.assertThat(catalog.tableExists(renamedTable))
.as("Destination table should exist after create")
.isTrue();
// With fix of issues#9289,it should match with CatalogTests and expect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be good to add a newline before this line for better readability

@pvary
Copy link
Contributor

pvary commented Dec 14, 2023

Thanks for all the work you put into this @nk1506 and @nastra!

@nk1506 nk1506 force-pushed the hive_tests_refactor branch from 7174c7f to e9722db Compare December 14, 2023 14:57
@nastra nastra merged commit 5487c17 into apache:main Dec 14, 2023
41 checks passed
lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants