-
Notifications
You must be signed in to change notification settings - Fork 350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#4514] improvement(iceberg): shrink catalog-lakehouse-iceberg libs size from 184MB to 56MB, IcebergRESTServer size from 162M to 70M #4548
Conversation
@FANNG1 can you please help to review this? |
@LiuQhahah thanks for the PR, I compile the package
|
implementation(project(":core")) { | ||
exclude("*") | ||
} | ||
implementation(project(":iceberg:iceberg-common")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some jars in :iceberg:iceberg-common
are critical, we couldn't exclude all the dependences, we could exclude extra jars in the build file of :iceberg:iceberg-common
@FANNG1 @LiuQhahah can we please move fast on this, this is an important PR, @FANNG1 can you please help @LiuQhahah on this to make it work soon. |
@LiuQhahah is there anything blocking you? |
I have updated the code. |
You exclude all dependences from
|
Hi @FANNG1 I try to update the file. thanks
|
could I continue exclude some jars based on your PR? |
f2c401e
to
fb6ebe7
Compare
|
|
The following jars can be exclude also:
|
Iceberg REST server could run on standalone mode, couldn't reuse the jars from Gravitino server, so the two jars seems necessary.
There are runtime dependences for the MapReduce client.
|
@jerryshao @LiuQhahah @yuqi1129 @caican00 please help to review again |
config.put("catalog.jdbc_backend.warehouse", "/tmp/usr/jdbc/warehouse"); | ||
config.put("catalog.jdbc_backend.jdbc.password", "gravitino"); | ||
config.put("catalog.jdbc_backend.jdbc.user", "gravitino"); | ||
config.put("catalog.jdbc_backend.jdbc-driver", "org.h2.Driver"); | ||
config.put("catalog.jdbc_backend.jdbc-driver", "org.sqlite.JDBC"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we use Sqlite instead of H2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no special reason, it has sqlite dependence
Can you please list all the jars for the Iceberg catalog and Iceberg rest service after you excluded jars? Besides, I think #4696 is also addressed here in this PR, can you please update the PR title and description? |
exclude(group = "org.junit.jupiter") | ||
} | ||
testImplementation(libs.bundles.jersey) | ||
testImplementation(libs.bundles.jetty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need jetty and jersey here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
exclude("org.mortbay.jetty") | ||
} | ||
// use hdfs-default.xml | ||
implementation(libs.hadoop2.hdfs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this, is hdfs-client enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I failed to exclude hadoop-hdfs
jar because we dependshdfs-default.xml
to initialize the related environment.
|
|
ffd5c83
to
0f05dd3
Compare
@@ -138,6 +138,7 @@ hive2-common = { group = "org.apache.hive", name = "hive-common", version.ref = | |||
hive2-jdbc = { group = "org.apache.hive", name = "hive-jdbc", version.ref = "hive2"} | |||
hadoop2-auth = { group = "org.apache.hadoop", name = "hadoop-auth", version.ref = "hadoop2" } | |||
hadoop2-hdfs = { group = "org.apache.hadoop", name = "hadoop-hdfs", version.ref = "hadoop2" } | |||
hadoop2-hdfs-client = { group = "org.apache.hadoop", name = "hadoop-hdfs-client", version.ref = "hadoop2" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you update the license.bin also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license.bin contains Apache Hadoop HDFS Client
already, but missing Apache Hadoop HDFS
, updated
@@ -83,6 +105,7 @@ dependencies { | |||
annotationProcessor(libs.lombok) | |||
compileOnly(libs.lombok) | |||
|
|||
testImplementation(project(":server-common")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we require "server-common" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because IcebergConfig overwites the default HTTP port
, we add a test to check whether it works in JettyServerConfig
, like:
public void testIcebergHttpPort() {
Map<String, String> properties = ImmutableMap.of();
IcebergConfig icebergConfig = new IcebergConfig(properties);
JettyServerConfig jettyServerConfig = JettyServerConfig.fromConfig(icebergConfig);
Assertions.assertEquals(
JettyServerConfig.DEFAULT_ICEBERG_REST_SERVICE_HTTP_PORT, jettyServerConfig.getHttpPort());
Assertions.assertEquals(
JettyServerConfig.DEFAULT_ICEBERG_REST_SERVICE_HTTPS_PORT,
jettyServerConfig.getHttpsPort());
4cc8f34
to
220a4d6
Compare
@FANNG1 can you make the CI pass? |
220a4d6
to
8259c85
Compare
What changes were proposed in this pull request?
remove some unnecessary dependencies for Iceberg REST server and Iceberg catalog
Why are the changes needed?
Fix: #4514 #4696
Does this PR introduce any user-facing change?
No
How was this patch tested?
CI passed