[fix][misc] Unable to connect an etcd metastore with recent releases due to jetc-core sharding problem #23604

Merged (3 commits) on Nov 16, 2024

@Shawyeok (Contributor) commented Nov 15, 2024

Fixes #23513

Recent Pulsar releases throw a NoClassDefFoundError when started with an etcd metadata store:

docker run -it --rm apachepulsar/pulsar:4.0.0 bin/pulsar standalone --metadata-url etcd:http://a-etcd:2379
docker run -it --rm apachepulsar/pulsar:3.0.6 bin/pulsar standalone --metadata-url etcd:http://a-etcd:2379

Motivation

The jetcd-core-shaded module was introduced in #22892 to address the compatibility issues between jetcd-core’s grpc-java dependency and Netty. You can find more details here and in the grpc-java documentation.

Currently, the unpack-shaded-jar execution unpacks the shaded jar produced by maven-shade-plugin:shade into the jetcd-core-shaded/target/classes directory. However, the classes in this directory conflict with the module's dependencies. If maven-shade-plugin:shade runs again without this directory being cleaned, it can produce an incorrect shaded jar. You can reproduce and verify this issue with the following commands:

# Step 1: Clean the build directory
mvn clean

# Step 2: Perform an install and unpack the shaded jar into a directory.
# Verify the import statement for `io.netty.handler.logging.ByteBufFormat` in 
# `org/apache/pulsar/jetcd/shaded/io/vertx/core/net/NetClientOptions.class`. 
# The correct import should be: 
# `import io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/pulsar/jetcd-core-shaded/4.1.0-SNAPSHOT/jetcd-core-shaded-4.1.0-SNAPSHOT-shaded.jar \
  -d jetcd-core-shaded/target/first-classes

# Step 3: Run the install command again without cleaning.
# The unpacked jar from the previous step will persist in `jetcd-core-shaded/target/classes`. 
# Unpack the shaded jar into a different directory (e.g., second-classes) and check the import.
# The incorrect import will be: 
# `import io.grpc.netty.shaded.io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/pulsar/jetcd-core-shaded/4.1.0-SNAPSHOT/jetcd-core-shaded-4.1.0-SNAPSHOT-shaded.jar \
  -d jetcd-core-shaded/target/second-classes

# Step 4: Use IntelliJ IDEA's "Compare Directories" tool to compare the `first-classes` 
# and `second-classes` directories. The differences in imports should become apparent.
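
Step 4 can also be done without an IDE. The following is a minimal sketch, not part of this patch: `compare_unpacked` is a hypothetical helper name, and it relies only on the standard `diff` utility to list the files that differ between the two unpacked directories.

```shell
# Hypothetical helper (not part of the patch): list files that differ between
# two unpacked copies of the shaded jar.
# Exit status follows diff: 0 = identical, 1 = differences found, 2 = error.
compare_unpacked() {
  # -r recurses into subdirectories; -q reports only which files differ.
  diff -rq "$1" "$2"
}

# Usage, run from the Pulsar source root after Steps 2 and 3:
#   compare_unpacked jetcd-core-shaded/target/first-classes \
#                    jetcd-core-shaded/target/second-classes
```

Any class file reported as differing can then be inspected (for example with `javap -v`) to see which references changed.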

A simpler solution is to remove the configurations related to attach and unpack. IntelliJ IDEA assumes the shaded jar path is target/${artifactId}-${version}.jar. However, in Pulsar’s build system, the finalName is set to just ${artifactId} in the parent pom.xml. While I’m unsure of the reasoning behind this setup, we can override the finalName in jetcd-core-shaded/pom.xml. This is the approach I’ve taken in this patch.

Verifying this change

This issue typically cannot be detected by CI tests, as CI environments always run in a clean workspace. To address this, we could refine our release guidelines to include a step for cleaning the workspace before deploying artifacts. Additionally, incorporating automated checks into the release validation process could help catch such issues early.
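
One possible shape for such an automated check is sketched below. It is an illustration rather than anything this patch adds: `check_double_relocation` and `DOUBLE_PREFIX` are hypothetical names, and the check assumes that the shaded jar has already been unpacked into a directory (for example with `unzip`, as in the steps above) and that a doubly relocated reference appears in class files as the literal string `io/grpc/netty/shaded/io/grpc/netty/shaded`. It has not been validated against an actual broken release artifact.

```shell
# Hypothetical release-validation helper (not part of the patch).
# Scans an unpacked shaded jar for files that contain the doubly relocated
# package prefix and prints their paths.
# Returns 1 if any offending file is found, 0 otherwise.
DOUBLE_PREFIX='io/grpc/netty/shaded/io/grpc/netty/shaded'

check_double_relocation() {
  classes_dir=$1
  # -r: recurse, -l: print only the names of matching files, -F: fixed string.
  offenders=$(grep -rlF "$DOUBLE_PREFIX" "$classes_dir" || true)
  if [ -n "$offenders" ]; then
    printf '%s\n' "$offenders"
    return 1
  fi
  return 0
}
```

A release script could call `check_double_relocation <unpacked-dir>` after building and abort when the status is non-zero.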

Matching PR in forked repository

PR in forked repository: Shawyeok#18

@lhotari (Member) commented Nov 15, 2024

> A simpler solution is to remove the configurations related to attach and unpack. IntelliJ IDEA assumes the shaded jar path is target/${artifactId}-${version}.jar. However, in Pulsar’s build system, the finalName is set to just ${artifactId} in the parent pom.xml. While I’m unsure of the reasoning behind this setup, we can override the finalName in jetcd-core-shaded/pom.xml. This is the approach I’ve taken in this patch.

Thank you, @Shawyeok! Great work. How did you find out that IntelliJ expects this format? Could we remove overriding of finalName in pom.xml so that we'd use the default everywhere? It seems that the finalName was overridden already in the initial commit of Pulsar.

@lhotari (Member) left a comment

LGTM

@lhotari (Member) commented Nov 15, 2024

@Shawyeok Please submit a similar change to BookKeeper since there's a similar solution to shade jetcd-core in metadata-drivers/jetcd-core-shaded/pom.xml.

@Shawyeok (Contributor, Author)

@lhotari

> How did you find out that IntelliJ expects this format?

There is a silent error in the pulsar-metadata module dependencies, which I discovered by coincidence.

[image]

> Could we remove overriding of finalName in pom.xml so that we'd use the default everywhere? It seems that the finalName was overridden already in the initial commit of Pulsar.

I tried once, but other configurations depend on the current finalName setting, such as:

<source>${basedir}/../../pulsar-functions/java-examples/target/pulsar-functions-api-examples.jar</source>

I wasn’t sure what was beneath the surface :-), so I opted for a conservative approach.

@Shawyeok (Contributor, Author)

> @Shawyeok Please submit a similar change to BookKeeper since there's a similar solution to shade jetcd-core in metadata-drivers/jetcd-core-shaded/pom.xml.

Sure, will do it tomorrow.

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.35%. Comparing base (bbc6224) to head (8cb145a).
Report is 731 commits behind head on master.

@@             Coverage Diff              @@
##             master   #23604      +/-   ##
============================================
+ Coverage     73.57%   74.35%   +0.78%     
- Complexity    32624    34443    +1819     
============================================
  Files          1877     1944      +67     
  Lines        139502   147127    +7625     
  Branches      15299    16225     +926     
============================================
+ Hits         102638   109398    +6760     
- Misses        28908    29293     +385     
- Partials       7956     8436     +480     
Flag Coverage Δ
inttests 27.57% <ø> (+2.98%) ⬆️
systests 24.40% <ø> (+0.08%) ⬆️
unittests 73.72% <ø> (+0.87%) ⬆️

see 656 files with indirect coverage changes

@lhotari lhotari merged commit 89ccb73 into apache:master Nov 16, 2024
53 of 57 checks passed
hezhangjian pushed a commit to apache/bookkeeper that referenced this pull request Nov 18, 2024
### Motivation

There is a potential jar shading issue introduced in #4426 that causes `NoClassDefFoundError` when connecting to an etcd metadata store.

The `jetcd-core-shaded` module was introduced in #4426 to address the compatibility issues between jetcd-core’s grpc-java dependency and Netty. You can find more details [here][1] and in the [grpc-java documentation][2].

[1]: #4426 (comment)
[2]: https://github.com/grpc/grpc-java/blob/master/SECURITY.md#netty

Currently, the `unpack-shaded-jar` execution unpacks the shaded jar produced by `maven-shade-plugin:shade` into the `jetcd-core-shaded/target/classes` directory. However, the classes in this directory conflict with the module's dependencies. If `maven-shade-plugin:shade` runs again without this directory being cleaned, it can produce an incorrect shaded jar. You can reproduce and verify this issue with the following commands:
```shell
# Step 1: Clean the build directory
mvn clean

# Step 2: Perform an install and unpack the shaded jar into a directory.
# Verify the import statement for `io.netty.handler.logging.ByteBufFormat` in 
# `org/apache/pulsar/jetcd/shaded/io/vertx/core/net/NetClientOptions.class`. 
# The correct import should be: 
# `import io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/bookkeeper/metadata/drivers/jetcd-core-shaded/4.18.0-SNAPSHOT/jetcd-core-shaded-4.18.0-SNAPSHOT-shaded.jar \
  -d metadata-drivers/jetcd-core-shaded/target/first-classes

# Step 3: Run the install command again without cleaning.
# The unpacked jar from the previous step will persist in `target/classes`. 
# Unpack the shaded jar into a different directory (e.g., second-classes) and check the import.
# The incorrect import will be: 
# `import io.grpc.netty.shaded.io.grpc.netty.shaded.io.netty.handler.logging.ByteBufFormat;`.
mvn install
unzip $M2_REPO/org/apache/bookkeeper/metadata/drivers/jetcd-core-shaded/4.18.0-SNAPSHOT/jetcd-core-shaded-4.18.0-SNAPSHOT-shaded.jar \
  -d metadata-drivers/jetcd-core-shaded/target/second-classes

# Step 4: Use IntelliJ IDEA's "Compare Directories" tool to compare the `first-classes` 
# and `second-classes` directories. The differences in imports should become apparent.
```

We can remove the attach and unpack configurations, and it should work fine.

This issue has already been [reported][3] in apache/pulsar, and a similar [patch][patch] has addressed it.

[3]: apache/pulsar#23513
[patch]: apache/pulsar#23604
lhotari pushed a commit that referenced this pull request Nov 18, 2024
…due to jetc-core sharding problem (#23604)

Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
(cherry picked from commit 89ccb73)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 20, 2024
…due to jetc-core sharding problem (apache#23604)

Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
(cherry picked from commit 89ccb73)
(cherry picked from commit 23f1ef0)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 21, 2024
…due to jetc-core sharding problem (apache#23604)

Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
(cherry picked from commit 89ccb73)
(cherry picked from commit 23f1ef0)