-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-32585] Use flink-connector-pulsar shade pattern #23003
Conversation
Signed-off-by: tison <wander4096@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit wary about modifying existing entries; wouldn't that prevent the pulsar 2.x connector version from working against 1.18?
@zentol IIUC But I notice that change the relocate pattern may not fix the real class not found issue, so if it's proven I'll close this PR. |
Can you link some ticket or context for the problem? |
@zentol this is the related PR - apache/flink-connector-pulsar#54 I'm filing a ticket now. It seems without this new shade pattern tests always failed. According to your suggestions above, duplicating the before and after shade patterns can be a solution? |
Signed-off-by: tison <wander4096@gmail.com>
I'd love to get a summary of what works and what doesn't with what changes to the shade-plugin / ci-tools; I can't wrap my head around the current state. The PR description states that the original PR failed with a CNFE for a relocated netty class, using the original pulsar relocation. If some class is missing from pulsar then another round of relocations shouldn't make a difference unless you also bundle another netty version and relocate that as well. |
Unfortunately, it's complicated on this packing issue to explain in several words.
I checked the bundled 'flink-sql-connector-pulsar' JAR and it contains -
... also the pulsar-client-all 3.0
I don't know, actually. If a class loading issue happened, it can be a name resolution issue. Then I found it strange we shade "flink-connector-pulsar" classes as "org.apache.pulsar.shade" which can be conflicted with the Pulsar bundled one. Since the connector use its own classes, it should use its own shade patten in the beginning. Now the result is that without relocation, tests always failed. With the change, it can pass - apache/flink-connector-pulsar#54 Given the proposed shade pattern changed isn't a hack but what it should be initially done (IMO), it doesn't cause regressions. |
It seems the failure is related to other exceptions instead of CNFE. I'll check for details. It's strange that it's related to the change, but if an initial correct solution can resolve the issue, it is not a hack to owe some debts. |
From another aspect, if with the shade pattern changes we can avoid the issue, but we cannot update LicenseChecker so that we change the shade pattern particially (like apache/flink-connector-pulsar@2cb31bf change only the shaded io.netty can "fix" from the current status), it's a hack to me. |
The original patch failed seems on difference about OOM, perhaps retry helps. I don't know if we can extend the memory limit or it's something that the pulsar connector consume too much memory. |
Closing... Although I'd prefer to use the connector's shade pattern instead of the pulsar's one even without the CI failure, I can spend time to investigate the other issue now. |
The OOM is about class loading. The backtrace is:
... at least relocating don't help. |
This commit apache/flink-connector-pulsar@c78689c change the relocation strategy for flink-connector-pulsar to try to resolve class loading issue. As it doesn't introduce regression, I'd prefer to apply the new shade pattern to all the deps there for consistency.
If we have a suppression file for these checks it can be more smooth, but I'm OK with cross repo updating as long as it's temporary traffic and we can finally converge.
cc @zentol
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (yes / no)Documentation