Bump to Pulsar 2.8.2 #190

Closed
wants to merge 3 commits into from

315157973

Motivation

Bump to Pulsar 2.8.2

@codelipenghui Please help push the Docker images to Docker Hub, and then review this PR. Thanks.

@315157973 315157973 self-assigned this Jan 1, 2022
Member

@lhotari lhotari left a comment

PR #180 should be merged before this change.

@lhotari lhotari force-pushed the bump-to-282 branch 2 times, most recently from 3a09b3d to f136d14 Compare January 3, 2022 11:12
@lhotari
Member

lhotari commented Jan 3, 2022

It seems that #188 should also be resolved before switching to Pulsar 2.8.x.

@lhotari
Member

lhotari commented Jan 3, 2022

The CI failures point to #188. I'll push a fix to this PR.

Unrecognized VM option 'PrintGCTimeStamps'                    
Error: Could not create the Java Virtual Machine.             
Error: A fatal exception has occurred. Program will exit.     
Unrecognized VM option 'PrintGCTimeStamps'                    
Error: Could not create the Java Virtual Machine.             
Error: A fatal exception has occurred. Program will exit.     
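
For illustration only, here is a minimal Python sketch of one way to drop JVM flags that a newer Java runtime rejects before they reach the `java` command line. It is not the fix pushed to this PR. The function name `strip_removed_flags` and the `REMOVED_GC_FLAGS` set are hypothetical; the set contains only `PrintGCTimeStamps` because that is the only flag the log above confirms as unrecognized.

```python
import shlex

# Assumption: only PrintGCTimeStamps is confirmed by the CI log above.
# Extend this set with any other flags your JVM reports as unrecognized.
REMOVED_GC_FLAGS = frozenset({"PrintGCTimeStamps"})


def strip_removed_flags(jvm_opts: str, removed=REMOVED_GC_FLAGS) -> str:
    """Return jvm_opts without -XX options whose flag name is in `removed`."""
    kept = []
    for opt in shlex.split(jvm_opts):
        name = None
        if opt.startswith("-XX:"):
            # Drop the "-XX:" prefix, the optional +/- toggle, and any "=value".
            name = opt[4:].lstrip("+-").split("=", 1)[0]
        if name in removed:
            continue  # skip flags the target JVM would reject
        kept.append(opt)
    return shlex.join(kept)


if __name__ == "__main__":
    print(strip_removed_flags("-Xms64m -XX:+PrintGCTimeStamps -XX:+UseG1GC"))
```

Filtering by flag name rather than by exact string means both the `-XX:+Flag` and `-XX:-Flag` spellings are removed.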

Member

@lhotari lhotari left a comment

Postpone merging until there's a decision on whether we should go forward with switching to Pulsar 2.8.x in the Helm chart, given a known issue with Zookeeper (https://issues.apache.org/jira/browse/ZOOKEEPER-3988 / apache/pulsar#11070).

Mailing list discussion: https://lists.apache.org/thread/619tpn6q5xbbhngwsmhtq3121vhjxpt4

@eolivelli
Contributor

Also, why should we ship 2.8 as default instead of 2.9.1?

@lhotari
Member

lhotari commented Jan 4, 2022

> Also, why should we ship 2.8 as default instead of 2.9.1?

I guess it could be about doing one step at a time (2.7.x -> 2.8.x, then 2.8.x->2.9.x) and also about stability.

@lhotari
Member

lhotari commented Jan 4, 2022

I created PR #195, which adds better logging to CI to help investigate CI failures. I have observed the "ZK TLS Only" CI job failing, presumably because of the known problem. However, it would be nice to see the logs too once #195 is reviewed and merged.

@ckdarby

ckdarby commented Jan 4, 2022

> > Also, why should we ship 2.8 as default instead of 2.9.1?
>
> I guess it could be about doing one step at a time (2.7.x -> 2.8.x, then 2.8.x->2.9.x) and also about stability.

It would also be community-friendly to publish a Helm chart release that defaults to 2.8.x before one that defaults to 2.9.x, so that the community can run 2.8 from a Helm chart without having to override image tags.

This was referenced Jan 5, 2022
@lhotari
Member

lhotari commented Jan 17, 2022

I have created #202 as a workaround for the Zookeeper issue (when TLS is enabled).

@lhotari
Member

lhotari commented Jan 26, 2022

It's possible that the Zookeeper issue is simply caused by the probe getting stuck.
The changes in #179 fix the issue for Kubernetes 1.20+. I'll send a separate PR to address the issue for Kubernetes <1.20.
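
As a hedged illustration of the "probe getting stuck" failure mode, the sketch below shows one generic way to bound a probe command with its own deadline, so a hung command is reported as a failure instead of blocking forever. This is not the change made in #179; the helper name `run_probe` and the commands in the usage example are hypothetical.

```python
import subprocess
import sys


def run_probe(cmd, timeout_s: float) -> bool:
    """Return True only if `cmd` exits with status 0 within `timeout_s` seconds."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before re-raising,
        # so a hung probe does not leave a process behind.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical probe command that hangs; it is treated as a failure.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_probe(hung, timeout_s=1.0))
```

Enforcing the deadline inside the probe itself means the result does not depend on whether the surrounding platform enforces a timeout.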

@lhotari
Member

lhotari commented Jan 26, 2022

Rebased after #214 changes. Let's see if the Zookeeper TLS tests pass now.

@lhotari
Member

lhotari commented Jan 26, 2022

All tests pass now. I'll report this on the dev mailing list thread.

@lhotari
Member

lhotari commented Jan 26, 2022

Closing and re-opening to run the tests one more time to see that the problem is fixed.

@lhotari lhotari closed this Jan 26, 2022
@lhotari lhotari reopened this Jan 26, 2022
Member

@lhotari lhotari left a comment

Problems with ZK & TLS aren't resolved yet.

@lhotari
Member

lhotari commented Jan 26, 2022

I'm hoping to get more logs from the failure after #215 changes are in place in CI.

@lhotari
Member

lhotari commented Jan 27, 2022

I finally got some logs from CI. It looks like the metadata initialization job is stuck. I'm adding timeouts in #218 to see if that could help mitigate the issue.
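
To make the timeout idea concrete, here is a minimal sketch of waiting for a TCP endpoint with an overall deadline instead of retrying indefinitely. It is an illustration under stated assumptions and not the content of #218: the function name `wait_for_port`, its parameters, and the host and port in the usage example are all hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float, interval_s: float = 0.5) -> bool:
    """Poll host:port until a TCP connection succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # give up instead of blocking the job forever
        try:
            # Cap each attempt so one slow connect cannot overrun the deadline.
            with socket.create_connection((host, port), timeout=min(remaining, interval_s)):
                return True
        except OSError:
            # Connection refused or timed out; pause briefly, then retry.
            pause = min(interval_s, max(0.0, deadline - time.monotonic()))
            time.sleep(pause)


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the service the job depends on.
    ok = wait_for_port("127.0.0.1", 2181, timeout_s=5.0)
    print("reachable" if ok else "timed out")
```

Using `time.monotonic()` for the deadline keeps the wait bounded even if the wall clock is adjusted while the job runs.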

I'm aware of a Zookeeper fix in the works for https://issues.apache.org/jira/browse/ZOOKEEPER-3988. I haven't been able to find signs of the ZOOKEEPER-3988 issue in the logs that were collected from the failing CI jobs.

I uploaded the relevant log files to a gist, https://gist.github.com/lhotari/d1bf977cfbd5f1fdbd942df9fef3c952

@eolivelli Could you check if the logs match ZOOKEEPER-3988?

To me it looks like another issue. I found https://issues.apache.org/jira/browse/ZOOKEEPER-3466, https://issues.apache.org/jira/browse/ZOOKEEPER-3828 and https://issues.apache.org/jira/browse/ZOOKEEPER-3706. I wonder if it is ZOOKEEPER-3706, fixed by apache/zookeeper#1235. @eolivelli Would it be possible to get this fix into the next Zookeeper release, so that we could use it for Pulsar 2.8.x and above?

@lhotari
Member

lhotari commented Feb 1, 2022

Promising changes from @frederic-kneier in apache/pulsar#14088. They might resolve the issue that has been blocking us from upgrading to Pulsar 2.8.x in the Helm chart.

@lhotari
Member

lhotari commented Feb 17, 2022

Rebased on latest master branch changes.

@frankjkelly
Contributor

@lhotari Are any of these changes (especially the JVM setting changes) backwards incompatible with Pulsar 2.7.x? Just wondering whether I can mix and match this 2.8.x Helm chart with a 2.7.x Pulsar image by overriding the image tags. Thanks!

@michaeljmarshall
Member

We've moved past 2.8, so I am going to close this PR.

Successfully merging this pull request may close these issues.

Unrecognized VM option 'PrintGCTimeStamps' - Unsupported Java 11 GC Flags causing init container fail
6 participants