Skip to content

KAFKA-12940: Enable JDK 16 builds in Jenkins#10702

Merged
ijuma merged 7 commits intoapache:trunkfrom
ijuma:java-16-jenkins
Jun 13, 2021
Merged

KAFKA-12940: Enable JDK 16 builds in Jenkins#10702
ijuma merged 7 commits intoapache:trunkfrom
ijuma:java-16-jenkins

Conversation

@ijuma
Copy link
Member

@ijuma ijuma commented May 15, 2021

JDK 15 no longer receives updates, so we want to switch from JDK 15 to JDK 16.
However, we have a number of tests that don't yet pass with JDK 16.

Instead of replacing JDK 15 with JDK 16, we have both for now and we either
disable (via annotations) or exclude (via gradle) the tests that don't pass with
JDK 16 yet. The annotations approach is better, but it doesn't work for tests
that rely on the PowerMock JUnit 4 runner.

Also add --illegal-access=permit when building with JDK 16 to make MiniKdc
work for now. This has been removed in JDK 17, so we'll have to figure out
another solution when we migrate to that.

Relevant JIRAs for the disabled tests: KAFKA-12790, KAFKA-12941, KAFKA-12942.

Moved some assertions from testTlsDefaults to testUnsupportedTlsVersion
since the former claims to test the success case while the former tests the failure case.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma ijuma force-pushed the java-16-jenkins branch 2 times, most recently from b09f32f to 643ba4c Compare May 15, 2021 13:26
@ijuma
Copy link
Member Author

ijuma commented May 15, 2021

@chia7712 One of the obstacles in switching to JDK 16 is that PowerMock tests are failing. There are a lot of such tests on the connect modules, but just one on the streams module: KafkaStreamsTest. If you have cycles and interest, removing the PowerMock usage would be handy.

@jlprat
Copy link
Contributor

jlprat commented May 31, 2021

For reference, there is an issue in PowerMock for this: powermock/powermock#1099

ijuma added 2 commits June 11, 2021 07:52
JDK 15 no longer receives updates.

`SslTransportLayer.testUnsupportedTlsVersion` fails with JDK 16, so it's
currently disabled with that version.
This makes some of MiniKdc tests work again.
@ijuma ijuma force-pushed the java-16-jenkins branch from a9efbb5 to c3534ec Compare June 11, 2021 16:34
@ijuma ijuma changed the title MINOR: Switch Jenkins from JDK 15 to JDK 16 MINOR: Enable JDK 16 builds in Jenkins Jun 11, 2021
@ijuma ijuma changed the title MINOR: Enable JDK 16 builds in Jenkins KAFKA-12940: Enable JDK 16 builds in Jenkins Jun 11, 2021
@ijuma ijuma marked this pull request as ready for review June 11, 2021 21:57
@ijuma ijuma requested a review from chia7712 June 11, 2021 21:57
@ijuma ijuma force-pushed the java-16-jenkins branch from b64ed3c to 2287890 Compare June 11, 2021 22:20
@ijuma ijuma force-pushed the java-16-jenkins branch from 2287890 to 905c252 Compare June 11, 2021 22:22
@ijuma
Copy link
Member Author

ijuma commented Jun 11, 2021

The last build had one unrelated JDK 16 failure:

Build / JDK 16 and Scala 2.13 / kafka.server.RaftClusterTest.testCreateClusterAndCreateAndManyTopics()

I fixed an issue affecting JDK 8 builds, so the next run should be good:

https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-10702/7/

PR ready to be reviewed.

@ijuma
Copy link
Member Author

ijuma commented Jun 12, 2021

Unrelated flaky failures:

Build / JDK 8 and Scala 2.12 / kafka.server.RaftClusterTest.testCreateClusterAndCreateAndManyTopicsWithManyPartitions() | Build / JDK 15 and Scala 2.13 / kafka.server.RaftClusterTest.testCreateClusterAndCreateListDeleteTopic()
Build / JDK 15 and Scala 2.13 / org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance
Build / JDK 16 and Scala 2.13 / org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance
Build / JDK 16 and Scala 2.13 / org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. a couple of trivial comments below.

Jenkinsfile Outdated
echo 'Skipping Kafka Streams archetype test for Java 15'
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant whitespace.

build.gradle Outdated
"**/AbstractHerderTest.*", "**/ConnectClusterStateImplTest.*", "**/ConnectorPluginsResourceTest.*",
"**/ConnectorsResourceTest.*", "**/DistributedHerderTest.*", "**/FileOffsetBakingStoreTest.*",
"**/ErrorHandlingTaskTest.*", "**/KafkaConfigBackingStoreTest.*", "**/KafkaOffsetBackingStoreTest.*",
"**/KafkaBasedLogTest.*", "**/OffsetStorageWriterTest.*", "**/StandaloneHerderTest.*",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant whitespace

@ijuma ijuma merged commit 530224e into apache:trunk Jun 13, 2021
@ijuma ijuma deleted the java-16-jenkins branch June 13, 2021 15:14
mjsax added a commit to confluentinc/kafka that referenced this pull request Jun 15, 2021
Resolve merge conflicts in Jenkins file.


* MINOR: clean up unneeded `@SuppressWarnings` (apache#10855)

Reviewers: Luke Chen <showuon@gmail.com>, Matthias J. Sax <mjsax@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>

* KAFKA-12940: Enable JDK 16 builds in Jenkins (apache#10702)

JDK 15 no longer receives updates, so we want to switch from JDK 15 to JDK 16.
However, we have a number of tests that don't yet pass with JDK 16.

Instead of replacing JDK 15 with JDK 16, we have both for now and we either
disable (via annotations) or exclude (via gradle) the tests that don't pass with
JDK 16 yet. The annotations approach is better, but it doesn't work for tests
that rely on the PowerMock JUnit 4 runner.

Also add `--illegal-access=permit` when building with JDK 16 to make MiniKdc
work for now. This has been removed in JDK 17, so we'll have to figure out
another solution when we migrate to that.

Relevant JIRAs for the disabled tests: KAFKA-12790, KAFKA-12941, KAFKA-12942.

Moved some assertions from `testTlsDefaults` to `testUnsupportedTlsVersion`
since the former claims to test the success case while the former tests the failure case.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

* KAFKA-12921: Upgrade zstd-jni to 1.5.0-2 (apache#10847)

This PR aims to upgrade `zstd-jni` from `1.4.9-1` to `1.5.0-2`.

This change will incorporate a number of bug fixes and performance improvements made in `1.5.0` of `zstd`:
- https://github.com/facebook/zstd/releases/tag/v1.5.0
- https://github.com/luben/zstd-jni/releases/tag/v1.5.0-1
- https://github.com/luben/zstd-jni/releases/tag/v1.5.0-2

The most recent `1.5.0` release offers +25%-140% (compression) and +15% (decompression) performance
improvements under certain conditions. Those conditions are unlikely to apply to Kafka with the default
configuration, however.

Since this is a dependency change, this should pass all the existing CIs.

Reviewers: Lee Dongjin <dongjin@apache.org>, Ismael Juma <ismael@juma.me.uk>

* KAFKA-8940: decrease session timeout to make test faster and reliable (apache#10871)

While there might still be some issue about the test as described here by @ableegoldman , but I found the reason why this test failed quite frequently recently. It's because we increased the session timeout to 45 sec in KIP-735.

The reason why increasing session timeout affected this test is because in this test, we will keep adding new stream clients and remove old one, to maintain only 3 stream clients alive. The problem here is, when old stream closed, we won't trigger rebalance immediately due to the stream clients are all static members as described in KIP-345, which means, we will trigger trigger group rebalance only when session.timeout expired. That said, when old client closed, we'll have at least 45 sec with some tasks not working.

Also, in this test, we have 2 timeout conditions to fail this test before verification passed:

1. 6 minutes timeout
2. polling 30 times (each with 5 seconds) without getting any data. (that is, 5 * 30 = 150 sec without consuming any data)

For (1), in my test under 45 session timeout, we'll create 8 stream clients, which means, we'll have 5 clients got closed. And each closed client need 45 sec to trigger rebalance, so we'll have 45 * 5 = 225 sec (~4 mins) of the time having some tasks not working.
For (2), during new client created and old client closed, it need some time to do rebalance. With 45 session timeout, we only got ~100 sec left. In slow jenkins env, it might reach the 30 retries without getting any data timeout.

Therefore, decreasing session timeout can make this test completes faster and more reliable.

Reviewers: Guozhang Wang <wangguoz@gmail.com>

* MINOR: enable EOS during smoke test IT (apache#10870)

This IT has been failing on trunk recently. Enabling EOS during the integration test
makes it easier to be sure that the test's assumptions are really true during verification
and should make the test more reliable.

I also noticed that in the actual system test file, we are using the deprecated property
name "beta" instead of "v2".

Reviewers: Boyang Chen <boyang@apache.org>

* MINOR: Log formatting for exceptions during configuration related operations (apache#10843)

Format configuration logging during exceptions or errors. Also make sure it redacts sensitive information or unknown values.

Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>

* KAFKA-12914: StreamSourceNode should return `null` topic name for pattern subscription (apache#10846)

Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <guozhang@confluent.io>

* KAFKA-12948: Remove node from ClusterConnectionStates.connectingNodes when node is removed (apache#10882)

NetworkClient.poll() throws IllegalStateException when checking isConnectionSetupTimeout if all nodes in ClusterConnectionStates.connectingNodes aren't present in ClusterConnectionStates.nodeState. This commit ensures that when we remove a node from nodeState, we also remove from connectingNodes.

Reviewers: David Jacot <djacot@confluent.io>

* KAFKA-12701: NPE in MetadataRequest when using topic IDs (apache#10584)

We prevent handling MetadataRequests where the topic name is null (to prevent NPE) as
well as prevent requests that set topic IDs since this functionality has not yet been
implemented. When we do implement it  in apache#9769,
we should bump the request/response version.

Added tests to ensure the error is thrown.

Reviewers: dengziming <swzmdeng@163.com>, Ismael Juma <ismael@juma.me.uk>

Co-authored-by: Josep Prat <josep.prat@aiven.io>
Co-authored-by: Ismael Juma <ismael@juma.me.uk>
Co-authored-by: David Christle <dchristle@users.noreply.github.com>
Co-authored-by: Luke Chen <showuon@gmail.com>
Co-authored-by: John Roesler <vvcephei@users.noreply.github.com>
Co-authored-by: YiDing-Duke <dingyi.zj@gmail.com>
Co-authored-by: Rajini Sivaram <rajinisivaram@googlemail.com>
Co-authored-by: Justine Olshan <jolshan@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants