[For Review] Pt. 2 - Introduce TopologyMetadata to wrap InternalTopologyBuilders of named topologies#6
Conversation
…ity of life improvements (apache#10762) Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Konstantine Karantasis <k.karantasis@gmail.com>
…e#10710) Removes previously deprecated methods in older KIPs Reviewers: Bruno Cadonna <cadonna@apache.org>
…ceptionWithTimeout` (apache#10759) New parameters in overloaded methods should appear later, apart from lambdas, which should always be last.
…pache#10684)
* In Log.collectAbortedTransactions() I've restored previously used logic, such that it handles the case where the starting segment could be null. This was the behavior previously, but PR apache#10401 accidentally changed it, causing the code to assume that the starting segment won't be null.
* In Log.rebuildProducerState() I've removed usage of the allSegments local variable. The logic looks a bit simpler without it.
* I've introduced a new LogSegments.higherSegments() API. This is now used to make the logic a bit more readable in the Log.collectAbortedTransactions() and Log.deletableSegments() APIs.
* I've removed the unnecessary use of java.lang.Long in the LogSegments class' segments map definition.
* I've converted a few LogSegments APIs from public to private, as they need not be public.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Cong Ding <cong@ccding.com>, Jun Rao <junrao@gmail.com>
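A higherSegments-style lookup as described can be sketched over a sorted base-offset map; the class and method names below are illustrative, not Kafka's actual LogSegments API:

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a higherSegments-style API over a sorted
// base-offset -> segment map (names are illustrative, not Kafka's).
public class Segments {
    // Primitive-friendly sorted map keyed by segment base offset.
    private final ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();

    public void add(long baseOffset, String segment) {
        segments.put(baseOffset, segment);
    }

    // All segments whose base offset is strictly greater than the given
    // offset, in ascending base-offset order.
    public Collection<String> higherSegments(long offset) {
        return segments.tailMap(offset, false).values();
    }
}
```

A lookup like this lets callers such as collectAbortedTransactions() iterate only the segments after a starting offset instead of scanning all segments.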
2.10.x is no longer supported, so we should move to 2.12 for the 3.0 release. ScalaObjectMapper has been deprecated and it looks like we don't actually need it, so remove its usage. Reviewers: David Jacot <djacot@confluent.io>
… id clean-ups (apache#10761) Log if deletion fails and don't expose log topic id for mutability outside of `assignTopicId()`. Also remove an unnecessary parameter in `PartitionTest`. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
… KafkaProducer (apache#10704) Recently we have noticed multiple instances where KafkaProducers have failed to construct due to the following exception:
```
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
	at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:440)
	at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:291)
	at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:318)
	java.base/java.lang.Thread.run(Thread.java:832)
Caused by: java.util.ConcurrentModificationException
	at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1584)
	at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1607)
	at java.base/java.util.AbstractSet.removeAll(AbstractSet.java:171)
	at org.apache.kafka.common.config.AbstractConfig.unused(AbstractConfig.java:221)
	at org.apache.kafka.common.config.AbstractConfig.logUnused(AbstractConfig.java:379)
	at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:433)
	... 9 more
exception.class:org.apache.kafka.common.KafkaException
exception.message:Failed to construct kafka producer
```
This is because `used` below is a synchronized set, and it is being modified while removeAll is being called. The modification comes from the use of RecordingMap in the Sender thread (see below). Switching to a ConcurrentHashSet avoids this issue, as it supports concurrent iteration.
```
at org.apache.kafka.clients.producer.ProducerConfig.ignore(ProducerConfig.java:569)
at org.apache.kafka.common.config.AbstractConfig$RecordingMap.get(AbstractConfig.java:638)
at org.apache.kafka.common.network.ChannelBuilders.createPrincipalBuilder(ChannelBuilders.java:242)
at org.apache.kafka.common.network.PlaintextChannelBuilder$PlaintextAuthenticator.<init>(PlaintextChannelBuilder.java:96)
at org.apache.kafka.common.network.PlaintextChannelBuilder$PlaintextAuthenticator.<init>(PlaintextChannelBuilder.java:89)
at org.apache.kafka.common.network.PlaintextChannelBuilder.lambda$buildChannel$0(PlaintextChannelBuilder.java:66)
at org.apache.kafka.common.network.KafkaChannel.<init>(KafkaChannel.java:174)
at org.apache.kafka.common.network.KafkaChannel.<init>(KafkaChannel.java:164)
at org.apache.kafka.common.network.PlaintextChannelBuilder.buildChannel(PlaintextChannelBuilder.java:79)
at org.apache.kafka.common.network.PlaintextChannelBuilder.buildChannel(PlaintextChannelBuilder.java:67)
at org.apache.kafka.common.network.Selector.buildAndAttachKafkaChannel(Selector.java:356)
at org.apache.kafka.common.network.Selector.registerChannel(Selector.java:347)
at org.apache.kafka.common.network.Selector.connect(Selector.java:274)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1097)
at org.apache.kafka.clients.NetworkClient.access$700(NetworkClient.java:87)
at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1276)
at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1164)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:637)
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:327)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242)
```
Reviewers: Ismael Juma <ismael@juma.me.uk>
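The failure mode can be reproduced deterministically even in a single thread, since any structural modification during HashSet iteration trips the modCount check that backs AbstractSet.removeAll. This hedged sketch (an illustrative class, not Kafka code) contrasts the two set types:

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CmeDemo {
    // A synchronized Set guards individual calls, but iteration (which backs
    // AbstractSet.removeAll) is not atomic: a structural modification
    // mid-iteration throws ConcurrentModificationException.
    public static boolean synchronizedSetThrows() {
        Set<String> used = Collections.synchronizedSet(new HashSet<>(List.of("a", "b", "c")));
        try {
            for (String s : used) {
                used.remove("b"); // stands in for the Sender thread mutating `used`
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // A ConcurrentHashMap-backed key set has weakly consistent iterators: it
    // tolerates modification during iteration, which is why the fix switches to it.
    public static boolean concurrentSetThrows() {
        Set<String> used = ConcurrentHashMap.newKeySet();
        used.addAll(List.of("a", "b", "c"));
        try {
            for (String s : used) {
                used.remove("b");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```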
Changed from returning a Pattern directly to returning the pattern string, so it's easier to combine multiple pattern strings from different named topologies if necessary (exposed via sourceTopicsPatternString() below)
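The benefit of returning pattern strings rather than compiled Patterns can be sketched as follows: per-topology strings can be OR-ed together before a single compile. This is a minimal, hypothetical illustration, not the Streams API itself:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: combine several source-topic pattern strings (e.g. one
// per named topology) into a single compiled Pattern.
public class Patterns {
    public static Pattern combine(List<String> patternStrings) {
        // Parenthesize each alternative so precedence inside each string is preserved.
        String joined = patternStrings.stream()
            .map(p -> "(" + p + ")")
            .collect(Collectors.joining("|"));
        return Pattern.compile(joined);
    }
}
```

Compiled Pattern objects cannot be merged after the fact, which is why keeping the string form around makes combining topologies simpler.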
This is the main work of this PR -- basically each NamedTopology retains its own InternalTopologyBuilder, and the new TopologyMetadata class wraps them all and handles consolidating the metadata, verifying constraints (e.g. that input topics are unique), and so on. Most of the changes here are just refactoring the many things that directly depended on the InternalTopologyBuilder to instead go through this class, and then refactoring InternalTopologyBuilder to better fit the new framework.
Just want to call this out in case you come across one of these things and wonder if it will work with Named Topologies -- it won't 🙂
wcarlson5 left a comment:
I finished a first pass; I didn't have a ton to say about this. A lot of renaming.
```java
        return new Metrics(metricConfig, reporters, time, metricsContext);
    }

    private int getNumStreamThreads(final boolean hasGlobalTopology) {
```
Resolved review threads:
- streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopologyBuilder.java
- streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopologyBuilder.java (outdated)
- streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopologyBuilder.java
- streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopologyBuilder.java (outdated)
- streams/src/main/java/org/apache/kafka/streams/processor/internals/TopologyMetadata.java (outdated)
- ...apache/kafka/streams/processor/internals/namedtopology/KafkaStreamsNamedTopologyWrapper.java (outdated)
Force-pushed from 8d8d289 to 4c2e6b8, then from 4c2e6b8 to c508638.
…10764) I'm migrating Ranger's kafka plugin from the deprecated Authorizer (already removed by 976e78e) to the new API (see https://issues.apache.org/jira/browse/RANGER-3231). The kafka plugin needs to take something from the field resourcePattern, but it does not know whether the field is nullable (or users need to add a null check). I checked all usages and didn't observe any null case. Reviewers: Ismael Juma <ismael@juma.me.uk>
This upgrades the JDK to version 15 for docs generation, so we can circumvent bug https://bugs.openjdk.java.net/browse/JDK-8215291, which is present in JDK 11. Reviewers: Ismael Juma <ismael@juma.me.uk>
* Lay the groundwork for migrating KTable Processors to the new PAPI. * Migrate the KTableFilter processor to prove that the groundwork works. This is an effort to help break up apache#10507 into multiple PRs. Reviewers: Boyang Chen <boyang@apache.org>
The following error happens on my mac m1 when building the docker image for system tests:
```
Collecting pynacl
  Using cached PyNaCl-1.4.0.tar.gz (3.4 MB)
  Installing build dependencies ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /usr/local/lib/python3.8/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-k867aac0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel 'cffi>=1.4.1; python_implementation != '"'"'PyPy'"'"''
       cwd: None
  Complete output (14 lines):
  Traceback (most recent call last):
    File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.8/dist-packages/pip/__main__.py", line 23, in <module>
      from pip._internal.cli.main import main as _main  # isort:skip # noqa
    File "/usr/local/lib/python3.8/dist-packages/pip/_internal/cli/main.py", line 5, in <module>
      import locale
    File "/usr/lib/python3.8/locale.py", line 16, in <module>
      import re
    File "/usr/lib/python3.8/re.py", line 145, in <module>
      class RegexFlag(enum.IntFlag):
  AttributeError: module 'enum' has no attribute 'IntFlag'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 /usr/local/lib/python3.8/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-k867aac0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel 'cffi>=1.4.1; python_implementation != '"'"'PyPy'"'"'' Check the logs for full command output.
```
There was a related issue: pypa/pip#9689, and it is already fixed by pypa/pip#9689 (included by pip 21.1.1). I tested pip 21.1.1 and it works well on mac m1. Reviewers: Ismael Juma <ismael@juma.me.uk>
Reviewers: Lee Dongjin <dongjin@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>
Use a caching `BufferSupplier` per request handler thread so that decompression buffers are cached if supported by the underlying `CompressionType`. This achieves a similar outcome as apache#9220, but with less contention. We introduce a `RequestLocal` class to make it easier to introduce new request scoped stateful instances (one example we discussed previously was an `ActionQueue` that could be used to avoid some of the complex group coordinator locking). This is a small win for zstd (no synchronization or soft references) and a more significant win for lz4. In particular, it reduces allocations significantly when the number of partitions is high. The decompression buffer size is typically 64 KB, so a produce request with 1000 partitions results in 64 MB of allocations even if each produce batch is small (likely, when there are so many partitions). I did a quick producer perf local test with 5000 partitions, 1 KB record size, 1 broker, lz4 and ~0.5 for the producer compression rate metric: Before this change: > 20000000 records sent, 346314.349535 records/sec (330.27 MB/sec), 148.33 ms avg latency, 2267.00 ms max latency, 115 ms 50th, 383 ms 95th, 777 ms 99th, 1514 ms 99.9th. After this change: > 20000000 records sent, 431956.113259 records/sec (411.95 MB/sec), 117.79 ms avg latency, 1219.00 ms max latency, 99 ms 50th, 295 ms 95th, 440 ms 99th, 662 ms 99.9th. That's a 25% throughput improvement and p999 latency was reduced to under half (in this test). Default arguments will be removed in a subsequent PR. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
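A per-thread caching supplier like the one described can be sketched as follows. The class and method names are illustrative, not Kafka's actual `BufferSupplier` API; the key design point is that each request handler thread owns one instance, so no synchronization or soft references are needed:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical sketch: one instance per request handler thread, so no locking.
public class CachingBufferSupplier {
    private final ArrayDeque<ByteBuffer> cache = new ArrayDeque<>();

    // Reuse a cached buffer when it is large enough; otherwise allocate a new
    // one (a too-small cached buffer is simply dropped here).
    public ByteBuffer get(int capacity) {
        ByteBuffer cached = cache.pollFirst();
        if (cached != null && cached.capacity() >= capacity) {
            cached.clear();
            return cached;
        }
        return ByteBuffer.allocate(capacity);
    }

    // Return a buffer for reuse by this thread's next decompression.
    public void release(ByteBuffer buffer) {
        cache.addFirst(buffer);
    }
}
```

Reusing one ~64 KB decompression buffer per handler thread, instead of allocating one per partition per request, is what eliminates the bulk of the allocations described above.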
Fix compile error in scala tests. The compile error is:
```
[Error] /home/jenkins/jenkins-agent/workspace/Kafka_kafka-pr_PR-9229/core/src/test/scala/unit/kafka/server/metadata/BrokerMetadataListenerTest.scala:97: polymorphic expression cannot be instantiated to expected type;
[2021-05-29T02:34:50.308Z]  found   : [T]()T
[2021-05-29T02:34:50.308Z]  required: kafka.server.RequestLocal
```
This error happens only in Scala 2.12. Reviewers: Bruno Cadonna <cadonna@apache.org>
…support (apache#10783) Release notes: * Scala 2.12.14: https://github.com/scala/scala/releases/tag/v2.12.14 * Scala Logging: https://github.com/lightbend/scala-logging/releases/tag/v3.9.3 * Scala Collection Compat: * https://github.com/scala/scala-collection-compat/releases/tag/v2.3.1 * https://github.com/scala/scala-collection-compat/releases/tag/v2.3.2 * https://github.com/scala/scala-collection-compat/releases/tag/v2.4.0 * https://github.com/scala/scala-collection-compat/releases/tag/v2.4.1 * https://github.com/scala/scala-collection-compat/releases/tag/v2.4.2 * https://github.com/scala/scala-collection-compat/releases/tag/v2.4.3 * https://github.com/scala/scala-collection-compat/releases/tag/v2.4.4 * Scala Java8 Compat: * https://github.com/scala/scala-java8-compat/releases/tag/v1.0.0-RC1 * https://github.com/scala/scala-java8-compat/releases/tag/v1.0.0 Reviewers: Ismael Juma <ismael@juma.me.uk>
apache#10585) The command used by our private CI is ./gradlew cleanTest xxx:test. It does not re-run tests when we use unitTest and integrationTest in place of test. The root cause is that we don't register the test output of unitTest and integrationTest with the cleanTest task, so it does not delete the related test output. Reviewers: Ismael Juma <ismael@juma.me.uk>
…raft module (apache#10791) The command `./gradlew raft:integrationTest` can't run any integration test since `org.junit.jupiter.api.Tag` does not work for jqwik engine (see jqwik-team/jqwik#36 (comment)). Reviewers: Ismael Juma <ismael@juma.me.uk>
Fix examples under security.html so they use the right bash icon (`>` instead of `$`) and also use the right tool for showing code listings. Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>
…pache#10765) As specified in KIP-743, this PR removes the built-in metrics in Streams that are superseded by the refactoring proposed in KIP-444. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Luke Chen <showuon@gmail.com>
The broker shouldn't assume create access to the chroot. There are deployment scenarios where the chroot is already created and is the only znode which the broker can access. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
Also: * Remove `ZookeeperTopicService` * Remove `TopicCommandWithZKClientTest` * Fix a topic create validation bug * Adjust existing tests Reviewers: Ismael Juma <ismael@juma.me.uk>
This patch adds `Admin` support for the `listTransactions` API, which was added by [KIP-664](https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions). Similar to `listConsumerGroups`, the new `listTransactions` API is intended to be sent to all brokers. Reviewers: David Jacot <djacot@confluent.io>
…nsumerInterceptor interface (apache#10801) Co-authored-by: KahnCheny <kahn.cheny@gmail.com> Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
Async profiler 2.0 outputs html5 flame graph files and supports simultaneous collection of cpu, allocation and lock profiles in jfr format. Updated the readme to include an example of the latter and verified that the Readme commands work with async profiler 2.0. Release notes: * 1.28: https://mail.openjdk.java.net/pipermail/jmh-dev/2021-March/003171.html * 1.29: https://mail.openjdk.java.net/pipermail/jmh-dev/2021-March/003218.html * 1.30: https://mail.openjdk.java.net/pipermail/jmh-dev/2021-May/003237.html * 1.31: https://mail.openjdk.java.net/pipermail/jmh-dev/2021-May/003286.html * 1.32: https://mail.openjdk.java.net/pipermail/jmh-dev/2021-May/003307.html Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>, Luke Chen <showuon@gmail.com>
…moved option (apache#10806) Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <ismael@juma.me.uk> Co-authored-by: shenwenbing <shenwenbing@qianxin.com>
…fig (apache#10797) The trogdor ConsumeBenchWorker allows several consumption tasks to be run in parallel, the number is configurable using the threadsPerWorker config. If one of the consumption tasks completes executing successfully due to maxMessages being consumed, then, the consumption task prematurely notifies the doneFuture causing the entire ConsumeBenchWorker to halt. This becomes a problem when more than 1 consumption task is running in parallel, because the successful completion of 1 of the tasks shuts down the entire worker while the other tasks are still running. When the worker is shut down, it kills all the active consumption tasks, though they have not consumed maxMessages yet. This commit defers notification of the doneFuture to the CloseStatusUpdater thread, which is already responsible for tracking the status of the tasks and updating their status when all of the tasks complete. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
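The deferred-completion pattern described above can be sketched with CompletableFuture; the class name is illustrative, not trogdor's actual code. Instead of each consumption task completing the shared done future on its own success, a single watcher completes it only when every task has finished:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class WorkerCompletion {
    // Completes the returned "done" future only once ALL task futures finish,
    // rather than letting the first task to reach maxMessages stop the worker.
    public static CompletableFuture<Void> watchAll(List<CompletableFuture<Void>> tasks) {
        return CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0]));
    }
}
```

With this shape, one task finishing early no longer tears down the sibling tasks that are still consuming.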
…che#10808) They were both deprecated in Apache Kafka 2.4 and it's a straightforward change to use the non deprecated variants. Reviewers: David Jacot <djacot@confluent.io>
…ormance (apache#10552) I did a code refactor/optimization in this PR, keeping the same algorithm. Originally, with this setting: topicCount = 50; partitionCount = 800; consumerCount = 800; it completed in 10 seconds; after my code refactor, the time is down to 100~200 ms. With the 1 million partitions setting: topicCount = 500; partitionCount = 2000; consumerCount = 2000; no OutOfMemory will be thrown anymore, and it takes 4~5 seconds. Reviewers: Vahid Hashemian <vahid.hashemian@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
…pache#10787) Since KafkaEventQueue is a generic data structure not specific to metadata, move it into the server-common module. Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Matthias J. Sax <matthias@confluent.io>
…#10664) Refactored logConfig to be passed appropriately when using shutDownWhenFull or emitEarlyWhenFull. Removed the constructor that doesn't accept a logConfig parameter so you're forced to specify it explicitly, whether it's empty/unspecified or not. Co-authored-by: Bruno Cadonna <cadonna@apache.org> Reviewers: Walker Carlson <wcarlson@confluent.io>, Bruno Cadonna <cadonna@apache.org>
This is part 2 of KIP-730. Part 1 was in apache#10504. This PR adds QuorumController support for handling AllocateProducerIDs requests and managing the state of the latest producer ID block in the controller by committing this state to the metadata log. Reviewers: Colin P. McCabe <cmccabe@apache.org>
Style fixes to KafkaRaftClient Reviewers: Luke Chen <showuon@gmail.com>
…nsCommand (apache#10471) Also remove zookeeper dependent methods and tests. Reviewers: Ismael Juma <ismael@juma.me.uk>
1. When registering state stores, add the store to globalStateStores before calling any blocking calls that may throw errors, so that upon closing we would close those stores as well. 2. Remove the other list as a local field, and call topology.globalStateStores when needed to get the list. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Bruno Cadonna <cadonna@apache.org>
Reviewers: Jun Rao <junrao@gmail.com>
Having the `testChrootExistsAndRootIsLocked` test in a separate `ZookeeperTestHarness` isn't enough to prevent the ACL changes to the ZK root from affecting other integration tests. So instead, let's use a dedicated znode for this test. It still works because `makeSurePersistentPathExists` uses `createRecursive`, which will recurse and act the same for the root or a given znode. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
…he#10817) Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
…er (apache#9441) Co-authored-by: Jason Gustafson <jason@confluent.io> Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
`MetadataParser` is a duplication of `MetadataRecordSerde` and it's not used in any code, so we can remove it. It did, however, have some useful validations which have been moved into `MetadataRecordSerde`. Reviewers: Jason Gustafson <jason@confluent.io>
…#10823) apache#10494 introduced a bug in the KRaft controller where the controller will loop forever in StripedReplicaPlacer trying to identify the racks on which to place partition replicas if there is a single unfenced broker in the cluster and the number of requested partitions in a CREATE_TOPICS request is greater than 1. This patch refactors out some argument sanity checks and invokes those checks in both RackList and StripedReplicaPlacer, and it adds tests for this as well as the single broker placement issue. Reviewers: Jun Rao <junrao@gmail.com>
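The kind of argument sanity check described can be sketched as below; the class, method, and message text are hypothetical, not KRaft's actual API. Failing fast when the placement is unsatisfiable is what prevents the placer from looping forever searching for racks that can never satisfy the request:

```java
// Hypothetical sketch of an up-front placement argument check.
public class PlacementChecks {
    public static void checkPlacementArgs(int replicationFactor, int numUnfencedBrokers) {
        if (replicationFactor <= 0) {
            throw new IllegalArgumentException("Replication factor must be positive.");
        }
        if (numUnfencedBrokers < replicationFactor) {
            // e.g. 1 unfenced broker cannot host 3 replicas; reject immediately
            // instead of iterating forever over candidate racks.
            throw new IllegalArgumentException("Not enough unfenced brokers ("
                + numUnfencedBrokers + ") for replication factor " + replicationFactor + ".");
        }
    }
}
```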
…y structure (apache#10609) This PR includes adding the NamedTopology to the Subscription/AssignmentInfo, and to the StateDirectory so it can place NamedTopology tasks within the hierarchical structure with task directories under the NamedTopology parent dir. Reviewers: Walker Carlson <wcarlson@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Closing this since Pt. 1 is now merged; please move discussion to apache#10683
Same as apache#10683 but targeted against apache#10609 to facilitate review and filter out everything except for the changes in this PR alone