Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size #12486

Merged
merged 2 commits into from
Aug 9, 2022

Conversation

cmccabe
Copy link
Contributor

@cmccabe cmccabe commented Aug 5, 2022

BrokerMetadataSnapshotter should split up record lists that exceed the batch size.

BrokerMetadataSnapshotter should split up record lists that exceed the batch size.
Copy link
Member

@mumrah mumrah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cmccabe, just one main question -- is it okay to break up an atomic batch into two Lists when writing to a snapshot? Do we ever rely on reading an atomic batch of messages as one List when reading snapshots? E.g., what happens if we're dealing with a topic with 2000 partitions.

* you also change the equivalent constant in the Raft layer, and have considered the
* compatibility issues.
*/
private val maxBatchSize = 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reference a constant for this special value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I added some more documentation here

trait SnapshotWriterBuilder {
def build(committedOffset: Long,
committedEpoch: Int,
lastContainedLogTime: Long): Option[SnapshotWriter[ApiMessageAndVersion]]
}

class RecordListConsumer(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a quick javadoc of what this class does? Also, maybe a more descriptive name (BatchingRecordListConsumer?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@cmccabe
Copy link
Contributor Author

cmccabe commented Aug 5, 2022

Thanks @cmccabe, just one main question -- is it okay to break up an atomic batch into two Lists when writing to a snapshot? Do we ever rely on reading an atomic batch of messages as one List when reading snapshots? E.g., what happens if we're dealing with a topic with 2000 partitions.

No, we do not rely on this in the snapshot. The decision of what batches to use in the snapshot doesn't change semantics.

In the log (as opposed to snapshot) we care about batching because batches are atomically committed, so if the controller fails midway into writing a multibatch operation we have to think about what happens next (probably a partial commit of whatever was going on)

Copy link
Member

@mumrah mumrah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! There was some issue with the last build, so I re-started. LGTM pending a good build.

@cmccabe cmccabe merged commit e67711a into apache:trunk Aug 9, 2022
cmccabe added a commit that referenced this pull request Aug 9, 2022
)

BrokerMetadataSnapshotter should split up record lists that exceed the batch size.

Reviewers: David Arthur <mumrah@gmail.com>
@cmccabe cmccabe deleted the faults-III branch August 9, 2022 20:27
ijuma added a commit to confluentinc/kafka that referenced this pull request Aug 10, 2022
…(10 August 2022)

Trivial conflict in gradle/dependencies.gradle due to the newer Netty
version in confluentinc/kafka.

* apache-github/trunk:
MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies
(apache#12495)
KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is
not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size
(apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing
partition epoch (apache#12489)
KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest
(apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
KAFKA-14104; Add CRC validation when iterating over Metadata Log
Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
ijuma added a commit to franz1981/kafka that referenced this pull request Aug 12, 2022
* apache-github/trunk: (447 commits)
  KAFKA-13959: Controller should unfence Broker with busy metadata log (apache#12274)
  KAFKA-10199: Expose read only task from state updater (apache#12497)
  KAFKA-14154; Return NOT_CONTROLLER from AlterPartition if leader is ahead of controller (apache#12506)
  KAFKA-13986; Brokers should include node.id in fetches to metadata quorum (apache#12498)
  KAFKA-14163; Retry compilation after zinc compile cache error (apache#12507)
  Remove duplicate common.message.* from clients:test jar file (apache#12407)
  KAFKA-13060: Replace EasyMock and PowerMock with Mockito in WorkerGroupMemberTest.java (apache#12484)
  Fix the rate window size calculation for edge cases (apache#12184)
  MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495)
  KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
  MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
  KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489)
  KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
  KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants