Skip to content

KAFKA-14557; Lock metadata log dir#13058

Merged
jsancio merged 4 commits intoapache:trunkfrom
jsancio:kafka-14557-metadata-lock
Jan 10, 2023
Merged

KAFKA-14557; Lock metadata log dir#13058
jsancio merged 4 commits intoapache:trunkfrom
jsancio:kafka-14557-metadata-lock

Conversation

@jsancio
Copy link
Member

@jsancio jsancio commented Dec 30, 2022

This change makes sure that Kafka grabs a log dir lock in the following additional cases:

  1. When a Kafka node runs in controller only. The current implementation doesn't grab a file lock because the LogManager is never instantiated.
  2. When the metadata log dir is different from the log dir(s). The current implementation of LogManager doesn't load or grab a lock on the metadata dir.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@jsancio jsancio marked this pull request as ready for review December 30, 2022 00:56
Copy link
Member

@dengziming dengziming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left 2 comments.

val raftManager = createRaftManager(
new TopicPartition("__raft_id_test", 0),
createConfig(
Set(BrokerRole),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are testing controller error so should this be ControllerRole.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to test having different metadata.log.dir and log.dirs with the broker. The matrix in testLogDirLockWhenControllerOnly already tests having different metadata.log.dir and log.dirs for the controller.

netChannel.close()
replicatedLog.close()

dataDirLock.foreach(_.destroy())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In LogManager we do this in a finally block, should we do it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll add CoreUtils.swallow to all of this calls.

@ijuma ijuma requested a review from hachikuji January 3, 2023 15:08
@jsancio
Copy link
Member Author

jsancio commented Jan 4, 2023

I should mention that the long term solution would be to extend the LogManager to support the kraft metadata log but there are issues like https://issues.apache.org/jira/browse/KAFKA-14241 that need to be fixed to make that possible.

@jsancio jsancio force-pushed the kafka-14557-metadata-lock branch from 5a0630d to 9a36355 Compare January 4, 2023 22:29
Copy link
Contributor

@rondagostino rondagostino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a few comments.

props.setProperty(KafkaConfig.ListenersProp, "PLAINTEXT://localhost:9092,SSL://localhost:9093")
props.setProperty(KafkaConfig.QuorumVotersProp, s"${nodeId}@localhost:9093")
} else { // broker-only
val voterId = (nodeId.toInt + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/(nodeId.toInt + 1)/nodeId + 1/

def testNodeIdPresentIfColocated(): Unit = {
val raftManager = instantiateRaftManagerWithConfigs(new TopicPartition("__raft_id_test", 0), "controller,broker", "1")
assertEquals(1, raftManager.client.nodeId.getAsInt)
def testLogDirLockWhenMetadataDir(): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better name might be testLogDirLockWhenBrokerOnlyWithSeparateMetadataDir

Comment on lines 130 to 142
@ValueSource(strings = Array("metadata", "log", "metadata,log"))
def testLogDirLockWhenControllerOnly(dirType: String): Unit = {
val logDir = if (dirType.contains("metadata")) {
Some(TestUtils.tempDir().toPath)
} else {
None
}

val metadataDir = if (dirType.contains("log")) {
Some(TestUtils.tempDir().toPath)
} else {
None
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be clearer.

  @ValueSource(strings = Array("metadata-only", "log-only", "both"))
  def testLogDirLockWhenControllerOnly(dirType: String): Unit = {
    val logDir = if (!dirType.equals("metadata-only")) {
      Some(TestUtils.tempDir().toPath)
    } else {
      None
    }

    val metadataDir = if (!dirType.equals("log-only")) {
      Some(TestUtils.tempDir().toPath)
    } else {
      None
    }

Copy link
Contributor

@rondagostino rondagostino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one minor comment.

val metadataDir = if (dirType.contains("log")) {
Some(TestUtils.tempDir().toPath)
} else {
val metadataDir = if (dirType.contains("log-only")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/contains/equals/

@jsancio jsancio merged commit 2fc1875 into apache:trunk Jan 10, 2023
@jsancio jsancio deleted the kafka-14557-metadata-lock branch January 10, 2023 18:18
jsancio added a commit that referenced this pull request Jan 10, 2023
This change makes sure that Kafka grabs a log dir lock in the following additional cases:

1. When a Kafka node runs in controller only. The current implementation doesn't grab a file lock because the LogManager is never instantiated.
2. When the metadata log dir is different from the log dir(s). The current implementation of LogManager doesn't load or grab a lock on the metadata dir.

Reviewers: Ron Dagostino <rdagostino@confluent.io> , dengziming <dengziming1993@gmail.com>
jsancio added a commit that referenced this pull request Jan 10, 2023
This change makes sure that Kafka grabs a log dir lock in the following additional cases:

1. When a Kafka node runs in controller only. The current implementation doesn't grab a file lock because the LogManager is never instantiated.
2. When the metadata log dir is different from the log dir(s). The current implementation of LogManager doesn't load or grab a lock on the metadata dir.

Reviewers: Ron Dagostino <rdagostino@confluent.io> , dengziming <dengziming1993@gmail.com>
ijuma added a commit to fvaleri/kafka that referenced this pull request Jan 13, 2023
* apache-github/trunk:
  KAFKA-14601: Improve exception handling in KafkaEventQueue apache#13089
  KAFKA-14367; Add `OffsetCommit` to the new `GroupCoordinator` interface (apache#12886)
  KAFKA-14530: Check state updater more often (apache#13017)
  KAFKA-14304 Use boolean for ZK migrating brokers in RPC/record (apache#13103)
  KAFKA-14003 Kafka Streams JUnit4 to JUnit5 migration part 2 (apache#12301)
  KAFKA-14607: Move Scheduler/KafkaScheduler to server-common (apache#13092)
  KAFKA-14367; Add `OffsetFetch` to the new `GroupCoordinator` interface (apache#12870)
  KAFKA-14557; Lock metadata log dir (apache#13058)
  MINOR: Implement toString method for TopicAssignment and PartitionAssignment (apache#13101)
  KAFKA-12558: Do not prematurely mutate internal partition state in Mirror Maker 2 (apache#11818)
  KAFKA-14540: Fix DataOutputStreamWritable#writeByteBuffer (apache#13032)
  KAFKA-14600: Reduce flakiness in ProducerIdExpirationTest (apache#13087)
  KAFKA-14279: Add 3.3.x streams system tests (apache#13077)
  MINOR: bump streams quickstart pom versions and add to list in gradle.properties (apache#13064)
  MINOR: Update KRaft cluster upgrade documentation for 3.4 (apache#13063)
  KAFKA-14493: Introduce Zk to KRaft migration state machine STUBs in KRaft controller. (apache#12998)
  KAFKA-14570: Fix parenthesis in verifyFullFetchResponsePartitions output (apache#13072)
  MINOR: Remove public mutable fields from ProducerAppendInfo (apache#13091)
ijuma added a commit to confluentinc/kafka that referenced this pull request Jan 17, 2023
…master

* apache-github/trunk: (23 commits)
  MINOR: Include the inner exception stack trace when re-throwing an exception (apache#12229)
  MINOR: Fix docs to state that sendfile implemented in `TransferableRecords` instead of `MessageSet` (apache#13109)
  Update ProducerConfig.java (apache#13115)
  KAFKA-14618; Fix off by one error in snapshot id (apache#13108)
  KAFKA-13709 (follow-up): Avoid mention of 'exactly-once delivery' or 'delivery guarantees' in Connect (apache#13106)
  KAFKA-14367; Add `TxnOffsetCommit` to the new `GroupCoordinator` interface (apache#12901)
  KAFKA-14568: Move FetchDataInfo and related to storage module (apache#13085)
  KAFKA-14612: Make sure to write a new topics ConfigRecords to metadata log iff the topic is created (apache#13104)
  KAFKA-14601: Improve exception handling in KafkaEventQueue apache#13089
  KAFKA-14367; Add `OffsetCommit` to the new `GroupCoordinator` interface (apache#12886)
  KAFKA-14530: Check state updater more often (apache#13017)
  KAFKA-14304 Use boolean for ZK migrating brokers in RPC/record (apache#13103)
  KAFKA-14003 Kafka Streams JUnit4 to JUnit5 migration part 2 (apache#12301)
  KAFKA-14607: Move Scheduler/KafkaScheduler to server-common (apache#13092)
  KAFKA-14367; Add `OffsetFetch` to the new `GroupCoordinator` interface (apache#12870)
  KAFKA-14557; Lock metadata log dir (apache#13058)
  MINOR: Implement toString method for TopicAssignment and PartitionAssignment (apache#13101)
  KAFKA-12558: Do not prematurely mutate internal partition state in Mirror Maker 2 (apache#11818)
  KAFKA-14540: Fix DataOutputStreamWritable#writeByteBuffer (apache#13032)
  KAFKA-14600: Reduce flakiness in ProducerIdExpirationTest (apache#13087)
  ...
guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023
This change makes sure that Kafka grabs a log dir lock in the following additional cases:

1. When a Kafka node runs in controller only. The current implementation doesn't grab a file lock because the LogManager is never instantiated.
2. When the metadata log dir is different from the log dir(s). The current implementation of LogManager doesn't load or grab a lock on the metadata dir.

Reviewers: Ron Dagostino <rdagostino@confluent.io> , dengziming <dengziming1993@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants