Skip to content

Comments

KAFKA-10545: Create topic IDs and propagate to brokers#9626

Merged
rajinisivaram merged 16 commits intoapache:trunkfrom
jolshan:KIP516LeaderAndIsr
Dec 18, 2020
Merged

KAFKA-10545: Create topic IDs and propagate to brokers#9626
rajinisivaram merged 16 commits intoapache:trunkfrom
jolshan:KIP516LeaderAndIsr

Conversation

@jolshan
Copy link
Member

@jolshan jolshan commented Nov 19, 2020

This change takes the topic IDs created in #9473 and propagates them to brokers using LeaderAndIsr Request. It also removes the topic name from the LeaderAndIsr Response, reorganizes the response to be sorted by topic, and includes the topic ID.

In addition, the topic ID is persisted to each replica in Log as well as in a file on disk. This file is read on startup and if the topic ID exists, it will be reloaded.

This PR bumps the IBP and is expected to be merged at the same time as #9622 as to not bump the protocol twice

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Copy link
Member

@dengziming dengziming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ApiVersionTest need to be changed because you changed ApiVersion, and a few suggesions.

@dengziming
Copy link
Member

KAFKA-10729 add KAFKA_2_8_IV0, so you should bump the version again.

Copy link
Contributor

@rite2nikhil rite2nikhil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the pr, left a few commrnts

Copy link
Member

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have left some comments. I have only looked at the request/response and the controller so far. I will look at the remaining parts tomorrow.

offlineReplicas += tp
else if (partition.errorCode == Errors.NONE.code)
onlineReplicas += tp
if (leaderAndIsrResponse.topics().isEmpty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be better to also have an explicit handling of the version here. Alternatively, we could push this into the LeaderAndIsrResponse and provides a method Map<TopicPartition, ...> partitions() which handles the version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this may not work as we don't have the topic name in the latest version...

Copy link
Member Author

@jolshan jolshan Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. One option I thought of would be to do something like partitions(Map<id, string> topicNames) and handle it inside the response with the version. That might be a little cleaner but I'm not sure.

Copy link
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan Thanks for the PR, looks good. Left some minor comments.

}

leaderAndIsrResponse.topics.forEach { topic =>
val topicName = controllerContext.topicNames.get(topic.topicId).get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do controllerContext.topicNames.get(topic.topicId).foreach instead of get to avoid throwing exception if topic is not in the context?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this logic to LeaderAndIsrResponse which is java code, so it behaves a bit differently. (We may want to move back though, and if so, I'll do this.) For now I do a null check in the java code.

One question I have though is what does it mean to not have a topic in the context? Is it possible to send a leaderAndIsrRequest with an unknown partition/one not in the context? I can only think maybe the topic gets deleted while handling the request. What would the normal error handling case be in that scenario?

log.partitionMetadataFile.get.write(topicIds.get(topicPartition.topic))
log.topicId = topicIds.get(topicPartition.topic)
} else {
stateChangeLogger.warn("Partition metadata file already contains content.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a warning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we reach here, we are in an unexpected state. The partitionMetadata file should have loaded in the topic ID if the file already contained content. I left as a warn rather than an exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looking at the conditional statements here, it looks like we would write the file the first time we get here because log.partitionMetadataFile.get.isEmpty() and the second time we would print a warning even if the id in the file matches the expected id. Unless I missed something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I think I cleaned up this block and deleted something. There should be a check if log.topicId.equals(id). If so, then the file exists and we shouldn't go in to the block that says "// There is not yet a topic ID stored in the log."

I should also fix the topicIds.get(topicPartition.topic) above and replace with id.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might think about making this code cleaner in general to avoid so many nested if statements

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I thought through these cases some more and realized that the metadata file will fail to open if formatted incorrectly. So the only case where there could be data written to the file is if the ID is the zero UUID. So I decided to just fail on reading the file if the zero ID is provided. (We will never write zero ID to file.) The rest of this cleaned up pretty nicely.

Copy link
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan Thanks for the updates, looks good, just one comment/question left. The changes to process errors in LeaderAndIsrResponse.java seem fine. As you said, the only case where topic is not in the controller context would be where a topic was deleted. Don't think we need any other special handling for that. The PR needs rebasing to resolve conflicts btw.

log.partitionMetadataFile.get.write(topicIds.get(topicPartition.topic))
log.topicId = topicIds.get(topicPartition.topic)
} else {
stateChangeLogger.warn("Partition metadata file already contains content.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looking at the conditional statements here, it looks like we would write the file the first time we get here because log.partitionMetadataFile.get.isEmpty() and the second time we would print a warning even if the id in the file matches the expected id. Unless I missed something.

@jolshan
Copy link
Member Author

jolshan commented Dec 16, 2020

@rajinisivaram Yup, will look at the rebase next.

Copy link
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan Thanks for the updates. Left a minor comment, apart from that LGTM. PR #9622 is also ready to merge, so we can merge these two together later today.

@rajinisivaram
Copy link
Contributor

@jolshan Thanks for the PR, the last build was good, so merging to trunk

@rajinisivaram rajinisivaram merged commit 1dd1e7f into apache:trunk Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants