KAFKA-17498: Reduce the number of remote calls when serving LIST_OFFSETS request #17132

Merged: 4 commits, Sep 30, 2024

kamalcph
Contributor

@kamalcph kamalcph commented Sep 9, 2024

If the index to be fetched exists locally, then avoid fetching the remote indexes to serve the LIST_OFFSETS request.
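
To illustrate the idea, here is a minimal sketch of a local-first decision. It is not the code from this PR: the class `LocalFirstLookupSketch`, the `Source` enum, and the `chooseSource` method are hypothetical names, and the exact condition (compare the remote segment's start offset with the base offset of the first local segment) is an assumption about how "exists locally" could be checked.

```java
import java.util.ArrayList;
import java.util.List;

public class LocalFirstLookupSketch {
    /** Hypothetical marker for where the lookup should read the index from. */
    enum Source { LOCAL, REMOTE }

    /**
     * Decide whether a segment starting at remoteSegmentStartOffset can be
     * served from local storage. localBaseOffsets is assumed to hold the base
     * offsets of the local segments in ascending order.
     */
    static Source chooseSource(long remoteSegmentStartOffset, List<Long> localBaseOffsets) {
        // Copy the list once so the emptiness check and the read of the first
        // element see the same view, even if the caller's list changes later.
        List<Long> snapshot = new ArrayList<>(localBaseOffsets);
        if (snapshot.isEmpty() || remoteSegmentStartOffset < snapshot.get(0)) {
            // Nothing local covers this offset: the remote indexes are needed.
            return Source.REMOTE;
        }
        // The offset range is still present locally: skip the remote call.
        return Source.LOCAL;
    }

    public static void main(String[] args) {
        List<Long> local = List.of(100L, 200L);
        System.out.println(chooseSource(0L, local));
        System.out.println(chooseSource(100L, local));
        System.out.println(chooseSource(100L, List.of()));
    }
}
```

In this sketch only the first call and the empty-list call fall back to remote storage; a segment that still overlaps the local log is answered without a remote fetch.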

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@kamalcph kamalcph added the tiered-storage (Related to the Tiered Storage feature) label Sep 9, 2024
Member

@satishd satishd left a comment

Thanks @kamalcph for the PR. Please resolve the conflicts.

@kamalcph kamalcph force-pushed the KAFKA-17498 branch 2 times, most recently from 57e9bd8 to 2daef27 on September 19, 2024 13:54
Contributor

@clolov clolov left a comment

Thanks for this and apologies for the delayed review!

Contributor

@showuon showuon left a comment

Thanks for the PR. Overall LGTM. Please also update the javadoc of findOffsetByTimestamp. Thanks.

Contributor

@showuon showuon left a comment

LGTM! Just a minor comment. Thanks for the patch.

@@ -698,14 +700,24 @@ public Optional<FileRecords.TimestampAndOffset> findOffsetByTimestamp(TopicParti
        && rlsMetadata.endOffset() >= startingOffset
        && isRemoteSegmentWithinLeaderEpochs(rlsMetadata, unifiedLog.logEndOffset(), epochWithOffsets)
        && rlsMetadata.state().equals(RemoteLogSegmentState.COPY_SEGMENT_FINISHED)) {
    return lookupTimestamp(rlsMetadata, timestamp, startingOffset);
    // cache to avoid race conditions
    List<LogSegment> segmentsCopy = new ArrayList<>(unifiedLog.logSegments());
Member

Raised an edge case here where we may fail to return the right segment while searching for the target timestamp. This can happen when the search moves from the remote segments to the local segments, but a local segment is copied to remote storage and deleted from the local log before it is searched.

Thanks @kamalcph for raising KAFKA-17637, as we discussed offline. This can be taken up in a follow-up PR.
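
For readers following along, the interleaving described in this comment can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: `TieringRaceSketch`, `Segment`, `findFirst`, and `searchWithInterleavedTiering` are hypothetical names, the offsets and timestamps are made up, and none of it is Kafka code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class TieringRaceSketch {
    /** Hypothetical segment model: base offset plus the largest timestamp it holds. */
    static final class Segment {
        final long baseOffset;
        final long maxTimestampMs;

        Segment(long baseOffset, long maxTimestampMs) {
            this.baseOffset = baseOffset;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /** Base offset of the first segment whose max timestamp is >= the target, if any. */
    static Optional<Long> findFirst(List<Segment> segments, long targetTimestampMs) {
        for (Segment s : segments) {
            if (s.maxTimestampMs >= targetTimestampMs) {
                return Optional.of(s.baseOffset);
            }
        }
        return Optional.empty();
    }

    /** Replays the problematic ordering and returns the segment the search lands on. */
    static Optional<Long> searchWithInterleavedTiering(long targetTimestampMs) {
        List<Segment> remote = new ArrayList<>(List.of(new Segment(0L, 1_000L)));
        List<Segment> local = new ArrayList<>(
                List.of(new Segment(100L, 2_000L), new Segment(200L, 3_000L)));

        // Step 1: the search scans a snapshot of the remote segments.
        Optional<Long> fromRemote = findFirst(new ArrayList<>(remote), targetTimestampMs);
        if (fromRemote.isPresent()) {
            return fromRemote;
        }

        // Step 2: before the local search starts, the oldest local segment is
        // copied to remote storage and deleted from the local log.
        Segment moved = local.remove(0);
        remote.add(moved);

        // Step 3: the local search no longer sees the moved segment.
        return findFirst(local, targetTimestampMs);
    }

    public static void main(String[] args) {
        System.out.println(searchWithInterleavedTiering(1_500L));
    }
}
```

With a target timestamp of 1500 the segment at base offset 100 holds the answer, but in this ordering the search lands on the segment at base offset 200, because the matching segment was in neither view at the moment it was examined.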

@satishd satishd merged commit 4036081 into apache:trunk Sep 30, 2024
9 checks passed
@kamalcph kamalcph deleted the KAFKA-17498 branch October 3, 2024 16:32
Labels
core (Kafka Broker), tiered-storage (Related to the Tiered Storage feature)
4 participants