Skip to content

MINOR:Avoid slow Set.removeAll(List) in MirrorSourceConnector#13992

Merged
gharris1727 merged 2 commits intoapache:trunkfrom
hudeqi:fix_list_to_set
Jul 12, 2023
Merged

MINOR:Avoid slow Set.removeAll(List) in MirrorSourceConnector#13992
gharris1727 merged 2 commits intoapache:trunkfrom
hudeqi:fix_list_to_set

Conversation

@hudeqi
Copy link
Contributor

@hudeqi hudeqi commented Jul 11, 2023

similar to #13946. The refreshTopicPartitions method is called periodically. When the number of partitions involved in replication is large, performance problems may occur.

@hudeqi
Copy link
Contributor Author

hudeqi commented Jul 11, 2023

Please help to review this minor pr, thanks! @gharris1727

@gharris1727
Copy link
Contributor

Thanks for the fix @hudeqi ! I was wondering if you had plans to address this problem/warning elsewhere in the codebase?

@hudeqi
Copy link
Contributor Author

hudeqi commented Jul 12, 2023

Thanks for the fix @hudeqi ! I was wondering if you had plans to address this problem/warning elsewhere in the codebase?

I did a global search for the removeAll method of set in the codebase. There are indeed many set.removeAll(list), but this usage may have performance problems only when there are too many collection elements. At least in the connect module there is nothing extra except that mentioned in this PR.
In other modules, there may be this one that needs to be optimized to meet these conditions, which may not be suitable for mentioning in this PR (belonging to the MM2 tag). @gharris1727

@hudeqi hudeqi requested a review from gharris1727 July 12, 2023 03:21
Copy link
Contributor

@gharris1727 gharris1727 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @hudeqi for addressing this in both of the Mirror connectors!

@gharris1727
Copy link
Contributor

The flaky CI failures look unrelated and the mirror unit tests pass locally.

@gharris1727 gharris1727 merged commit 8d24716 into apache:trunk Jul 12, 2023
@hudeqi hudeqi deleted the fix_list_to_set branch July 12, 2023 23:54
Cerchie pushed a commit to Cerchie/kafka that referenced this pull request Jul 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants