Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use futures to parallelize calls to _send_request_to_node() #1807

Merged
merged 4 commits into from
May 21, 2019

Conversation

davidheitman
Copy link
Contributor

@davidheitman davidheitman commented May 18, 2019

Changes made as per the discussion in #1801
Uses futures to parallelize broker calls in the admin client.


This change is Reviewable

Copy link
Collaborator

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!

I'll be honest, I'd been avoiding doing this as I thought it was a much more complex change, but you did this in a very simple and clean manner, nice job.

A couple of minor cleanups and then this is ready to go.

kafka/admin/client.py Outdated Show resolved Hide resolved
kafka/admin/client.py Outdated Show resolved Hide resolved
kafka/admin/client.py Show resolved Hide resolved
if group_coordinator_id is not None:
this_groups_coordinator_id = group_coordinator_id
else:
this_groups_coordinator_id = self._find_group_coordinator_id(group_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you moved this under the version check? This seems generic code that's currently needed regardless of the version.

Side note: _find_group_coordinator_id() is a blocking call so this could be further optimized into firing off a bunch of futures for all the groups and then handling the responses but that would require breaking _find_group_coordinator_id() into two methods to generate request/parse the response... that optimization is probably best handled in different PR. Especially because that wouldn't even be used when fetching offsets for all consumer groups since the group coordinator is explicitly passed in already... example: https://github.com/DataDog/integrations-core/pull/2730/files#diff-deed983f73a04522ac621d7f6d0c8403R244

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figured it could be a minor optimization. We don't actually use this_groups_coordinator_id unless version <= 1 evaluates true, so why initialize it or call _find_group_coordinator_id() if we don't have to?

I'll peek at _find_group_coordinator_id() when I have a spare minute and see if I can optimize it. Agreed, something for another PR, but shouldn't be too bad.

@davidheitman
Copy link
Contributor Author

Thanks for the feedback and tips, I'll update the MR this evening and report back here.

@davidheitman
Copy link
Contributor Author

davidheitman commented May 21, 2019

OK, I had a quick moment, so I made the changes. The one thing I left for the moment is the code for this_groups_coordinator_id. Since it's only currently used in the one conditional, I figured for the moment it's more optimized to move it under that conditional. I can move it back if you want, just figured we could debate the merits a bit.

I'm assuming you want it back out in the event a future version conditional is added which will also need to reference this_groups_coordinator_id?

@jeffwidman
Copy link
Collaborator

The one thing I left for the moment is the code for this_groups_coordinator_id. Since it's only currently used in the one conditional, I figured for the moment it's more optimized to move it under that conditional.

Gotcha. From this section of code, that reasoning makes sense.

However, under the covers https://github.com/dpkp/kafka-python/pull/1807/files#diff-c893e6ff8a43607853864c15224c1802R627 will only blow up if we're talking to such a new broker that it drops support for the DescribeGroups_v1 request, which I think extremely unlikely to happen for years. And in that case, we should be fixing by adding support for DescribeGroups_v2 to support the new request style.

The other case is if we are communicating with an ancient broker that doesn't support the DescribeGroups request at all. However the KafkaAdminClient only guarantees support for clusters > 0.10. Additionally DescribeGroups has been around long enough that other code will blow up if we're talking to such an old cluster.

So while you're totally right that it's a nifty little micro-optimization (and I'd overlooked that earlier), in the broader picture the optimization is actually extremely unlikely to ever be hit.

I can move it back if you want, just figured we could debate the merits a bit.

Absolutely, happy to discuss these things anytime.

I'm assuming you want it back out in the event a future version conditional is added which will also need to reference this_groups_coordinator_id?

I think that quite likely. So having it DRY'd up is more likely to be of benefit than the optimization of putting it under the conditional.

At the end of the day, this one isn't a big deal to me--up to you. I am happy to merge either way, I just think it's cleaner how it already is.

Copy link
Collaborator

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let me know what you decide about the conditional thing--I am willing to merge either way, and then I'll merge this.

@davidheitman
Copy link
Contributor Author

OK, I've moved it back to the original location outside the conditional. Agreed that it's a very minor optimization, costs outweighing the benefits and whatnot. I think once the tests pass, we should be good!

@jeffwidman jeffwidman merged commit f145e37 into dpkp:master May 21, 2019
@jeffwidman
Copy link
Collaborator

Merged, thank you again!

@davidheitman
Copy link
Contributor Author

Woot, and thank you! Pretty excited, my first commit to an OSS project that has nothing to do with either media centers or emulators. I'll peek at _find_group_coordinator_id() and see if I can't optimize that, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants