Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add KafkaAdmin class #1540

Merged
merged 1 commit into from
Oct 25, 2018
Merged

Add KafkaAdmin class #1540

merged 1 commit into from
Oct 25, 2018

Conversation

llamahunter
Copy link
Contributor

@llamahunter llamahunter commented Jul 12, 2018

Requires cluster version > 0.10.0.0, and uses new wire protocol
classes to do many things via broker connection that previously
needed to be done directly in zookeeper.


This change is Reviewable

@llamahunter llamahunter mentioned this pull request Jul 12, 2018
@llamahunter
Copy link
Contributor Author

Not really a python pro, so please provide feedback on conventions, etc. Also, not really sure how to do extensive unit testing. The existing tests seem very integration-centric, and don't actually run successfully for me on my machine.

@llamahunter llamahunter force-pushed the add-admin-client branch 7 times, most recently from b046d25 to a6c7a2b Compare July 13, 2018 21:26
@dpkp
Copy link
Owner

dpkp commented Jul 14, 2018

This is great work! Thanks so much for submitting. My only top-level question is how do you think we should handle api version selection in a multi-version cluster? I.e., during a cluster upgrade. In the current non-admin case we provide a way for users to pass in an explicit "broker version". A more robust approach might be to always check the version of the broker we are sending the request to, but that gets significantly more complicated. Finally, maybe the last option is to advertise in the documentation that it is not safe to use while the cluster is in mixed-version / rolling upgrade state. What do you think?

@llamahunter
Copy link
Contributor Author

Well, I'm using the ApiVersions data structure returned by the broker to select the max version that both the library and the broker supports for each individual API call. I think hard coding that "0.11.0.2 means XYZ parameters on operations ABC are supported" is error prone, especially when the brokers already provide us with exactly what versions of each API call they support.

I think that if someone is doing an admin operation on a mixed version cluster, they should be sure that they don't use any of the optional parameters that aren't supported by all brokers in the cluster. Tho, I think that most (all?) admin calls go to the cluster controller, so it really only matters what that broker version is at.

@llamahunter
Copy link
Contributor Author

Hmm. I was fetching the ApiVersions map from whatever happened to be the 'least loaded node', but that's not necessarily the controller. All the admin calls go to the controller, so really need to check what ApiVersions it supports. Moved the check_version() call there.

@llamahunter
Copy link
Contributor Author

So, looking for some feedback on whether I should augment the API with return types for the calls. Right now, it's sort of just delivering back whatever Kafka wire protocol object. But, it looks like that's only got python attributes for the top level, and subobjects are just tuples or something, making them difficult to use. Thoughts? Is there a way to make the subobjects returned by the Schema parser have attributes? Should I unmarshal into typed python objects?

Copy link
Owner

@dpkp dpkp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Only issue is the enum requirement for py27. We should address that before merging.

Re: object types, that's a larger issue that we can leave for a separate PR. But you are exactly right -- making the response classes easier to work with directly would be very useful

@@ -0,0 +1,31 @@
from __future__ import absolute_import

from enum import IntEnum
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a pretty strict rule against mandatory external dependencies. I know this is standard in py3, but it isn't in py2. Can we avoid using it? If not, we should vendor the py27 version (see kafka/vendor)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh... people are still using python 2? Isn't it EOL? Hrmph... lemme look at it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any progress @llamahunter ?

And yes, I know of multiple companies who heavily rely on this library and still have large codebases in python 2. Most are planning to migrate to python 3, but still in the early stages. I'd guess it'll be another 2-3 years before I'd be okay with saying that we as a core infra library are going to stop supporting python 2 in our releases. I look forward to that day, it's just not quite here yet.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I'd love to see this be part of the project as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added vendoring of enum34 in #1604. To me, since enum is in the python 3 stdlib, it makes more sense to vendor it rather than to workaround it. That way we can start using it in other code as well. And we won't always be stuck with this because whenever we drop python 2 support we can stop vendoring it.

jeffwidman added a commit that referenced this pull request Oct 22, 2018
This is needed for #1540

While the usage there is trivial and could probably be worked around, I'd
rather vendor it so that future code can use enums... since `enum` is
already available in the python 3 stdlib, this will be easy enough to
eventually stop vendoring whenever we finally drop python 2 support.
@llamahunter
Copy link
Contributor Author

Sorry about the delay. Will look into this today. Still would like feedback about the return format. Should there be structured data structures rather than the loosely typed property bags? Is there some better way of using the Schema class to make the properties easier to access?

jeffwidman added a commit that referenced this pull request Oct 22, 2018
This is needed for #1540

While the usage there is trivial and could probably be worked around, I'd
rather vendor it so that future code can use enums... since `enum` is
already available in the python 3 stdlib, this will be easy enough to
eventually stop vendoring whenever we finally drop python 2 support.
jeffwidman added a commit that referenced this pull request Oct 22, 2018
This is needed for #1540

While the usage there is trivial and could probably be worked around, I'd
rather vendor it so that future code can use enums... since `enum` is
already available in the python 3 stdlib, this will be easy enough to
eventually stop vendoring whenever we finally drop python 2 support.
@jeffwidman
Copy link
Collaborator

jeffwidman commented Oct 22, 2018

I just added enum34 in #1604--use it as a fallback, something like this:

# enum in stdlib as of py3.4
try:
    import enum  # pylint: disable=import-error
except ImportError:
    # vendored backport module
    from kafka.vendor import enum34 as enum

Re: the schema stuff--good question that I haven't had time to look into yet... @dpkp / @tvoinarovskyi your thoughts?

Copy link
Owner

@dpkp dpkp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work! I think the structured data for Request/Response etc is an area ripe for improvement wrt property access. When I first put Struct/Schema together I left that as an exercise for later. Feel free to improve now or just ship it as is.

tox.ini Outdated
@@ -21,6 +21,7 @@ deps =
crc32c
py26: unittest2
decorator
py27: enum34
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is vendored, no need to include this

@jeffwidman
Copy link
Collaborator

@llamahunter let's just ship as is for now to land the existing changes and then leave figuring out a cleaner design for a later PR... I would not be surprised if we get a few bug reports about this, so I'd rather get it live in the hands of users on master sooner rather than later, even if the interface changes slightly in a later release.

Besides, with the 2.0 release impending once I get #1196, it will be an convenient place to do a breaking change in this interface if needed.

So if you can rebase and use the vendored enum34 when enum is not available, than I think that's all that is needed to get this merged.

@jeffwidman
Copy link
Collaborator

jeffwidman commented Oct 23, 2018

I'd like to land this, as I have some additional experiments I'd like to run on top of it, so I just rebased to fix conflicts and also added the vendored enum34 import... I think we will probably have at least one more 1.4.x bugfix release, and for those we can include this with the public interface marked as experimental...

So as soon as travis gives the green light I'm going to merge.

@llamahunter
Copy link
Contributor Author

Ah, sorry if I conflicted. I rebased and removed the tox.ini change. Had some weird local problems running the tests, tho. Hope it works on travis.

@jeffwidman
Copy link
Collaborator

No worries... I'd forgotten about the tox.ini change, and it looks like you forgot to fallback to enum34 on python 2, so I just updated it to pull in both changes.

I'm going to force-push over the top of your change. Also, for future reference, you can do git push --force-with-lease and you'll be notified if someone else changed the tip already.

@jeffwidman
Copy link
Collaborator

Also, I think the test failures are my fault from #1605 so I will get those fixed in a separate PR and then merge this...

Github was down, so travis wasn't running the tests correctly and I didn't realize it til now

@llamahunter
Copy link
Contributor Author

Ok... so, what needs doing here? Do I need to add

# enum in stdlib as of py3.4
try:
    import enum  # pylint: disable=import-error
except ImportError:
    # vendored backport module
    from kafka.vendor import enum34 as enum

for my use of IntEnum type?

@jeffwidman
Copy link
Collaborator

No, I already got that change in there. I was waiting until I had a chance to fix the pylint issue which was breaking tests... that is handled in #1609 / 0d766cf, so I just rebased and am just waiting for tests to pass and then I will merge this.

Requires cluster version > 0.10.0.0, and uses new wire protocol
classes to do many things via broker connection that previously
needed to be done directly in zookeeper.
@jeffwidman jeffwidman merged commit 481f880 into dpkp:master Oct 25, 2018
@jeffwidman
Copy link
Collaborator

jeffwidman commented Oct 25, 2018

Tests took forever to get green due to repeated flakiness upstream of travis--failed downloads etc.

But now we finally have this in! Thank you again. 🍰

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants