A simpler kafka consumer #234
Conversation
dpkp commented Sep 11, 2014
More details in the docstring. The general approach is to stick to the Java consumer configuration parameters and try to be as pythonic as possible otherwise. Still needs a few more tweaks, but I'm interested in feedback on the approach / interface.
```
# more advanced consumer -- multiple topics w/ auto commit offset management
kafka = KafkaConsumer('topic1', 'topic2',
                      group_id='my_consumer_group',
                      auto_commit_enable=True,
                      auto_commit_interval_ms=30 * 1000,
                      auto_offset_reset='smallest')
```
I would rather take an actual dictionary of formal kafka properties here. This would enable us to (maybe, someday?) share consumer props with a java process.
disagree. **kwargs seems to be the generally accepted pythonic interface. Plus if you wanted to load a dictionary of configs from a file or elsewhere, you could convert it to kwargs easily:
```
config = dict()
config.update(load_parameters_from_file())
kafka = KafkaConsumer('topic1', 'topic2', **config)
```
Unless there is something special about java properties (not really a dict?), I don't see the need to worry about this.
Ok; I'd like to add a function for converting formal java prop names to kwarg names then. This can be done at a later time.
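For illustration only, here is a minimal sketch of what such a helper might look like; the function name and the assumption that translating dots to underscores is sufficient are mine, not part of this PR:

```
def java_props_to_kwargs(props):
    # Hypothetical helper (not in this PR): translate formal Java property
    # names like 'auto.commit.enable' into the underscored kwarg names the
    # consumer accepts, e.g. 'auto_commit_enable'.
    return dict((key.replace('.', '_'), value) for key, value in props.items())

# A properties dict shared with a Java process could then be passed through:
java_props = {'group.id': 'my_consumer_group', 'auto.commit.enable': True}
kafka = KafkaConsumer('topic1', 'topic2', **java_props_to_kwargs(java_props))
```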
I really like the approach and interface. Just to be clear: we are sacrificing the current ability to commit every N messages? There's probably more, but it's getting kinda late. :)
I also didn't see a config for auto commit by # of messages in the java api. It would be relatively easy to support. Perhaps …
I'm not 100% positive that we actually need it, but it should be pretty easy to add. That seems like a reasonable name.
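In the meantime, committing every N messages can be approximated from application code. A minimal sketch, assuming auto commit is disabled and that the consumer exposes a commit() method alongside the task_done() call shown in the usage example below:

```
# Sketch: commit offsets every N messages from application code.
# Assumes auto_commit_enable=False and a commit() method on the consumer.
kafka = KafkaConsumer('topic1', group_id='my_consumer_group',
                      auto_commit_enable=False)

commit_every_n = 100
uncommitted = 0
for m in kafka:
    process_message(m)
    kafka.task_done(m)   # mark the message processed before counting it
    uncommitted += 1
    if uncommitted >= commit_every_n:
        kafka.commit()
        uncommitted = 0
```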
Integration tests failing on python 3.3 and 3.4 -- will investigate.
""" | ||
Configuration settings can be passed to constructor, | ||
otherwise defaults will be used: | ||
client_id='kafka.consumer.kafka', |
I think you should be able to dynamically construct this format string from the actual defaults. That would be Much Better(tm)
unsure how to do this?
Hmm. I was thinking you could pass the defaults in as a dictionary and .format() them. However, that primarily serves the purpose of making sure runtime docs are good. Reading the docs at the source code level would suffer. :-/
I tried to do """default={key_name}""".format(**DEFAULT_CONSUMER_CONFIG), but when I try to examine it from the shell via help(KafkaConsumer) I see an empty docstring... does this work for you? Testing generically via the python shell also doesn't work. Have you gotten something like this to work before?
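For what it's worth, the empty docstring is expected: once the string is wrapped in a .format() call it becomes an ordinary expression rather than a string literal, so Python never assigns it to __doc__. A minimal sketch of one workaround, assigning __doc__ explicitly in the class body (the extra default keys below are placeholders for illustration, not the PR's actual config):

```
DEFAULT_CONSUMER_CONFIG = {
    'client_id': 'kafka.consumer.kafka',
    'group_id': None,             # placeholder value for illustration
    'auto_commit_enable': False,  # placeholder value for illustration
}

class KafkaConsumer(object):
    # A formatted string is not picked up as the docstring automatically,
    # so set __doc__ directly; help(KafkaConsumer) then shows the defaults.
    __doc__ = (
        "Consume records from Kafka.\n\n"
        "Configuration settings can be passed to constructor,\n"
        "otherwise defaults will be used:\n"
        "    client_id={client_id}\n"
        "    group_id={group_id}\n"
        "    auto_commit_enable={auto_commit_enable}\n"
    ).format(**DEFAULT_CONSUMER_CONFIG)
```

The trade-off mentioned above still applies: the runtime docs come out right, but reading the defaults straight from the source is less convenient.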
Seems worth merging to me. I really like the interface a lot more.
Thanks -- in the spirit of not merging my own PR, I'll leave it to you or mumrah.
I can't merge your PR because there's a merge conflict.
Strange -- it's rebased to master/HEAD. Manual steps yield no merge conflicts. Boo on github.
```
kafka = KafkaConsumer('topic1')
for m in kafka:
    print m

kafka = KafkaConsumer('topic1', 'topic2',
                      group_id='my_consumer_group',
                      auto_commit_enable=True,
                      auto_commit_interval_ms=30 * 1000,
                      auto_offset_reset='smallest')
for m in kafka:
    process_message(m)
    kafka.task_done(m)
```
…ts(); update docstring
…tion list; add more type checks to set_topic_and_partitions
…group_id'; add docstring
…um_offsets var names
…r private methods
…titions configured
Log connection failures w/ traceback in kafka/client.py
… import from kafka at top-level
Doh -- had to rebase against the offset PR that just landed. Merge button should work now.