-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sending messages to new topics fails #150
Comments
This is the same as #113 , right? And it seems to have more activity, so maybe close this one? |
It seems like most of the extra activity on that has been on unrelated issues. In all fairness, I'm not sure that it's actually a good idea to fix this. It's very much better to precreate your topics so that you can specify the number of partitions, data retention policy, etc. |
However, it should be mentioned that there is actually a function in the test suite that does exactly this (ensures topic creation). However, it's horrifically slow and one of the major driving factors in why the test suite takes 5+ minutes. |
I think the correct behavior here is to raise LeaderNotAvailable or UnknownTopicOrPartitionError exception from send_messages. but I agree -- this is a duplicate of #113 , at least as I understand where that issue stands now. |
That should be exactly what's happening... ? I'll try to put together a test case that shows that behavior later today. That should be fairly simple. |
So, ok. I looked into this. I have a PR that throws UnknownTopicOrPartitionError, however it seems to me that this effectively disables the ability of the SimpleConsumer to autocreate topics. Is this intended? |
The current behaviour is that KeyError is being thrown from the line that calls "cycle", which is not good. Modifying the behaviour so that topics are not autocreated would be much, much worse, though. I think ideally I'd make the library wait (ie. make it transparent for the user of the library that Kafka does not send the correct metadata for auto-created topics right away), but throwing a LeaderNotAvailable or similar exception would be good, too. I prefer LeaderNotAvailable to UnknownTopicOrPartitionError because if I got the latter, I would assume the topic did not get created. |
So, the approach that I took in the PR above is as follows:
Alternatively:
I suppose it's possible to have done something like this in the PR (extreme pseudocode):
The thing that I don't like about that is that it assumes that the correct behavior is to create topics. In a land where deleting a topic is somewhere between hard and impossible (take down the cluster and perform surgery on all hosts and in zookeeper) then I'm pretty skeptical that auto-creation is the "safe" behavior. |
Ah, you're right about not assuming that the correct behaviour is to auto-create topics, I forgot it's just a setting, esp. with the funny zombie topic issue. What's the Kafka server behaviour when you ask for metadata for a non-existing topic, but auto-creation is off? I hope the response to the metadata request is NOT the same as when auto-creation is off. Assuming it's not the same, I think the best approach would be to raise LeaderNotAvailable when auto-creation is on, and UnknownTopicOrPartitionError when is off. |
There is no way to know the setting when asking for metadata. -Mark
|
I have a pretty strong preference for people knowing that they are creating topics. There is so much less control over the creation, partitioning, and settings of topics created this way. But even assuming I didn't believe that topic creation should be the exceptional case, I'm not really sure how I could even go about implementing what you want without just always trying to create and throwing a timeout error after a few minutes when the topic fails to create. :-/ -Mark
|
I'm not even sure what you're talking about anymore. Please look at the code: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala, particularly the method handleTopicMetadataRequest and getTopicMetadata that does top level work. It's really easy: if the topic is auto-created, then the response is LeaderNotAvailable until topic exists. If auto creation is not enabled, then the response is UnknownTopicOrPartition. No guessing of any kind is needed at all. When kafka replies LeaderNotAvailable, then you either need to wait until it is available or give up. If the error is UnknownTopicOrPartition then there's no such topic or partition, so there's nothing to wait for. |
How do you tell it where to send the produce request?
|
What is there to tell? In my view the problem is here: https://github.com/mumrah/kafka-python/blob/master/kafka/client.py#L245 And more generally here: https://github.com/mumrah/kafka-python/blob/master/kafka/protocol.py#L390 As you can see there's even a note about ignoring errors. It would be wrong to raise errors in protocol, since there may be many topics and only some of them might have errors. In my opinion those errors should be forwarded to callers (i.e. to client.py) and there it should decide: if it loads metadata for all topics (i.e. topics argument is empty), then it can ignore errors, it's preloading that metadata anyway. However, if it is loading metadata for specific topics, then it should do the following:
There's also an error for every topic, but I'm not sure what to do with those. |
Indeed -- omar and I discussed the fact that protocol.py currently silently drops topic and partition error codes. I had thought we opened an issue on that specifically, but looking again now I don't see one. It's definitely been on my short list of todos. To that end, to support topic errors we could set and then skip the partition metadata for that topic. |
I see what's up. I'll take a stab at updating my PR for this. |
Co-authored-by: Ryar Nyah <ryarnyah@gmail.com>
Co-authored-by: Ryar Nyah <ryarnyah@gmail.com>
When using send_messages with a topic name for the first time Kafka automatically creates the topic due to
load_metadata_for_topics
, however, metadata request fails because the newly created topic doesn't have a leader yet. Sincedecode_metadata_response
ignores errors this translates to topic not having partitions and a laterKeyError
because topic is not in dictionaries as a result.Is it possible to somehow handle
LeaderNotAvailable
by retrying several times and waiting for election when this happens?The text was updated successfully, but these errors were encountered: