Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending messages to new topics fails #150

Closed
snaury opened this issue Apr 1, 2014 · 16 comments · Fixed by #174 or aio-libs/aiokafka#985
Closed

Sending messages to new topics fails #150

snaury opened this issue Apr 1, 2014 · 16 comments · Fixed by #174 or aio-libs/aiokafka#985

Comments

@snaury
Copy link
Contributor

snaury commented Apr 1, 2014

When using send_messages with a topic name for the first time Kafka automatically creates the topic due to load_metadata_for_topics, however, metadata request fails because the newly created topic doesn't have a leader yet. Since decode_metadata_response ignores errors this translates to topic not having partitions and a later KeyError because topic is not in dictionaries as a result.

Is it possible to somehow handle LeaderNotAvailable by retrying several times and waiting for election when this happens?

@emanchado
Copy link

This is the same as #113 , right? And it seems to have more activity, so maybe close this one?

@wizzat
Copy link
Collaborator

wizzat commented May 22, 2014

It seems like most of the extra activity on that has been on unrelated issues. In all fairness, I'm not sure that it's actually a good idea to fix this. It's very much better to precreate your topics so that you can specify the number of partitions, data retention policy, etc.

@wizzat
Copy link
Collaborator

wizzat commented May 22, 2014

However, it should be mentioned that there is actually a function in the test suite that does exactly this (ensures topic creation). However, it's horrifically slow and one of the major driving factors in why the test suite takes 5+ minutes.

@dpkp
Copy link
Owner

dpkp commented May 22, 2014

I think the correct behavior here is to raise LeaderNotAvailable or UnknownTopicOrPartitionError exception from send_messages.

but I agree -- this is a duplicate of #113 , at least as I understand where that issue stands now.

@wizzat
Copy link
Collaborator

wizzat commented May 22, 2014

That should be exactly what's happening... ? I'll try to put together a test case that shows that behavior later today. That should be fairly simple.

@wizzat
Copy link
Collaborator

wizzat commented May 22, 2014

So, ok. I looked into this. I have a PR that throws UnknownTopicOrPartitionError, however it seems to me that this effectively disables the ability of the SimpleConsumer to autocreate topics. Is this intended?

wizzat added a commit to wizzat/kafka-python that referenced this issue May 22, 2014
Adds ensure_topic_exists to KafkaClient, redirects test case to use
that.  Fixes dpkp#113 and fixes dpkp#150.
@emanchado
Copy link

The current behaviour is that KeyError is being thrown from the line that calls "cycle", which is not good. Modifying the behaviour so that topics are not autocreated would be much, much worse, though.

I think ideally I'd make the library wait (ie. make it transparent for the user of the library that Kafka does not send the correct metadata for auto-created topics right away), but throwing a LeaderNotAvailable or similar exception would be good, too. I prefer LeaderNotAvailable to UnknownTopicOrPartitionError because if I got the latter, I would assume the topic did not get created.

@wizzat
Copy link
Collaborator

wizzat commented May 28, 2014

So, the approach that I took in the PR above is as follows:

client = KafkaClient(...)
producer = KafkaProducer(...)
client.ensure_topic_exists('my_new_topic')
producer.send_messages('my_new_topic', ...)
producer.send_messages('my_new_topic', ...)

Alternatively:

client = KafkaClient(...)
producer = KafkaProducer(...)
try:
    producer.send_messages('my_new_topic', ...)
except UnknownTopicOrPartitionError:
    producer.client.ensure_topic_exists('my_new_topic')
    producer.send_messages('my_new_topic', ...)

I suppose it's possible to have done something like this in the PR (extreme pseudocode):

def next_partition(topic):
    if not self.client.has_metadata_for_partition():
         self.client.ensure_topic_exists(topic)
    return random.choice(self.topic_partitions[topic])

The thing that I don't like about that is that it assumes that the correct behavior is to create topics. In a land where deleting a topic is somewhere between hard and impossible (take down the cluster and perform surgery on all hosts and in zookeeper) then I'm pretty skeptical that auto-creation is the "safe" behavior.

@emanchado
Copy link

Ah, you're right about not assuming that the correct behaviour is to auto-create topics, I forgot it's just a setting, esp. with the funny zombie topic issue.

What's the Kafka server behaviour when you ask for metadata for a non-existing topic, but auto-creation is off? I hope the response to the metadata request is NOT the same as when auto-creation is off.

Assuming it's not the same, I think the best approach would be to raise LeaderNotAvailable when auto-creation is on, and UnknownTopicOrPartitionError when is off.

@wizzat
Copy link
Collaborator

wizzat commented Jun 3, 2014

There is no way to know the setting when asking for metadata.

-Mark

On Jun 3, 2014, at 7:20, Esteban Manchado Velázquez notifications@github.com wrote:

Ah, you're right about not assuming that the correct behaviour is to auto-create topics, I forgot it's just a setting, esp. with the funny zombie topic issue.

What's the Kafka server behaviour when you ask for metadata for a non-existing topic, but auto-creation is off? I hope the response to the metadata request is NOT the same as when auto-creation is off.

Assuming it's not the same, I think the best approach would be to raise LeaderNotAvailable when auto-creation is on, and UnknownTopicOrPartitionError when is off.


Reply to this email directly or view it on GitHub.

@wizzat
Copy link
Collaborator

wizzat commented Jun 3, 2014

I have a pretty strong preference for people knowing that they are creating topics. There is so much less control over the creation, partitioning, and settings of topics created this way. But even assuming I didn't believe that topic creation should be the exceptional case, I'm not really sure how I could even go about implementing what you want without just always trying to create and throwing a timeout error after a few minutes when the topic fails to create. :-/

-Mark

On Jun 3, 2014, at 8:41, Mark Roberts wizzat@gmail.com wrote:

There is no way to know the setting when asking for metadata.

-Mark

On Jun 3, 2014, at 7:20, Esteban Manchado Velázquez notifications@github.com wrote:

Ah, you're right about not assuming that the correct behaviour is to auto-create topics, I forgot it's just a setting, esp. with the funny zombie topic issue.

What's the Kafka server behaviour when you ask for metadata for a non-existing topic, but auto-creation is off? I hope the response to the metadata request is NOT the same as when auto-creation is off.

Assuming it's not the same, I think the best approach would be to raise LeaderNotAvailable when auto-creation is on, and UnknownTopicOrPartitionError when is off.


Reply to this email directly or view it on GitHub.

@snaury
Copy link
Contributor Author

snaury commented Jun 3, 2014

I'm not even sure what you're talking about anymore. Please look at the code: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala, particularly the method handleTopicMetadataRequest and getTopicMetadata that does top level work. It's really easy: if the topic is auto-created, then the response is LeaderNotAvailable until topic exists. If auto creation is not enabled, then the response is UnknownTopicOrPartition.

No guessing of any kind is needed at all.

When kafka replies LeaderNotAvailable, then you either need to wait until it is available or give up.

If the error is UnknownTopicOrPartition then there's no such topic or partition, so there's nothing to wait for.

@wizzat
Copy link
Collaborator

wizzat commented Jun 3, 2014

How do you tell it where to send the produce request?

On Jun 3, 2014, at 9:28, Alexey Borzenkov notifications@github.com wrote:

I'm not even sure what you're talking about anymore. Please look at the code: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala, particularly the method handleTopicMetadataRequest and getTopicMetadata that does top level work. It's really easy: if the topic is auto-created, then the response is LeaderNotAvailable until topic exists. If auto creation is not enabled, then the response is UnknownTopicOrPartition.

No guessing of any kind is needed at all.

When kafka replies LeaderNotAvailable, then you either need to wait until it is available or give up.

If the error is UnknownTopicOrPartition then there's no such topic or partition, so there's nothing to wait for.


Reply to this email directly or view it on GitHub.

@snaury
Copy link
Contributor Author

snaury commented Jun 3, 2014

What is there to tell? In my view the problem is here:

https://github.com/mumrah/kafka-python/blob/master/kafka/client.py#L245

And more generally here:

https://github.com/mumrah/kafka-python/blob/master/kafka/protocol.py#L390

As you can see there's even a note about ignoring errors. It would be wrong to raise errors in protocol, since there may be many topics and only some of them might have errors. In my opinion those errors should be forwarded to callers (i.e. to client.py) and there it should decide: if it loads metadata for all topics (i.e. topics argument is empty), then it can ignore errors, it's preloading that metadata anyway. However, if it is loading metadata for specific topics, then it should do the following:

  • if topic has LeaderNotAvailable, then it could sleep a little and retry to load metadata for topics that don't have a leader yet, chances are it will finally succeed. Otherwise go to the next step.
  • if any topic has any non-zero error, it should raise a matching exception, so the error is not silently ignored. This would mean producer raises a sensible error, not KeyError.

There's also an error for every topic, but I'm not sure what to do with those.

@dpkp
Copy link
Owner

dpkp commented Jun 3, 2014

Indeed -- omar and I discussed the fact that protocol.py currently silently drops topic and partition error codes. I had thought we opened an issue on that specifically, but looking again now I don't see one. It's definitely been on my short list of todos.

To that end, to support topic errors we could set
topic_metadata[topic_name] = kafka_errors[topic_error]

and then skip the partition metadata for that topic.

@wizzat
Copy link
Collaborator

wizzat commented Jun 3, 2014

I see what's up. I'll take a stab at updating my PR for this.

@dpkp dpkp closed this as completed in #174 Aug 11, 2014
wbarnha added a commit to mattoberle/kafka-python that referenced this issue Mar 9, 2024
Co-authored-by: Ryar Nyah <ryarnyah@gmail.com>
ods added a commit to aio-libs/aiokafka that referenced this issue Mar 9, 2024
bradenneal1 pushed a commit to bradenneal1/kafka-python that referenced this issue May 16, 2024
Co-authored-by: Ryar Nyah <ryarnyah@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants