STORM-2104: More graceful handling of acked/failed tuples after partition reassignment in new Kafka spout #1696
Conversation
The test failure is due to maven failing to download dependencies on storm-core. !storm-core passed.
```java
// Emitted messages for partitions that are no longer assigned to this spout
// can't be acked, and they shouldn't be retried. Remove them from emitted.
Set<TopicPartition> partitionsSet = new HashSet<>(partitions);
emitted.removeIf((msgId) -> !partitionsSet.contains(msgId.getTopicPartition()));
```
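The pruning logic quoted above can be sketched in a self-contained form. `TopicPartition` and `MessageId` here are simplified stand-ins for the Kafka and Storm types, not the actual classes in the pull request:

```java
import java.util.*;

// Sketch: after a partition reassignment, keep only the emitted message ids
// whose partition is still assigned to this spout. The revoked partitions'
// messages can neither be acked nor usefully retried here.
public class EmittedPruneSketch {
    record TopicPartition(String topic, int partition) {}
    record MessageId(TopicPartition tp, long offset) {}

    // Removes message ids for partitions no longer in the assignment.
    static Set<MessageId> prune(Set<MessageId> emitted, Collection<TopicPartition> assigned) {
        Set<TopicPartition> assignedSet = new HashSet<>(assigned);
        emitted.removeIf(msgId -> !assignedSet.contains(msgId.tp()));
        return emitted;
    }

    public static void main(String[] args) {
        TopicPartition p0 = new TopicPartition("t", 0);
        TopicPartition p1 = new TopicPartition("t", 1);
        Set<MessageId> emitted = new HashSet<>(Set.of(
                new MessageId(p0, 5L), new MessageId(p1, 7L)));
        prune(emitted, List.of(p0)); // p1 was revoked
        System.out.println(emitted.size()); // prints 1
    }
}
```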
This looks good. I think this same logic may be needed in onPartitionsRevoked as well. Also, I believe the message may need to be removed from the retryService as well. Please correct me if I am wrong!
The messages should be getting removed from retryService in line 156. It's my impression that onPartitionsAssigned will be getting called immediately after onPartitionsRevoked, before the current call to poll returns (see https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/ConsumerRebalanceListener.html).
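The callback ordering being relied on here can be illustrated with a minimal stand-in (this is not the real Kafka `ConsumerRebalanceListener` interface, just a sketch of its documented contract: during a rebalance, revocation fires before assignment, inside the same `poll()` call):

```java
import java.util.*;

// Stand-in for Kafka's ConsumerRebalanceListener contract: the old assignment
// is revoked first, then the new assignment is delivered, before poll() returns.
public class RebalanceOrderSketch {
    interface RebalanceListener {
        void onPartitionsRevoked(Collection<String> partitions);
        void onPartitionsAssigned(Collection<String> partitions);
    }

    // Simulates one rebalance and records the order the callbacks ran in.
    static List<String> simulateRebalance(RebalanceListener listener,
                                          Collection<String> oldAssignment,
                                          Collection<String> newAssignment) {
        List<String> callOrder = new ArrayList<>();
        listener.onPartitionsRevoked(oldAssignment);   // old partitions go away first
        callOrder.add("revoked");
        listener.onPartitionsAssigned(newAssignment);  // then the new set arrives
        callOrder.add("assigned");
        return callOrder;
    }

    public static void main(String[] args) {
        List<String> order = simulateRebalance(new RebalanceListener() {
            public void onPartitionsRevoked(Collection<String> p) {}
            public void onPartitionsAssigned(Collection<String> p) {}
        }, List.of("t-0", "t-1"), List.of("t-0"));
        System.out.println(String.join(",", order)); // prints revoked,assigned
    }
}
```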
```java
/**
 * @param <T> The type this deserializer deserializes to.
 */
public interface SerializableDeserializer<T> extends Deserializer<T>, Serializable {
}
```
Why do we need this wrapper marking interface?
I thought it was nice to have, since setKey/ValueDeserializer in the builder implicitly requires the deserializer to be serializable. For example, if you try to set the standard Kafka StringDeserializer via those methods, you'll get a NotSerializableException when the topology is submitted to Storm, since the deserializers are stored on the KafkaSpoutConfig, which is a final field of KafkaSpout.
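The failure mode described here can be reproduced with plain Java serialization. The class names below are simplified stand-ins (not Storm or Kafka classes): a `Serializable` config object carrying a non-serializable field fails the moment it is serialized, which is what happens when Storm ships the spout to the cluster.

```java
import java.io.*;

// Sketch: Java serialization walks into every field, so one non-Serializable
// field (like Kafka's StringDeserializer on a spout config) breaks submission.
public class SerializableFieldSketch {
    static class NonSerializableDeserializer {}  // stand-in for StringDeserializer

    static class SpoutConfig implements Serializable {
        final NonSerializableDeserializer deserializer = new NonSerializableDeserializer();
    }

    // Attempts to serialize the config and reports what happened.
    static String trySerialize() {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new SpoutConfig()); // throws: field is not Serializable
            return "serialized";
        } catch (NotSerializableException e) {
            return "NotSerializableException";
        } catch (IOException e) {
            return "io error";
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize()); // prints NotSerializableException
    }
}
```

Requiring a `SerializableDeserializer` in the setter's signature turns this runtime failure into a compile-time constraint.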
Good point. I think I am going to need to fix that on my patch.
@hmcl Sure thing.

@hmcl ping. Had a chance to look at this? It would be nice to get this merged soon.
@revans2 ping for review if you have time. I'd like to get this in before too long if possible.
This looks good to me. Now that I have gone through the kafka spout code for my other pull request, I am confident in giving this a +1.
Do you plan to port this to the 1.0.x branch too?
@qiozas If you really need it on 1.0.x, then I wouldn't mind porting it. It seems like 1.1.0 is right around the corner though (RCs are being tested), so it might be faster for you to upgrade when that comes out, since a backport would have to wait for another 1.0.x release.
@srdo Thank you very much for your offer. We have performed many tests with the 1.0.2/3 releases to make sure there are no problems in migrating from 0.9.6 to the new version. Our major problem is the current one. Additionally, Storm 1.1.0 will be the first release with the Kafka 0.10 API, so I am not very confident about using it in a production system. We would prefer to use the Kafka 0.10 API, but it is too new in the Storm world (currently Storm holds us back on the Kafka 0.9 API, but we can live with that for some time).
@qiozas The Kafka 0.10 API changes were more or less a one-liner for the spout if I recall, so it shouldn't be a big risk to update. 0.9 and 0.10 have the same API. I'll take a look at backporting this soon.
See https://issues.apache.org/jira/browse/STORM-2104
In order to test this change I added a factory for KafkaConsumers. Please let me know if there's a nicer way to mock it.
In addition to fixing the described issue, I changed a few types on KafkaSpoutConfig. If the user specifies a non-serializable Deserializer in either setter in KafkaSpoutConfig.Builder, the topology can't start because Nimbus can't serialize KafkaSpoutConfig.
I borrowed a few classes from #1679. I hope that's okay with you @jfenc91.
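The factory indirection mentioned in the description can be sketched as follows. `ConsumerFactory`, `Consumer`, and `Spout` here are hypothetical simplified names, not the actual classes in this pull request: the point is only that the spout asks a factory for its consumer instead of constructing one directly, so a test can hand back a fake.

```java
import java.util.*;

// Sketch: inject the consumer through a factory so tests can substitute a mock
// without a real Kafka cluster.
public class ConsumerFactorySketch {
    interface Consumer { List<String> poll(); }

    interface ConsumerFactory { Consumer createConsumer(Map<String, Object> config); }

    static class Spout {
        private final Consumer consumer;
        Spout(ConsumerFactory factory, Map<String, Object> config) {
            // Production code passes a factory that builds a real consumer;
            // tests pass one that returns a fake.
            this.consumer = factory.createConsumer(config);
        }
        List<String> nextBatch() { return consumer.poll(); }
    }

    public static void main(String[] args) {
        // In a test, the factory returns a canned-response fake consumer.
        ConsumerFactory fake = config -> () -> List.of("record-1", "record-2");
        Spout spout = new Spout(fake, Map.of());
        System.out.println(spout.nextBatch().size()); // prints 2
    }
}
```

An alternative to a dedicated factory type is accepting a `Supplier<Consumer>` or doing the wiring with a mocking library, but a small factory interface keeps the production constructor path explicit.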