STORM-2675: Fix storm-kafka-client Trident spout failing to serialize meta objects to Zookeeper #2271
I also stumbled on an issue with the way the coordinator and spout communicate, which I've filed here: https://issues.apache.org/jira/browse/STORM-2691.
f04bac6 to ef4fde4
@hmcl See my comment on that PR. I think the fixes are not related, but if we make the suggested changes we'll probably need to update this too. The changes should solve STORM-2691 though.
revans2
left a comment
@srdo the changes look fine to me, but I think it would be cleaner if you could rebase this before I give a final +1.
<artifactId>kafka-clients</artifactId>
<version>${storm.kafka.version}</version>
<scope>${provided.scope}</scope>
<scope>${kafka.dependency.scope}</scope>
Can we just make this compile like the others? You can still override it in your jars if you want to.
Sure. Since storm-kafka-client-examples doesn't depend on storm-kafka-examples anymore, this isn't important. I'll change it to compile in storm-kafka-client-examples too.
… meta objects to Zookeeper
revans2
left a comment
Still +1
…-2675
STORM-2675: Fix storm-kafka-client Trident spout failing to serialize meta objects to Zookeeper
This closes #2271
See https://issues.apache.org/jira/browse/STORM-2675
This builds on #2268, so please ignore the first commit.
Trident uses json-simple under the hood to persist some objects to Zookeeper. This isn't mentioned in the API docs (or I missed it), so the current storm-kafka-client implementation returns a bunch of objects that json-simple can't figure out how to serialize. The result is that json-simple writes the toString of those objects to Zookeeper, and that output can't be read back. This causes the Trident Kafka spout to start over every time it's restarted.
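To illustrate the general idea, here is a minimal sketch of a metadata object that converts itself to and from a plain map of primitive wrappers, which is the kind of structure json-simple is described as being able to handle. The class name `BatchMeta`, its fields, and the `toMap`/`fromMap` methods are hypothetical names chosen for this example, not the actual storm-kafka-client types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical metadata class for illustration only; not the real storm-kafka-client type.
final class BatchMeta {
    private final long firstOffset;
    private final long lastOffset;

    BatchMeta(long firstOffset, long lastOffset) {
        this.firstOffset = firstOffset;
        this.lastOffset = lastOffset;
    }

    // Expose the state as a map of strings to primitive wrappers,
    // so a simple JSON library can write it without falling back to toString().
    Map<String, Object> toMap() {
        Map<String, Object> map = new HashMap<>();
        map.put("firstOffset", firstOffset);
        map.put("lastOffset", lastOffset);
        return map;
    }

    // Rebuild the object from the map read back from storage.
    // Values are read through Number because a JSON parser may hand back
    // a different numeric wrapper type than the one that was written.
    static BatchMeta fromMap(Map<String, Object> map) {
        long first = ((Number) map.get("firstOffset")).longValue();
        long last = ((Number) map.get("lastOffset")).longValue();
        return new BatchMeta(first, last);
    }

    long getFirstOffset() {
        return firstOffset;
    }

    long getLastOffset() {
        return lastOffset;
    }
}
```

With this shape, the spout would hand `toMap()` to Trident instead of the object itself and call `fromMap` on whatever is read back.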
TransactionalState, which Trident uses to read from and write to Zookeeper, reads with JSONValue.parse. That function fails quietly by returning null when there's a parsing error. There's a note in the code that we deliberately don't use the version of the parse function that throws an exception on error, but we should at least log when it happens, since it's likely to be due to a bug in the spout or coordinator.
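A rough sketch of the "log when parsing returns null" idea follows. It does not use the real TransactionalState code: `LoggingParse`, `parseOrWarn`, and the `parser` argument are hypothetical stand-ins, with `parser` playing the role of a lenient parse function that returns null on malformed input.

```java
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of Storm.
final class LoggingParse {
    private static final Logger LOG = Logger.getLogger(LoggingParse.class.getName());

    private LoggingParse() {
    }

    // Runs the lenient parser and logs a warning when non-null input parses to null.
    // Note that the JSON literal "null" would also trigger the warning here.
    static Object parseOrWarn(String path, String raw, Function<String, Object> parser) {
        if (raw == null) {
            return null; // nothing stored at this path, so nothing to warn about
        }
        Object parsed = parser.apply(raw);
        if (parsed == null) {
            LOG.warning("Could not parse stored value at " + path
                + "; this may indicate a serialization bug. Raw value: " + raw);
        }
        return parsed;
    }
}
```

The behavior stays lenient (null is still returned), but a failed read now leaves a trace in the logs instead of silently resetting state.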
This PR makes the following changes:
If anyone has suggestions for tests of this, I'm happy to add some. I'm wondering why we don't use Kryo for serialization to Zookeeper, since the json-simple library is so inflexible (it can only handle some collections and primitive wrappers).