[STORM-1642] Catch Exception when deserialization failed. #1316
|
How does a deserialization failure result in an NPE? Other than that, a deserialization failure is a problem with the application itself, which the application should correct by registering the correct serializer/deserializer. The Storm framework should not skip such tuples on its own. |
|
As I mentioned in the bug report, the NPE happens because Storm is vulnerable to forged messages from processes outside the cluster. To reproduce the NPE, you just need to send a message [taskid 0] to [host]:[port] from anywhere, where taskid is the id of one of the tasks running on [host]:[port]. In this case, Storm will hand the deserializer a TaskMessage whose payload is set to null. |
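To make the failure mode described above concrete, here is a minimal, self-contained sketch. It does not use Storm's classes: `Message` is a hypothetical stand-in for TaskMessage, and the frame layout (a 2-byte task id followed by a 4-byte length) is an assumption made for illustration, not a statement of Storm's actual wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ForgedFrameSketch {
    /** Hypothetical stand-in for TaskMessage: a task id plus a raw payload. */
    static final class Message {
        final int taskId;
        final byte[] payload; // null when the frame declared a zero-length body

        Message(int taskId, byte[] payload) {
            this.taskId = taskId;
            this.payload = payload;
        }
    }

    /** Decodes one frame, assumed to be [short taskId][int length][length bytes]. */
    static Message decode(ByteBuffer in) {
        int taskId = in.getShort();
        int length = in.getInt();
        if (length == 0) {
            // A zero-length frame carries no body, so the payload is left null.
            return new Message(taskId, null);
        }
        byte[] body = new byte[length];
        in.get(body);
        return new Message(taskId, body);
    }

    /** Naive deserializer that assumes a payload is always present. */
    static String deserialize(Message m) {
        // Dereferencing a null payload here throws NullPointerException.
        return new String(m.payload, 0, m.payload.length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A forged frame: a valid-looking task id followed by length 0.
        ByteBuffer forged = ByteBuffer.allocate(6);
        forged.putShort((short) 1).putInt(0);
        forged.flip();

        Message m = decode(forged);
        try {
            deserialize(m);
        } catch (NullPointerException e) {
            System.out.println("NPE while deserializing message for task " + m.taskId);
        }
    }
}
```

Anything that can reach the port, such as a port scanner, can produce a frame like this; no knowledge of the tuple format is needed.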
|
I can't think of a reason why such a message would be sent to Storm from outside the cluster. Ideally, only cluster machines and daemons should have access to the worker ports. Or, if it indeed needs to be solved, a better method would be to ignore the zero-length payload and not add |
|
Such messages are sent because our security team scans every listening port on each server to detect potential weaknesses. |
| ;; null task ids are broadcast tuples | ||
| (fast-list-iter [task-id task-ids] | ||
| (tuple-action-fn task-id tuple) | ||
| )))) |
Lots of parenthetical and formatting problems here.
|
Why not do both? I don't see any reason to propagate This is very much on the critical path, so someone else might want to weigh in. |
|
NPE is just a special case of deserialization failure, right? |
|
What do you mean? If you mean null payloads are a special case of failure, I disagree. In fact, it's not a failure at all. We received a null message, so we can perform a NOP as soon as possible. |
|
Yes, I agree with you that we should drop a task message with null payload as soon as possible, and I'll push another commit to drop task messages with null payload. |
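The "drop as soon as possible" idea agreed on above could look roughly like the following sketch. It is not the code from this PR: `Message` and `dropEmpty` are hypothetical names, and the filter is assumed to run on a received batch before any deserializer sees it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullPayloadFilterSketch {
    /** Hypothetical stand-in for TaskMessage: a task id plus a raw payload. */
    static final class Message {
        final int taskId;
        final byte[] payload;

        Message(int taskId, byte[] payload) {
            this.taskId = taskId;
            this.payload = payload;
        }
    }

    /** Returns only the messages that carry a payload; the rest are treated as a NOP. */
    static List<Message> dropEmpty(List<Message> batch) {
        List<Message> kept = new ArrayList<>();
        for (Message m : batch) {
            if (m == null || m.payload == null) {
                continue; // nothing to deserialize, so skip it before it goes further
            }
            kept.add(m);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Message> batch = Arrays.asList(
                new Message(1, new byte[] {42}),
                new Message(2, null),          // e.g. a forged zero-length frame
                new Message(3, new byte[] {7}));
        System.out.println("kept " + dropEmpty(batch).size() + " of " + batch.size());
    }
}
```

Filtering at this point keeps the check to a single null comparison per message, which matters because the receive path is on the critical path.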
|
|
||
| bout.writeShort((short)task_id); | ||
| if (payload_len == 0) { | ||
| LOG.warn("Zero length payload to task {}.", (short)task_id); |
Is this a debugging log? Should it be removed?
|
@liurenjie1024 |
* Expose OpenTsdbClient.Builder's constructor to public ** This allows Flux to initialize Builder instance which is needed for OpenTSDBBolt * Add constructor for having single mapper instance to OpenTsdbBolt ** Flux doesn't support array of reference for now * Also fix a bug on OpenTsdbBolt: it should refer the mapper interface for flexibility, not implemented one
* use bounded wildcard type to fix invariant issue
* introduce reflist * introduce BeanListReference which stores list of id of references * handle BeanListReference properly * modify unit test for testing reflist
…ith predictable ordering KafkaSpoutStreamsNamedTopics.getOutputFields() uses HashSet causes output fields with predictable ordering. So replaced with LinkedHashSet
… when a topology is killed * increase supervisor.worker.shutdown.sleep.secs to let workers kill themselves first even stuck * disable shutdown hook for log4j2 to make sure logs are written after shutdown is started
…ly by default the dir from the default config is log4j2 and that should be relative to STORM_HOME, however this is check was missing Signed-off-by: alexlehm <alexlehm@gmail.com>
… newline before the code quotes) Signed-off-by: alexlehm <alexlehm@gmail.com>
…is different from distcache-blobstore.md
…e when ZK nodes have already been created/deleted
We had the case that this broke encoding for us because it used the default system locale. Given the fact that this code uses an explicit encoding for the other case and that it evolved from something that always used an explicit encoding I believe this is more correct.
Code blocks should always follow an empty line; otherwise, jekyll will fail to properly format the code block. Also, github's fenced code blocks (with triple backticks) in the middle of item lists cause incorrect list numbering.
… STORM-2315-1.x-merge
ed2715c to
b9d2b37
[YSTORM-6088] Update mvn arguments to speed up builds
This patch aims to fix bug STORM-1642. When we fail to deserialize a tuple, we should skip that tuple and log the failure rather than shut down the worker.
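As an illustration of the behavior this description proposes, here is a minimal sketch of a skip-and-log loop. It is not the patch itself: `deserializeAll` and `parseInt` are hypothetical helpers, the deserializer is modeled as a plain `Function<byte[], T>`, and `java.util.logging` is used only to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SkipBadTupleSketch {
    private static final Logger LOG = Logger.getLogger(SkipBadTupleSketch.class.getName());

    /** Deserializes each payload; a payload that fails is logged and skipped. */
    static <T> List<T> deserializeAll(List<byte[]> payloads, Function<byte[], T> deserializer) {
        List<T> tuples = new ArrayList<>();
        for (byte[] payload : payloads) {
            try {
                tuples.add(deserializer.apply(payload));
            } catch (RuntimeException e) {
                // One bad payload should not take the whole worker down.
                LOG.log(Level.WARNING, "Skipping tuple that failed to deserialize", e);
            }
        }
        return tuples;
    }

    /** Toy deserializer: interprets the payload as a UTF-8 decimal integer. */
    static Integer parseInt(byte[] payload) {
        return Integer.valueOf(new String(payload, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        List<byte[]> payloads = Arrays.asList(
                "1".getBytes(StandardCharsets.UTF_8),
                "oops".getBytes(StandardCharsets.UTF_8), // fails with NumberFormatException
                "3".getBytes(StandardCharsets.UTF_8));
        System.out.println(deserializeAll(payloads, SkipBadTupleSketch::parseInt));
    }
}
```

Whether the framework should swallow such failures at all is the point debated in the conversation; the sketch only shows the mechanics of the proposal.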