Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Add Window Bolt support #1215

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Add Window Bolt support #1215

wants to merge 3 commits into from

Conversation

windie
Copy link
Contributor

@windie windie commented Aug 4, 2016

This adds Window bolt support. Most of codes are from Apache Storm master branch (https://github.com/apache/storm/tree/v1.0.2).

This only includes minimum codes for Windowed Bolt. I plan to add others in follow-up PRs.

Resolves #1124.

@windie
Copy link
Contributor Author

windie commented Aug 4, 2016

@kramasamy - right now most of codes are from Strom's master branch. I'm not sure if this is the right base. Or should I use some stable version, such as v1.0.2 or v1.1.0? In addition, is there anything I can do to make this one easy to review?

@kramasamy
Copy link
Contributor

kramasamy commented Aug 4, 2016

@windie - I would suggest to use a stable version such as v1.0.1, then when we upgrade we can do the next version - it is very predictable.

@windie windie changed the title Add Window Bolt support [WIP] Add Window Bolt support Aug 4, 2016
@windie
Copy link
Contributor Author

windie commented Aug 4, 2016

@kramasamy sounds great. Will update my PR to base on v1.0.1.

@kramasamy
Copy link
Contributor

thanks @windie - really appreciate the contribution.

@kramasamy
Copy link
Contributor

kramasamy commented Aug 4, 2016

@windie - we can add you into our slack group of heron committers. But I don't know your email address.

@windie
Copy link
Contributor Author

windie commented Aug 4, 2016

@kramasamy - Thanks! My email is ******

@kramasamy
Copy link
Contributor

Invited - please remove your email from the comment.

@kramasamy
Copy link
Contributor

@windie - this should go into heron/storm - since this is storm api 1.0.0 compatible. Heron api under heron/api - is not supposed to be used by the users - since it is a hidden API.

@windie
Copy link
Contributor Author

windie commented Aug 8, 2016

@kramasamy - Gotcha.

@windie windie changed the title [WIP] Add Window Bolt support Add Window Bolt support Aug 15, 2016
@windie
Copy link
Contributor Author

windie commented Aug 15, 2016

The new files are copied from Storm. I created a diff between them to show the differences: https://gist.github.com/windie/28d84a7b00a587cd242ba55681df56dd

@@ -186,6 +190,14 @@ public int getThisTaskIndex() {
return getSources(getThisComponentId());
}
*/
public Set<GlobalStreamId> getThisSourceIds() {
Copy link
Contributor Author

@windie windie Aug 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the correct way.

Right now Heron doesn't have the Grouping class, so I created the method to get all source streams. It's used by WindowedBoltExecutor.

In Storm, WindowedBoltExecutor uses getThisSources (The above method commented out)

@windie
Copy link
Contributor Author

windie commented Aug 15, 2016

@kramasamy - updated. Please take a look.

@windie
Copy link
Contributor Author

windie commented Aug 16, 2016

I fixed the failure test in #1263

@kramasamy
Copy link
Contributor

@nlu90 @objmagic - can you co-review the PR and discuss the design?

@objmagic
Copy link
Contributor

@kramasamy will do when we have time...

@kramasamy
Copy link
Contributor

@maosongfu - can you review this?

@kramasamy
Copy link
Contributor

@windie - can you please add some detailed documentation for the types of windows supported? We want to verify and close this as soon as possible.

@kramasamy
Copy link
Contributor

@windie - how can we write unit tests for this?

@mycFelix
Copy link
Contributor

mycFelix commented Oct 7, 2016

@windie - Hi, so great to add Window Bolt support!
I checkouted PR and installed heron-storm on my local env. Then I run test-topology named SlidingWindowTopology given by Apache-Storm. The only thing I have changed in SlidingWindowTopology.java is following

 Config conf = new Config();
        conf.setDebug(true);
//        if (args != null && args.length > 0) {
//            conf.setNumWorkers(1);
//            StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
//        } else {
//            LocalCluster cluster = new LocalCluster();
//            cluster.submitTopology("test", conf, builder.createTopology());
//            Utils.sleep(40000);
//            cluster.killTopology("test");
//            cluster.shutdown();
//        }

 StormSubmitter.submitTopology("test", conf, builder.createTopology());

I failed to run test-topology. Would you please take a look? And The error log is following:

[2016-10-07 20:19:04 +0800] com.twitter.heron.instance.HeronInstance ERROR:  Exception caught in thread: SlaveThread with id: 13 
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
    at org.apache.storm.topology.WindowedBoltExecutor.initWindowManager(WindowedBoltExecutor.java:153)
    at org.apache.storm.topology.WindowedBoltExecutor.prepare(WindowedBoltExecutor.java:267)
    at org.apache.storm.topology.IRichBoltDelegate.prepare(IRichBoltDelegate.java:48)
    at com.twitter.heron.instance.bolt.BoltInstance.start(BoltInstance.java:110)
    at com.twitter.heron.instance.Slave.startInstance(Slave.java:173)
    at com.twitter.heron.instance.Slave.handleNewAssignment(Slave.java:159)
    at com.twitter.heron.instance.Slave.access$200(Slave.java:40)
    at com.twitter.heron.instance.Slave$1.run(Slave.java:85)
    at com.twitter.heron.common.basics.WakeableLooper.executeTasksOnWakeup(WakeableLooper.java:142)
    at com.twitter.heron.common.basics.WakeableLooper.runOnce(WakeableLooper.java:74)
    at com.twitter.heron.common.basics.WakeableLooper.loop(WakeableLooper.java:64)
    at com.twitter.heron.instance.Slave.run(Slave.java:169)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I have a minor comment though. I think it will be great if we can add unit tests,example topology and doc for this.


private Set<GlobalStreamId> getComponentStreams(TopologyContext context) {
Set<GlobalStreamId> streams = new HashSet<>();
for (GlobalStreamId streamId : context.getThisSourceIds()) {
Copy link
Contributor

@mycFelix mycFelix Oct 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we should check streamId as what storm has done here. I'm afraid it will cause another problem at line. What do you think?

my bad, delete comment

@windie
Copy link
Contributor Author

windie commented Oct 8, 2016

Thanks for reviewing. I will add examples, docs and unit tests at the weekend.

@mycFelix
Copy link
Contributor

@windie - hi, any update on this?

@kramasamy
Copy link
Contributor

@windie - any update on this?

@windie
Copy link
Contributor Author

windie commented Oct 25, 2016

Sorry for the delay. Just added examples and unit tests to this PR.

Right now this PR supports Storm's IWindowedBolt and Watermark but not IStatefulWindowedBolt.

  • IWindowedBolt example: com.twitter.heron.examples.SlidingWindowTopology.
  • Watermark example: com.twitter.heron.examples.SlidingTupleTsTopology.

@windie
Copy link
Contributor Author

windie commented Oct 25, 2016

[2016-10-07 20:19:04 +0800] com.twitter.heron.instance.HeronInstance ERROR: Exception caught in thread: SlaveThread with id: 13
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number

@mycFelix fixed this error in my PR. You can just run com.twitter.heron.examples.SlidingWindowTopology and com.twitter.heron.examples.SlidingTupleTsTopology in this PR.

@mycFelix
Copy link
Contributor

mycFelix commented Oct 25, 2016

@windie - Great! I'll check it ASAP. BTW, free to ping me at SLACK group anytime.

Fix the style erorrs
* Emits a random integer and a timestamp value (offset by one day),
* every 100 ms. The ts field can be used in tuple time based windowing.
*/
@SuppressWarnings("serial")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of suppressing this here and elsewhere, can you add a serialVersionId?

@billonahill
Copy link
Contributor

If the storm code is mostly copied could you add a README.md file below heron/storm/src that explains what storm version we're using and how to diff against the storm repo to pull in future changes?

@mycFelix
Copy link
Contributor

mycFelix commented Oct 26, 2016

The bug mentioned before is fixed. 👍

@windie
Copy link
Contributor Author

windie commented Oct 27, 2016

What's the convention for codes copied from Storm? Most of codes are changed because of the style errors and different log codes, it's impossible to apply any patches from storm repo directly.

@maosongfu
Copy link
Contributor

@kramasamy @nlu90 @windie I will propose to put all of these on top of heron in another repo or in a folder exclude style checking.

@windie
Copy link
Contributor Author

windie commented Oct 27, 2016

How about just excluding style checking for //heron/storm?

@billonahill
Copy link
Contributor

+1 for excluding style checks for /heron/storm and syncing source code without style changes. This will make it easier to diff/patch withe the storm repo.

@kramasamy
Copy link
Contributor

ok sounds good. Can we do separate PR for excluding style checking?

@maosongfu
Copy link
Contributor

maosongfu commented Oct 27, 2016

@windie @kramasamy @billonahill It may not be a good idea to exclude everything in /heron/storm since there are still codes owned by us in this folder, especially for the parts converting storm into heron. Is it possible in the separate PR that we split this folder into two: one is for codes owned by us enforcing style checks, while another is for codes from outside excluding style checks.

@kramasamy
Copy link
Contributor

@maosongfu - any idea which directories under heron/storm is written by us? I can take care of separating it out.

@maosongfu
Copy link
Contributor

@kramasamy
Currently these are written by us:
heron/heron/storm/src/java/

  • backtype/storm
  • org/apache/storm

They are converting storm job into heron job.

@mycFelix
Copy link
Contributor

Hi,guys.
BTW, how do we deal with //contrib folder if it was copied from Apache-storm.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants