Skip to content

Conversation

@ooasis
Copy link

@ooasis ooasis commented Jul 26, 2016

This is to add the feature to 1.x-branch. The original PR for master branch is #728.

@ooasis
Copy link
Author

ooasis commented Jul 31, 2016

Performance test results are attached here.
perf_compare_netty_3vs4_1.x-branch.zip

Baseline: current head of 1.x-branch (e55684b)
Netty4: 57fbccc

In addition to default configuration,
io.netty.noPreferDirect=false
io.netty.allocator.type=pooled

following configurations are also tested for netty 4.x

-unpool
io.netty.allocator.type=unpooled. Ask Netty to not use pooled allocation for buffer

-nodirect
io.netty.noPreferDirect=true. Ask Netty to not use off-heap buffer allocation

-unpool-nodirect
io.netty.noPreferDirect=true
io.netty.allocator.type=unpooled

Due to the capacity of my laptop, I tested following loads: 2k, 5k, 10k. At 10k, the CPU usage reaches above 85% and failures started to appear at startup.

The performance data seems to suggest

  • no big difference is observed comparing before/after netty upgrades
    I suspect only a small percentage of traffic will cross worker instances (jvm) so Netty does not play a big role in performance measured
  • netty 4.x consumes a little more CPU and memory
    The small memory increment may be simply b/c netty 4.x pulled in more classes.
  • netty 4.x has lower CPU consumption on GC
    This is especially visible with the default configuration where off-heap buffer allocation is used.

@ooasis
Copy link
Author

ooasis commented Jul 31, 2016

Seems the Travis CI build will fail at random places which are not related to this PR.

For example the last build failed due to

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.2.1:process (default) on project flux-examples: Error resolving project artifact: Could not transfer artifact io.confluent:kafka-schema-registry-client:pom:1.0 from/to sonatype-apache (https://repository.apache.org/releases/): Connect to repository.apache.org:443 [repository.apache.org/207.244.88.143] failed: Connection timed out for project io.confluent:kafka-schema-registry-client:jar:1.0 -> [Help 1]

Any way to manually trigger a new CI build w/o code change?

-thanks

@HeartSaVioR
Copy link
Contributor

You can close and reopen this PR to retrigger. Btw, apache repository seems to be unstable recent days. I saw other build failures due to repository connectivity issue.

byte[] initialChallenge = saslNettyClient.saslResponse(new SaslMessageToken(new byte[0]));
LOG.debug("Sending initial challenge: {}", initialChallenge);
channel.write(new SaslMessageToken(initialChallenge));
channel.writeAndFlush(new SaslMessageToken(initialChallenge));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it changed to use writeAndFlush?

@satishd
Copy link
Member

satishd commented Aug 2, 2016

@hsun-cnnxty Channel#write is modified to Channel#writeAndFlush at multiple places. Does not that flush to the underlying stream without buffering and flush it when buffers are full?

@ooasis
Copy link
Author

ooasis commented Aug 3, 2016

@satishd nett 4.x has removed auto-flush in write() and it requires user to explicitly call flush() or writeAndFlush(), so we need to flush for the last message in the logic flow. Fortunately, the storm netty client does batching of messages itself and it does not flush for every message.

@ooasis ooasis closed this Aug 12, 2016
@ooasis ooasis reopened this Aug 12, 2016
@ooasis ooasis closed this Aug 18, 2016
@ooasis ooasis reopened this Aug 18, 2016
@isPositiveNumber
public static final String STORM_MESSAGING_NETTY_BUFFER_SIZE = "storm.messaging.netty.buffer_size";

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These config descriptions are not descriptive. Without reading the code, I have no idea what this does.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look at how to improve. Any suggestions?

-thanks

@HeartSaVioR
Copy link
Contributor

Since Netty made 3.x version line EOL, I'd like to bump Netty to 4.0.x or even 4.1.x if there's no performance / resource usage issue.

@hsun-cnnxty
Please upmerge this. If you really don't mind, could you do performance tests again on two latest versions: 4.0.41.Final and 4.1.5.Final, too?

@ooasis
Copy link
Author

ooasis commented Sep 23, 2016

I will find some time this weekend to work on it.

-thanks

@ooasis ooasis closed this Sep 24, 2016
@ooasis ooasis reopened this Sep 24, 2016
@ooasis ooasis closed this Sep 30, 2016
@ooasis ooasis reopened this Sep 30, 2016
@HeartSaVioR
Copy link
Contributor

Please rebase and force push accordingly. Your PR now has all of the commits from other side.

You can give it an another try: 1) create new branch based on recent master or 1.x 2) cherry-pick your commits 3) checkout to your PR branch 4) reset your PR branch to the branch which you just worked 4) force push.

@ooasis
Copy link
Author

ooasis commented Sep 30, 2016

@HeartSaVioR
I rebased to 1.x-branch and also squashed the commits. Will attached the performance test results soon.

@ooasis
Copy link
Author

ooasis commented Sep 30, 2016

Tests are done to compare following two code bases:

  • baseline: current head of 1.x-branch (62476f5)
  • netty 4.1.x: 1.x-branch-netty4 (dd96d9f)

In summary:

  • netty 4.1.x is able to sustain higher throughput (20k/sec) than baseline (between 15k -20k/sec)
  • netty 4.1.x has lower GC activity which could be the reason that it has higher throughput**

Also tested different netty configurations regarding buffer allocation

  • use direct buffer or heap
  • use pooled or unpooled

Result shows the default setting works best.

Performance tests are done on my laptop with following config:

Model Name: MacBook Pro
Processor Name: Intel Core i7
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 6 MB
Memory: 16 GB
System Version: OS X 10.10.5 (14F27)
Kernel Version: Darwin 14.5.0

The data collected is attached.
performance_compare_netty_1.x-branch.zip

@HeartSaVioR
Copy link
Contributor

@hsun-cnnxty
Sorry I lost tracking this.
From the last time I tested on my local machine (via ThroughputVsLatency), before patching this works better than after patching this. My test environment seems not stable.
Someone might need to test on this, and if he/she can run benchmark with multi machines that should be great.

d2r pushed a commit to d2r/storm that referenced this pull request Oct 16, 2018
We are closing stale Pull Requests to make the list more manageable.

Please re-open any Pull Request that has been closed in error.

Closes apache#608
Closes apache#639
Closes apache#640
Closes apache#648
Closes apache#662
Closes apache#668
Closes apache#692
Closes apache#705
Closes apache#724
Closes apache#728
Closes apache#730
Closes apache#753
Closes apache#803
Closes apache#854
Closes apache#922
Closes apache#986
Closes apache#992
Closes apache#1019
Closes apache#1040
Closes apache#1041
Closes apache#1043
Closes apache#1046
Closes apache#1051
Closes apache#1078
Closes apache#1146
Closes apache#1164
Closes apache#1165
Closes apache#1178
Closes apache#1213
Closes apache#1225
Closes apache#1258
Closes apache#1259
Closes apache#1268
Closes apache#1272
Closes apache#1277
Closes apache#1278
Closes apache#1288
Closes apache#1296
Closes apache#1328
Closes apache#1342
Closes apache#1353
Closes apache#1370
Closes apache#1376
Closes apache#1391
Closes apache#1395
Closes apache#1399
Closes apache#1406
Closes apache#1410
Closes apache#1422
Closes apache#1427
Closes apache#1443
Closes apache#1462
Closes apache#1468
Closes apache#1483
Closes apache#1506
Closes apache#1509
Closes apache#1515
Closes apache#1520
Closes apache#1521
Closes apache#1525
Closes apache#1527
Closes apache#1544
Closes apache#1550
Closes apache#1566
Closes apache#1569
Closes apache#1570
Closes apache#1575
Closes apache#1580
Closes apache#1584
Closes apache#1591
Closes apache#1600
Closes apache#1611
Closes apache#1613
Closes apache#1639
Closes apache#1703
Closes apache#1711
Closes apache#1719
Closes apache#1737
Closes apache#1760
Closes apache#1767
Closes apache#1768
Closes apache#1785
Closes apache#1799
Closes apache#1822
Closes apache#1824
Closes apache#1844
Closes apache#1874
Closes apache#1918
Closes apache#1928
Closes apache#1937
Closes apache#1942
Closes apache#1951
Closes apache#1957
Closes apache#1963
Closes apache#1964
Closes apache#1965
Closes apache#1967
Closes apache#1968
Closes apache#1971
Closes apache#1985
Closes apache#1986
Closes apache#1998
Closes apache#2031
Closes apache#2032
Closes apache#2071
Closes apache#2076
Closes apache#2108
Closes apache#2119
Closes apache#2128
Closes apache#2142
Closes apache#2174
Closes apache#2206
Closes apache#2297
Closes apache#2322
Closes apache#2332
Closes apache#2341
Closes apache#2377
Closes apache#2414
Closes apache#2469
d2r pushed a commit to d2r/storm that referenced this pull request Oct 16, 2018
We are closing stale Pull Requests to make the list more manageable.

Please re-open any Pull Request that has been closed in error.

Closes apache#608
Closes apache#639
Closes apache#640
Closes apache#648
Closes apache#662
Closes apache#668
Closes apache#692
Closes apache#705
Closes apache#724
Closes apache#728
Closes apache#730
Closes apache#753
Closes apache#803
Closes apache#854
Closes apache#922
Closes apache#986
Closes apache#992
Closes apache#1019
Closes apache#1040
Closes apache#1041
Closes apache#1043
Closes apache#1046
Closes apache#1051
Closes apache#1078
Closes apache#1146
Closes apache#1164
Closes apache#1165
Closes apache#1178
Closes apache#1213
Closes apache#1225
Closes apache#1258
Closes apache#1259
Closes apache#1268
Closes apache#1272
Closes apache#1277
Closes apache#1278
Closes apache#1288
Closes apache#1296
Closes apache#1328
Closes apache#1342
Closes apache#1353
Closes apache#1370
Closes apache#1376
Closes apache#1391
Closes apache#1395
Closes apache#1399
Closes apache#1406
Closes apache#1410
Closes apache#1422
Closes apache#1427
Closes apache#1443
Closes apache#1462
Closes apache#1468
Closes apache#1483
Closes apache#1506
Closes apache#1509
Closes apache#1515
Closes apache#1520
Closes apache#1521
Closes apache#1525
Closes apache#1527
Closes apache#1544
Closes apache#1550
Closes apache#1566
Closes apache#1569
Closes apache#1570
Closes apache#1575
Closes apache#1580
Closes apache#1584
Closes apache#1591
Closes apache#1600
Closes apache#1611
Closes apache#1613
Closes apache#1639
Closes apache#1703
Closes apache#1711
Closes apache#1719
Closes apache#1737
Closes apache#1760
Closes apache#1767
Closes apache#1768
Closes apache#1785
Closes apache#1799
Closes apache#1822
Closes apache#1824
Closes apache#1844
Closes apache#1874
Closes apache#1918
Closes apache#1928
Closes apache#1937
Closes apache#1942
Closes apache#1951
Closes apache#1957
Closes apache#1963
Closes apache#1964
Closes apache#1965
Closes apache#1967
Closes apache#1968
Closes apache#1971
Closes apache#1985
Closes apache#1986
Closes apache#1998
Closes apache#2031
Closes apache#2032
Closes apache#2071
Closes apache#2076
Closes apache#2108
Closes apache#2119
Closes apache#2128
Closes apache#2142
Closes apache#2174
Closes apache#2206
Closes apache#2297
Closes apache#2322
Closes apache#2332
Closes apache#2341
Closes apache#2377
Closes apache#2414
Closes apache#2469
@asfgit asfgit closed this in #2880 Oct 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants