Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-38692] Add extra logging to help diagnosing Thread spike. #116

Merged
merged 2 commits into from
Oct 25, 2016

Conversation

jtnord
Copy link
Member

@jtnord jtnord commented Oct 6, 2016

JENKINS-38692 Add extra logging statements to try and get to the root cause
of the IOHub thread spike.

@reviewbybees esp @stephenc

Add extra logging statements to try and get to the root cause
of the IOHub thread spike.
@jtnord
Copy link
Member Author

jtnord commented Oct 6, 2016

pending running through the stress test benchmark to check for performance regressions.

@jtnord
Copy link
Member Author

jtnord commented Oct 6, 2016

Stress test results
Fix uses more RAM, but has a lower VM load.

@@ -292,6 +292,10 @@ public final void removeInterestConnect(SelectionKey key) {
* @param key the key.
*/
public final void addInterestRead(SelectionKey key) {
if (LOGGER.isLoggable(Level.FINEST)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you were going to pull this out to a class level constant initialized on classloading so that hotspot can reduce to no-op

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't have added the if here. It's adding verbosity for no reason in that case. I suppose you added that because of the TODO below...

Copy link
Member Author

@jtnord jtnord Oct 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephenc so the performance without inlining didn't seem to bad - and if we leave it as is then you can use the standard Jenkins UI (or JMX for agents) to change the logging level

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you added that because of the TODO below...

yes - easier to do it now and know the performance hit - also prevents someone forgetting to do it later.

@stephenc
Copy link
Member

stephenc commented Oct 6, 2016

Single measurement results

agent-load-test --protocol JNLP4-plaintext --warmup 60 --collect 180 on OS-X, java version "1.8.0_102"

CPU load before: 1.53
CPU load after: 1.49

Avg Memory before: 195592±43263
Avg Memory after: 190434±41279

No significant difference observed... I'll need to collect a few more runs to get some measure on CPU load effects

@oleg-nenashev
Copy link
Member

I think this PR is fine. More diagnostics rarely hurts, and additional condition check should not cause a significant performance degradation. So 👍 form me. BTW, it's going into remoting 3.1+ IMHO

@amuniz
Copy link
Member

amuniz commented Oct 10, 2016

👍

@oleg-nenashev
Copy link
Member

Retriggering

Copy link
Member

@batmat batmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐝

@@ -292,6 +292,10 @@ public final void removeInterestConnect(SelectionKey key) {
* @param key the key.
*/
public final void addInterestRead(SelectionKey key) {
if (LOGGER.isLoggable(Level.FINEST)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't have added the if here. It's adding verbosity for no reason in that case. I suppose you added that because of the TODO below...

@@ -480,7 +496,11 @@ public final void run() {
* Process the scheduled tasks list.
*/
private void processScheduledTasks() {
if (scheduledTasks.size() > 4) {
final int tasksWaiting = scheduledTasks.size();
if (LOGGER.isLoggable(Level.FINEST)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test saves auto-boxing and GC churn

@stephenc
Copy link
Member

@batmat those if statements are critical for performance as this is hot code and they avoid constructing the new Object[]

@batmat
Copy link
Member

batmat commented Oct 15, 2016

@stephenc my comment wasn't blindly targeting all ifs. I'm aware of what you say. The ones I commented on weren't constructing objects. Most here actually currently only pass references and don't construct anything.

@stephenc
Copy link
Member

Still having to set up the stack frame to pass through... the testing I did on a few of these showed a negative impact where the if was removed... there was negligible difference vs where I moved the LOGGER.isLoggable(Level.FINEST) to a class level final boolean though

@jtnord
Copy link
Member Author

jtnord commented Oct 24, 2016

@reviewbybees done - @oleg-nenashev over to you.

@oleg-nenashev
Copy link
Member

is it still WiP? I doubt

@oleg-nenashev oleg-nenashev merged commit 46c72a5 into jenkinsci:master Oct 25, 2016
oleg-nenashev added a commit to oleg-nenashev/jenkins that referenced this pull request Nov 9, 2016
The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements.

Bugfixes:

* [JENKINS-39596](https://issues.jenkins-ci.org/browse/JENKINS-39596) -
Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`.
It was causing connection failures of Jenkins JNLP agents when using Java Web Start.
([PR jenkinsci#131](jenkinsci/remoting#131))
* [JENKINS-39617](https://issues.jenkins-ci.org/browse/JENKINS-39617) -
`hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed.
([PR jenkinsci#131](jenkinsci/remoting#131))

Improvements:

* [JENKINS-39150](https://issues.jenkins-ci.org/browse/JENKINS-39150) -
Add logic for dumping diagnostics across all the channels.
([PR jenkinsci#122](jenkinsci/remoting#122), [PR jenkinsci#125](jenkinsci/remoting#125))
* [JENKINS-39543](https://issues.jenkins-ci.org/browse/JENKINS-39543) -
Improve the caller/callee correlation diagnostics in thread dumps.
([PR jenkinsci#119](jenkinsci/remoting#119))
* [JENKINS-39290](https://issues.jenkins-ci.org/browse/JENKINS-39290) -
Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes).
([PR jenkinsci#123](jenkinsci/remoting#123))
* [JENKINS-38692](https://issues.jenkins-ci.org/browse/JENKINS-38692) -
Add extra logging to help diagnosing `IOHub` Thread spikes.
([PR #116](jenkinsci/remoting#116))
* [JENKINS-39289](https://issues.jenkins-ci.org/browse/JENKINS-39289) -
 When a proxy fails, report what caused the channel to go down.
([PR jenkinsci#128](jenkinsci/remoting#128))
oleg-nenashev added a commit to jenkinsci/jenkins that referenced this pull request Nov 10, 2016
The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements.

Bugfixes:

* [JENKINS-39596](https://issues.jenkins-ci.org/browse/JENKINS-39596) -
Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`.
It was causing connection failures of Jenkins JNLP agents when using Java Web Start.
([PR #131](jenkinsci/remoting#131))
* [JENKINS-39617](https://issues.jenkins-ci.org/browse/JENKINS-39617) -
`hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed.
([PR #131](jenkinsci/remoting#131))

Improvements:

* [JENKINS-39150](https://issues.jenkins-ci.org/browse/JENKINS-39150) -
Add logic for dumping diagnostics across all the channels.
([PR #122](jenkinsci/remoting#122), [PR #125](jenkinsci/remoting#125))
* [JENKINS-39543](https://issues.jenkins-ci.org/browse/JENKINS-39543) -
Improve the caller/callee correlation diagnostics in thread dumps.
([PR #119](jenkinsci/remoting#119))
* [JENKINS-39290](https://issues.jenkins-ci.org/browse/JENKINS-39290) -
Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes).
([PR #123](jenkinsci/remoting#123))
* [JENKINS-38692](https://issues.jenkins-ci.org/browse/JENKINS-38692) -
Add extra logging to help diagnosing `IOHub` Thread spikes.
([PR #116](jenkinsci/remoting#116))
* [JENKINS-39289](https://issues.jenkins-ci.org/browse/JENKINS-39289) -
 When a proxy fails, report what caused the channel to go down.
([PR #128](jenkinsci/remoting#128))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants