Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update process stats #12043

Merged
merged 1 commit into from
Jul 8, 2015
Merged

Update process stats #12043

merged 1 commit into from
Jul 8, 2015

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Jul 6, 2015

This pull request:

  • removes the sigar-specific process stats: process.mem.resident, process.mem.share
  • add default implementation based on JMX beans for all other process stats
  • make ProcessProbe a singleton so that it can be loaded before security manager
  • move some process fields from Nodes Info to Nodes Stats API (max file descriptors, mlockall)
  • add documentation for process stats & process info

@tlrx
Copy link
Member Author

tlrx commented Jul 6, 2015

I'm updating the Nodes Stats and Nodes Info API piece by piece... Here is the process part.

@kimchy @rmuir @clintongormley Can you have a look please? thanks.

@tlrx tlrx mentioned this pull request Jul 7, 2015
} catch (Exception e) {
// not available
}
getMaxFileDescriptorCountField = method;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment from my OS level review, extract it to a method to make sure we don't leak the method down the code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@kimchy
Copy link
Member

kimchy commented Jul 7, 2015

left a couple of notes, more thoughts:

  • I think we should keep the mlockall flag in ProcessInfo.
  • Only have getters in Info/Stats
  • I suggest we skip getting the user/sys cpu time for the process using threads, I am not terribly sure about it, we can alway add it afterwards?

@@ -143,14 +143,22 @@ public boolean handle(int code) {
StringHelper.randomId();
}

public static void initializeProbes() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be package private? I think its not needed except by bootstrap and bootstrap for testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@tlrx
Copy link
Member Author

tlrx commented Jul 8, 2015

@kimchy @rmuir I updated the code with your comments. Can you please look again? thanks.

I suggest we skip getting the user/sys cpu time for the process using threads, I am not terribly sure about it, we can alway add it afterwards?

I removed process cpu user/sys time. I also added a ProcessProbeBenchmark class which produces the following results for 100 000 iterations:

testing 'getOpenFileDescriptorCount' method...
total [1347] ms, avg [0.01347] ms

testing 'getMaxFileDescriptorCount' method...
total [12] ms, avg [1.2E-4] ms

testing 'getTotalVirtualMemorySize' method...
total [890] ms, avg [0.0089] ms

testing 'getProcessCpuPercent' method...
total [3955] ms, avg [0.03955] ms

testing 'getProcessCpuTotalTime' method...
total [44] ms, avg [4.4E-4] ms

calculating process CPU user time with 'getAllThreadIds + getThreadUserTime' methods...
execution time [total: 2880 ms, avg: 0.0288 ms] for 100000 iterations with average result of 2.5158615E9

calculating process CPU user time with 'getAllThreadIds + getThreadUserTime(long[])' methods...
execution time [total: 2763 ms, avg: 0.02763 ms] for 100000 iterations with average result of 3.4250589E9

Calculating user cpu time for the process using threads takes ~0.03 ms. The last benchmark test uses com.sun.management.ThreadMXBean to retrieve thread user times for all threads at once.

Method method = osMxBean.getClass().getDeclaredMethod(methodName);
method.setAccessible(true);
return method;
} catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we catch Throwable here and in the rest of the code in this class?

@kimchy
Copy link
Member

kimchy commented Jul 8, 2015

left a minor comment, LGTM otherwise

@tlrx tlrx force-pushed the update-process-stats branch from 0f4b55d to 1c5d8ef Compare July 8, 2015 13:13
@tlrx tlrx merged commit 1c5d8ef into elastic:master Jul 8, 2015
@kevinkluge kevinkluge removed the review label Jul 8, 2015
@tlrx
Copy link
Member Author

tlrx commented Jul 8, 2015

@kimchy thanks for the review! I merged with Throwable catch and updated documentation.

@rmuir you were interested in the time the probe takes on Linux: I added the ProcessProbeBenchmark class, feel free to comment here if needed.

@tlrx tlrx deleted the update-process-stats branch July 8, 2015 13:17
elapsed, (elapsed / (double)ITERATIONS), ITERATIONS, (sum / (double)ITERATIONS));

/* Commented as com.sun.management is listed as forbidden usage
if (threadMxBean instanceof com.sun.management.ThreadMXBean) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the SuppressForbidden annotation on top of the class to fix this.

@rmuir
Copy link
Contributor

rmuir commented Jul 8, 2015

Thanks for investigating the performance! I think the benchmark is useful because in the past, there has been issues with stats apis being on the slow side.

@kimchy
Copy link
Member

kimchy commented Jul 8, 2015

++, it helps us decide if we want to add it down the road

@tlrx tlrx mentioned this pull request Jul 8, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants