Skip to content

[REF,FIX] Revision of the resource monitor #2285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Nov 29, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
e50226e
[FIX] Do not break when building profiling summary
oesteban Nov 14, 2017
d7a2d16
enable reuse of resource_monitor.json, write it in base_dir
oesteban Nov 15, 2017
58b4905
Merge branch 'fix/resource-monitor-file' into fix/resource-monitor-re…
oesteban Nov 15, 2017
75cb366
Merge remote-tracking branch 'upstream/master' into fix/resource-moni…
oesteban Nov 15, 2017
b72f9aa
allow disable resource_monitor appending with nipype option
oesteban Nov 15, 2017
cd46847
allow disable resource_monitor appending with nipype option (amend to…
oesteban Nov 15, 2017
8e79259
improving documentation of new config entry
oesteban Nov 15, 2017
afa63dc
add new [monitoring] section to nipype config
oesteban Nov 15, 2017
0100135
read new monitoring.summary_file option
oesteban Nov 15, 2017
b159b01
Merge remote-tracking branch 'upstream/master' into fix/resource-moni…
oesteban Nov 15, 2017
612ed03
fix tests
oesteban Nov 15, 2017
be7f6b4
[skip ci] clean header up, fix two pep8 warnings
oesteban Nov 15, 2017
425996e
Merge remote-tracking branch 'upstream/master' into fix/resource-moni…
oesteban Nov 21, 2017
87d1c7a
update CHANGES
oesteban Nov 21, 2017
97e186e
[skip ci] Merge remote-tracking branch 'upstream/master' into fix/res…
oesteban Nov 26, 2017
f90dfe2
[skip ci] Merge remote-tracking branch 'upstream/master' into fix/res…
oesteban Nov 27, 2017
7220ab3
[skip ci] Merge branch 'fix/resource-monitor-revisions' of github.com…
oesteban Nov 27, 2017
8d59408
fix indentation
oesteban Nov 28, 2017
404a1c9
Merge branch 'master' into fix/resource-monitor-revisions
oesteban Nov 29, 2017
a762317
[skip ci] Merge remote-tracking branch 'upstream/master' into fix/res…
oesteban Nov 29, 2017
ccb5403
[skip ci] Merge branch 'fix/resource-monitor-revisions' of github.com…
oesteban Nov 29, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGES
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ Upcoming release

###### [Full changelog](https://github.com/nipy/nipype/milestone/13)

* FIX+MAINT: Revision of the resource monitor (https://github.com/nipy/nipype/pull/2285)
* FIX: MultiProc mishandling crashes (https://github.com/nipy/nipype/pull/2301)
* MAINT: Revise use of `subprocess.Popen` (https://github.com/nipy/nipype/pull/2289)
* ENH: Memorize version checks (https://github.com/nipy/nipype/pull/2274, https://github.com/nipy/nipype/pull/2295)
Expand Down
29 changes: 24 additions & 5 deletions doc/users/config_file.rst
Original file line number Diff line number Diff line change
Expand Up @@ -153,14 +153,29 @@ Execution
crashfiles allow portability across machines and shorter load time.
(possible values: ``pklz`` and ``txt``; default value: ``pklz``)

*resource_monitor*

Resource Monitor
~~~~~~~~~~~~~~~~

*enabled*
Enables monitoring the resources occupation (possible values: ``true`` and
``false``; default value: ``false``)
``false``; default value: ``false``). All the following options will be
dismissed if the resource monitor is not enabled.

*resource_monitor_frequency*
*sample_frequency*
Sampling period (in seconds) between measurements of resources (memory, cpus)
being used by an interface. Requires ``resource_monitor`` to be ``true``.
(default value: ``1``)
being used by an interface (default value: ``1``)

*summary_file*
Indicates where the summary file collecting all profiling information from the
resource monitor should be stored after execution of a workflow.
The ``summary_file`` does not apply to interfaces run independently.
(unset by default, in which case the summary file will be written out to
``<base_dir>/resource_monitor.json`` of the top-level workflow).

*summary_append*
Append to an existing summary file (only applies to workflows).
(default value: ``true``, possible values: ``true`` or ``false``).

Example
~~~~~~~
Expand All @@ -175,6 +190,10 @@ Example
hash_method = timestamp
display_variable = :1

[monitoring]
enabled = false


Workflow.config property has a form of a nested dictionary reflecting the
structure of the .cfg file.

Expand Down
5 changes: 3 additions & 2 deletions docker/files/run_examples.sh
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,9 @@ echo '[execution]' >> ${HOME}/.nipype/nipype.cfg
echo 'crashfile_format = txt' >> ${HOME}/.nipype/nipype.cfg

if [[ "${NIPYPE_RESOURCE_MONITOR:-0}" == "1" ]]; then
echo 'resource_monitor = true' >> ${HOME}/.nipype/nipype.cfg
echo 'resource_monitor_frequency = 3' >> ${HOME}/.nipype/nipype.cfg
echo '[monitoring]' >> ${HOME}/.nipype/nipype.cfg
echo 'enabled = true' >> ${HOME}/.nipype/nipype.cfg
echo 'sample_frequency = 3' >> ${HOME}/.nipype/nipype.cfg
fi

# Set up coverage
Expand Down
4 changes: 2 additions & 2 deletions nipype/interfaces/tests/test_resource_monitor.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ def test_cmdline_profiling(tmpdir, mem_gb, n_procs):
of a CommandLine-derived interface
"""
from nipype import config
config.set('execution', 'resource_monitor_frequency', '0.2') # Force sampling fast
config.set('monitoring', 'sample_frequency', '0.2') # Force sampling fast

tmpdir.chdir()
iface = UseResources(mem_gb=mem_gb, n_procs=n_procs)
Expand All @@ -72,7 +72,7 @@ def test_function_profiling(tmpdir, mem_gb, n_procs):
of a Function interface
"""
from nipype import config
config.set('execution', 'resource_monitor_frequency', '0.2') # Force sampling fast
config.set('monitoring', 'sample_frequency', '0.2') # Force sampling fast

tmpdir.chdir()
iface = niu.Function(function=_use_resources)
Expand Down
28 changes: 25 additions & 3 deletions nipype/pipeline/engine/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -1298,15 +1298,24 @@ def write_workflow_prov(graph, filename=None, format='all'):
return ps.g


def write_workflow_resources(graph, filename=None):
def write_workflow_resources(graph, filename=None, append=None):
"""
Generate a JSON file with profiling traces that can be loaded
in a pandas DataFrame or processed with JavaScript like D3.js
"""
import simplejson as json

# Overwrite filename if nipype config is set
filename = config.get('monitoring', 'summary_file', filename)

# If filename still does not make sense, store in $PWD
if not filename:
filename = os.path.join(os.getcwd(), 'resource_monitor.json')

if append is None:
append = str2bool(config.get(
'monitoring', 'summary_append', 'true'))

big_dict = {
'time': [],
'name': [],
Expand All @@ -1317,14 +1326,21 @@ def write_workflow_resources(graph, filename=None):
'params': [],
}

# If file exists, just append new profile information
# If we append different runs, then we will see different
# "bursts" of timestamps corresponding to those executions.
if append and os.path.isfile(filename):
with open(filename, 'r' if PY3 else 'rb') as rsf:
big_dict = json.load(rsf)

for idx, node in enumerate(graph.nodes()):
nodename = node.fullname
classname = node._interface.__class__.__name__

params = ''
if node.parameterization:
params = '_'.join(['{}'.format(p)
for p in node.parameterization])
for p in node.parameterization])

try:
rt_list = node.result.runtime
Expand All @@ -1337,7 +1353,13 @@ def write_workflow_resources(graph, filename=None):
rt_list = [rt_list]

for subidx, runtime in enumerate(rt_list):
nsamples = len(runtime.prof_dict['time'])
try:
nsamples = len(runtime.prof_dict['time'])
except AttributeError:
logger.warning(
'Could not retrieve profiling information for node "%s" '
'(mapflow %d/%d).', nodename, subidx + 1, len(rt_list))
continue

for key in ['time', 'mem_gb', 'cpus']:
big_dict[key] += runtime.prof_dict[key]
Expand Down
6 changes: 5 additions & 1 deletion nipype/pipeline/engine/workflows.py
Original file line number Diff line number Diff line change
Expand Up @@ -598,7 +598,11 @@ def run(self, plugin=None, plugin_args=None, updatehash=False):
write_workflow_prov(execgraph, prov_base, format='all')

if config.resource_monitor:
write_workflow_resources(execgraph)
base_dir = self.base_dir or os.getcwd()
write_workflow_resources(
execgraph,
filename=op.join(base_dir, self.name, 'resource_monitor.json')
)
return execgraph

# PRIVATE API AND FUNCTIONS
Expand Down
29 changes: 16 additions & 13 deletions nipype/utils/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@


CONFIG_DEPRECATIONS = {
'profile_runtime': ('resource_monitor', '1.0'),
'filemanip_level': ('utils_level', '1.0'),
'profile_runtime': ('monitoring.enabled', '1.0'),
'filemanip_level': ('logging.utils_level', '1.0'),
}

NUMPY_MMAP = LooseVersion(np.__version__) >= LooseVersion('1.12.0')
Expand Down Expand Up @@ -71,8 +71,11 @@
parameterize_dirs = true
poll_sleep_duration = 2
xvfb_max_wait = 10
resource_monitor = false
resource_monitor_frequency = 1

[monitoring]
enabled = false
sample_frequency = 1
summary_append = true

[check]
interval = 1209600
Expand Down Expand Up @@ -105,12 +108,12 @@ def __init__(self, *args, **kwargs):
self._config.read([config_file, 'nipype.cfg'])

for option in CONFIG_DEPRECATIONS:
for section in ['execution', 'logging']:
for section in ['execution', 'logging', 'monitoring']:
if self.has_option(section, option):
new_option = CONFIG_DEPRECATIONS[option][0]
if not self.has_option(section, new_option):
new_section, new_option = CONFIG_DEPRECATIONS[option][0].split('.')
if not self.has_option(new_section, new_option):
# Warn implicit in get
self.set(section, new_option, self.get(section, option))
self.set(new_section, new_option, self.get(section, option))

def set_default_config(self):
self._config.readfp(StringIO(default_cfg))
Expand Down Expand Up @@ -138,7 +141,7 @@ def get(self, section, option, default=None):
'"%s" instead.') % (option, CONFIG_DEPRECATIONS[option][1],
CONFIG_DEPRECATIONS[option][0])
warn(msg)
option = CONFIG_DEPRECATIONS[option][0]
section, option = CONFIG_DEPRECATIONS[option][0].split('.')

if self._config.has_option(section, option):
return self._config.get(section, option)
Expand All @@ -154,7 +157,7 @@ def set(self, section, option, value):
'"%s" instead.') % (option, CONFIG_DEPRECATIONS[option][1],
CONFIG_DEPRECATIONS[option][0])
warn(msg)
option = CONFIG_DEPRECATIONS[option][0]
section, option = CONFIG_DEPRECATIONS[option][0].split('.')

return self._config.set(section, option, value)

Expand Down Expand Up @@ -222,8 +225,8 @@ def resource_monitor(self):
return self._resource_monitor

# Cache config from nipype config
self.resource_monitor = self._config.get(
'execution', 'resource_monitor') or False
self.resource_monitor = str2bool(self._config.get(
'monitoring', 'enabled')) or False
return self._resource_monitor

@resource_monitor.setter
Expand All @@ -248,7 +251,7 @@ def resource_monitor(self, value):
if not self._resource_monitor:
warn('Could not enable the resource monitor: psutil>=5.0'
' could not be imported.')
self._config.set('execution', 'resource_monitor',
self._config.set('monitoring', 'enabled',
('%s' % self._resource_monitor).lower())

def enable_resource_monitor(self):
Expand Down
13 changes: 6 additions & 7 deletions nipype/utils/profiler.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
# -*- coding: utf-8 -*-
# @Author: oesteban
# @Date: 2017-09-21 15:50:37
# @Last Modified by: oesteban
# @Last Modified time: 2017-10-20 09:12:36
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
"""
Utilities to keep track of performance
"""
Expand Down Expand Up @@ -202,8 +200,8 @@ def get_max_resources_used(pid, mem_mb, num_threads, pyfunc=False):
"""

if not resource_monitor:
raise RuntimeError('Attempted to measure resources with '
'"resource_monitor" set off.')
raise RuntimeError('Attempted to measure resources with option '
'"monitoring.enabled" set off.')

try:
mem_mb = max(mem_mb, _get_ram_mb(pid, pyfunc=pyfunc))
Expand Down Expand Up @@ -320,7 +318,8 @@ def _use_cpu(x):
ctr = 0
while ctr < 1e7:
ctr += 1
x*x
x * x


# Spin multiple threads
def _use_resources(n_procs, mem_gb):
Expand Down