Some efficiency savings for pycbc_fit_sngls_over_multiparam #4957
base: master
Conversation
@@ -68,7 +69,6 @@ def smooth_templates(nabove, invalphan, ntotal, template_idx,
         Third float: the smoothed total count in template value

         """
-    if weights is None: weights = numpy.ones_like(template_idx)
This change allows numpy.average to revert to numpy.mean, which saves some cost on this operation.
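For context, a minimal sketch (illustrative values, not from the PR) of why passing weights=None helps: numpy.average only falls back to the cheaper mean computation when no weights are given, while an explicit all-ones weights array forces the full weighted-sum path.

```python
import numpy

values = numpy.arange(1_000_000, dtype=float)

# With weights=None, numpy.average dispatches to the cheaper mean path.
fast = numpy.average(values)

# An explicit all-ones weights array forces the full weighted-sum path,
# even though the result is numerically the same average.
slow = numpy.average(values, weights=numpy.ones_like(values))

assert numpy.isclose(fast, slow)
assert numpy.isclose(fast, values.mean())
```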
n_templates = len(nabove)
rang = numpy.arange(0, n_templates)

nabove_smoothed = numpy.zeros_like(parvals[0])
This preallocation saves some time by writing to indices of the array rather than appending.
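As an illustration of the pattern (hypothetical per-template work, not the PR's actual smoothing), preallocating the output and assigning by index avoids growing a Python list and converting it to an array afterwards:

```python
import numpy

n_templates = 1000  # hypothetical size
parvals0 = numpy.linspace(0.0, 1.0, n_templates)  # stand-in for parvals[0]

# Old pattern (sketch): append per template, then convert to an array
appended = []
for idx in range(n_templates):
    appended.append(parvals0[idx] * 2.0)  # placeholder per-template work
nabove_appended = numpy.array(appended)

# New pattern (sketch): preallocate once, then write to indices
nabove_smoothed = numpy.zeros_like(parvals0)
for idx in range(n_templates):
    nabove_smoothed[idx] = parvals0[idx] * 2.0

assert numpy.array_equal(nabove_appended, nabove_smoothed)
```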
nabove_sort = nabove[par_sort]
invalphan_sort = invalphan[par_sort]
ntotal_sort = ntotal[par_sort]
This is the dominant saving: this sorting was previously being done within the loop, meaning the same reordering was repeated on every iteration.
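The fix can be sketched as hoisting the fancy-indexed reordering out of the per-template loop (illustrative random arrays; the names mirror the diff):

```python
import numpy

rng = numpy.random.default_rng(0)
n_templates = 500
nabove = rng.random(n_templates)
invalphan = rng.random(n_templates)
ntotal = rng.random(n_templates)
par_sort = numpy.argsort(rng.random(n_templates))

# Old (sketch): each iteration repeated the same O(n) reorderings:
# for i in range(n_templates):
#     nabove_sort = nabove[par_sort]
#     invalphan_sort = invalphan[par_sort]
#     ntotal_sort = ntotal[par_sort]
#     ... use the sorted arrays ...

# New (sketch): reorder once, before the loop
nabove_sort = nabove[par_sort]
invalphan_sort = invalphan[par_sort]
ntotal_sort = ntotal[par_sort]
for i in range(n_templates):
    pass  # ... use the precomputed sorted arrays ...
```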
I feel someone else should review this, given my involvement in creating it.
Add in a few efficiency savings for pycbc_fit_sngls_over_multiparam
Standard information about the request

- This is an efficiency update
- This change affects the offline search and the live search
- This change changes nothing for the output
- This change follows style guidelines (see e.g. PEP8) and has been proposed using the contribution guidelines
Motivation
I was looking at pycbc_fit_sngls_over_multiparam thinking it would be a good place to learn / implement some GPU efficiency, but then we (mainly Ian) saw that there were some huge efficiency savings that could be made fairly quickly.

Contents
Biggest saving: the sorting of per-template values is now done once, outside the loop over templates.

Additional savings: allowing numpy.average() to revert to the more-efficient numpy.mean() method rather than using equal weights.

Testing performed
Note that for this testing, I set the loop over templates to break at 10,000 iterations.
This meant that for the "old" testing, I needed to implement the preallocated-array saving described above in order to get outputs to match.
Differences in output files
All equivalent files' datasets match using equality (i.e. numpy.array_equal(dataset1, dataset2)), dtypes match, and attributes match. Hashes of the files do not match, although I am unsure how this can be the case.
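For reference, a small sketch of what the equality check above does and does not cover: numpy.array_equal compares shapes and element values only, so it can pass while the files' bytes (and therefore their hashes) differ, e.g. because HDF5 containers also store metadata such as creation times and chunk layout outside the dataset values. One caveat worth knowing about:

```python
import numpy

a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([1.0, 2.0, 3.0])

# Shape and element-wise value comparison, as used in the testing above
assert numpy.array_equal(a, b)

# Caveat: array_equal treats NaN as unequal to itself by default,
# so datasets containing NaN need equal_nan=True (NumPy >= 1.19)
c = numpy.array([1.0, numpy.nan])
assert not numpy.array_equal(c, c.copy())
assert numpy.array_equal(c, c.copy(), equal_nan=True)
```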
Profiling
smooth_tophat (default) smoothing:
Summary:
The "old" profiling graph shows that something which is not in a function is the dominant cost. I found that performance drastically improved by setting the smooth() function to not be called. We found that the problem was in the arguments being passed to the function, not the function itself.

The time output shows that the 'new' method takes approx 1/35 of the 'old' time. There are also many more page faults and voluntary context switches in the old version; I don't know exactly what these are, but it sounds bad.
Profiling graphs: new and old

time -v output:

Old:

New:
distance_weighted smoothing
Summary:
In this case, there is the extra cost of generating a normal PDF for every template. This means that the savings aren't quite as significant, but they are still noteworthy.
Profiling graphs: new and old

time -v output:

old:

new:
n_closest smoothing
This is essentially unchanged, as the major saving in this PR does not affect this code path.
For completeness, profiles can be found at this link under fit_over_n_closest_{new,old}.{txt,png} for new/old and time output / profiling graph respectively.