[ML] Encode distribution model weight style by offset in a fixed size weight array #54
Conversation
droberts195 left a comment
I left a couple of minor comments.
It would also be nice to add a comment to the PR saying which dataset and config you saw the speedup on, and timings before and after the change.
    void propagateLastThreadAssert() {
        if (m_LastException) {
    -       throw *m_LastException;
    +       throw * m_LastException;
Why did this change?
I think this must be something about the version of clang-format I'm using. (Although I thought I was using the same version as Ed.)
I used clang-format 5.0.1 (for the record)
I've also got 5.0.1, having downloaded the pre-built binaries from http://releases.llvm.org/download.html
On running dev-tools/clang-format.sh on this PR branch it reverted this change and also made one other change in CMultivariateNormalConjugate.h.
It seems that we're going to have to mandate an exact version of clang-format for each branch rather than just the major version before we can start failing builds due to formatting.
OK, I'm using 5.0.0, so I think this must be the reason. I'll correct the formatting here.
            return {x1, x2};
        } catch (const std::exception& e) {
    -       LOG_ERROR(<< "Failed to compute confidence interval: " << e.what());
    +       LOG_ERROR("Failed to compute confidence interval: " << e.what());
I think we should stick to the pattern of beginning all log statements with << even though it's not necessary when the first item is a string literal. Otherwise the rules about when a leading << is required will be very complicated.
Thanks, this was a merge error.
            double shift = (r - t) / 2.0;
            logSamplesMoments.add(std::log(x) - shift, n / scale);
        }
    -   } catch (const std::exception& e) {
There is some quite deeply nested and complex logic in the block above. Are you certain none of it uses Boost or Eigen functionality that might throw an exception? If it does then such an exception will now be fatal.
Yes, this is now safe to remove. The nested code is basic statistics, Gauss-Legendre integration, and the local class CLogSampleSquareDeviation, none of which can throw.
I actually thought a bit about whether or not to remove this try/catch. (For example, there are other try/catch blocks I could down-scope as a result of this change, which I haven't touched.) In the end I decided that, since I could remove it altogether, I would go ahead and make the change on understandability grounds.
The original profile which showed this up was a large population analysis attached to issue #53. The probability calculation cache change significantly altered the breakdown of runtime for population analysis, so this is no longer as useful a test case. By profiling a custom standalone executable I built, I verified that this change eliminates the contribution of weight lookup to end-of-bucket processing. I was planning to extract the delta in runtime on the full QA regression suite when this is committed and update #53. The problem is that we need cases where the runtime bottleneck is end-of-bucket processing in the autodetect process, which isn't always the case. The clearest results will be for high cardinality partition analyses running the autodetect process standalone. I'll add runtimes before and after this change for these cases to issue #53.
droberts195 left a comment
LGTM
This should have been done in #54 but slipped through the net as we compile out trace logging in optimised builds.
When this is backported to 6.4 please also backport 1a750aa.
Profiling anomaly detection on a large population (cardinality 2m) showed that accessing weights, which communicate things like sample importance, seasonal heteroskedasticity, and so on, consumes about 8.5% of the total end-of-bucket processing time. We have a small number of possible weights, so switching to encoding the weight style by offset in a fixed-size weights array avoids nearly all of this overhead. I get a concomitant performance improvement on highly partitioned analyses, where end-of-bucket processing is the bottleneck. This should have no impact on any results. A step towards #53.