New metric traits #101
Conversation
Looks interesting from a cursory look!
Do you mean something like …
The function would have to look more like … Do you think it is worth maintaining a separate function for the Euclidean norm? I'm thinking that it probably isn't right now, but I am bothered by the incurred performance hit. For clarity we lose performance as the …
I think unless the performance hit is very, very substantial (and I suspect computing norms is rarely a bottleneck of your computation in any case?), my gut feeling is that we should focus on making the API as easy-to-use and consistent as possible for now. Later, when e.g. specialization arrives, we can hopefully recover any lost performance. I think … Also, note that you will run into potentially fairly hefty performance penalties by specifying p-norms at runtime: for example, if …
The performance hit will be due to poor vectorization. Instead of vectorizing the dot operation over the whole matrix … As for the … Edit: also, the norm is fairly performance critical. We tend to use norms in machine learning to check our convergence - which means once per iteration, normally on fairly large matrices.
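To make the trade-off being discussed concrete, here is a rough sketch of the two shapes of code being compared, written over plain f64 slices rather than rulinalg's Vector type (the function names below are illustrative, not the library's API):

/// Specialised Euclidean norm: a plain sum of squares, which vectorizes well.
fn euclidean_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Generic Lp norm: the `powf` call in the hot loop is much harder to vectorize.
fn lp_norm(v: &[f64], p: f64) -> f64 {
    v.iter().map(|x| x.abs().powf(p)).sum::<f64>().powf(p.recip())
}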
I have the following left to do: …
I think this is ready for a proper review now. I'll try to go over it all again but I'm fairly happy with it. There are things I'd hope to change if new Rust features land, but otherwise I think this is a good start.
I've had another look through and I am happy to merge this now. I thought I'd ping @Andlon in case you want to take one more look through. If you don't have time then don't worry! I'm just trying to rush through some of these so we can get the 0.4 release out.
Just about getting on a plane. I might have some time to look into it tonight or tomorrow morning, if you can wait that long. If not then please go ahead and merge :-)
I can happily wait a couple of days. In the meantime I'm working on refactoring the …
Found some time to look at it during my transfer at Schiphol airport. Overall, I really like the changes: they give a flexible way both for rulinalg itself to provide norm/metric implementations and for contributors to provide their own. Cool!
My feedback is mostly nitpicking and doesn't really need to hold up the merge. I'll let you decide how you want to use it.
I just had one thought though: would it make sense to provide runtime specializations for `self.0 == 2.0` and `self.0 == 1.0` in `Lp`? This might provide some serious speedup for the most common norms. This is also not so important, and if it is even desirable it can be implemented in a later PR.
//!
//! To define your own norm you need to implement the `MatrixNorm`
//! and/or the `VectorNorm` on your own struct. When you have defined
//! a norm you get the _induced metric_ for free.
Mathematically the induced metric is very clear, but perhaps just spend one additional sentence on explaining what this means in Rust terms? I.e. that anything that implements `MatrixNorm`/`VectorNorm` also gets an automatic `MatrixMetric`/`VectorMetric`?
I agree! Will add this before merging.
/// # p = infinity
///
/// In the special case where `p` is positive infinity,
/// the Lp norm becomes a supremum over the absolute values.
Since math formulas are very hard to present in GitHub Markdown, perhaps we could link to an appropriate Wikipedia section which discusses the issue, e.g. such as this.
Yes this is probably a good idea, thanks.
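For reference, the definitions being documented here are the standard ones (nothing rulinalg-specific):

\|x\|_p = \Big( \sum_i |x_i|^p \Big)^{1/p},
\qquad
\|x\|_\infty = \lim_{p \to \infty} \|x\|_p = \max_i |x_i|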
for x in v {
    s = s + x.abs().powf(self.0);
}
s.powf(self.0.recip())
For now I think this is perfectly fine, but in the future we may want to investigate whether we should implement compensated summation such as Kahan summation to improve accuracy due to rounding errors. I believe norm computations are very easily afflicted by such issues.
While this would require more floating point operations, it may not necessarily decrease performance much, since it's possible that one is cache-bound in the first place.
This might be worth doing - we could add a `utils` function for Kahan summation.
I've never used it myself before and I'm inclined to delay this to a later PR (my only reason for not doing so would be the fear that it would be forgotten). I think it is at least worth exploring whether this gives us any accuracy improvements.
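For the record, a minimal sketch of what such a helper might look like (the name, signature, and placement in `utils` are all hypothetical):

// Hypothetical compensated-summation helper; not part of rulinalg today.
fn kahan_sum<I: IntoIterator<Item = f64>>(values: I) -> f64 {
    let mut sum = 0.0;
    let mut c = 0.0; // running compensation for lost low-order bits
    for x in values {
        let y = x - c;
        let t = sum + y;
        c = (t - sum) - y; // recover the part of y that didn't make it into t
        sum = t;
    }
    sum
}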
Yes, I think we should not worry about this in this PR, I just mentioned it as we might want to have it in the back of our heads for the future. Perhaps we could create a long-term issue that discusses compensated summation throughout `rulinalg`? Numerical drift is a rather pervasive problem in any mathematical algorithm that requires extensive summation (including matrix multiplication), and we might want to investigate in which scenarios we want to improve accuracy by e.g. compensated summation.
#[test]
fn test_euclidean_vector_norm() {
    let v = Vector::new(vec![3.0, 4.0]);
    assert!((VectorNorm::norm(&Euclidean, &v) - 5.0) < 1e-30);
Note that while this arbitrary tolerance `1e-30` works on your machine (and Travis), it may possibly give different results for different architectures (and as such fail on some other contributor's machine). I don't have a very good drop-in remedy for this, other than relaxing it to something like `1e-14` or so, which also has its own problems.
Yes I agree - a looser tolerance is probably the lesser of two evils here.
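For what it's worth, the relaxed assertion might end up looking something like this (note the added `.abs()`, without which the current comparison also passes if the computed norm undershoots):

assert!((VectorNorm::norm(&Euclidean, &v) - 5.0).abs() < 1e-14);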
fn test_lp_vector_supremum() {
    let v = Vector::new(vec![-5.0, 3.0]);

    let sup = VectorNorm::norm(&Lp(f64::INFINITY), &v);
Perhaps also introduce static methods `Lp::supremum()` and `Lp::infimum()`? Or `max`/`min`, haven't thought this through very much.
Eh, forget about infimum/min! (But might still be useful for supremum)
Yes this is also a good idea! I was toying with having `Lp` be an enum:
pub enum Lp<T> {
Supremum,
Zero,
Normal(T),
// .. Other special cases
}
Or something like that but decided against it for now. Do you think this would be a worthwhile change?
Either way, adding these helper functions may push me into splitting each norm into its own sub-module. (Which is not a bad thing.)
Thank you for the review. There are definitely some things you pointed out that I would like to change before merging. As for the runtime handling of special cases - I think this could be worthwhile. I'll write up some benchmarks before this gets merged and we can get an idea of how big the difference is. If it is a big performance hit then I think adding these special case handlers is a good idea. Note that by using the …
I added some benchmarks, here are the results: …
The summary: …
I'll copy these benchmarks into the PR description. I think that with these results it is worth using the enum approach for the Lp norm. Something like:

/// The Lp norm.
///
/// Usual stuff...
///
/// We use an enum here to allow performance specializations.
pub enum Lp<T: Float> {
    /// The L-infinity norm (supremum)
    Infinity,
    /// The Lp norm where p is an integer
    Integer(i32),
    /// The Lp norm where p is a float
    Float(T)
}

How does this look to you @Andlon? (Take your time to respond)
The benchmarks are very interesting! I definitely agree that this warrants a solution which gives the compiler more information to work with, and I think your … Did you by any chance check the … Considering that … In any case, I think the PR looks really good (especially with the enum if you want to go for that), so we can leave micro-optimizations for later work if you'd rather work on more pressing matters. At least the enum approach opens up for optimization work without further breaking changes to the API.
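As a rough illustration of the room the enum leaves for specialization, a minimal sketch over a simplified non-generic f64 version with a hypothetical norm method (using `powi` for the integer case is just one possible optimization, not something settled in this thread):

pub enum Lp {
    Infinity,
    Integer(i32),
    Float(f64),
}

impl Lp {
    /// Hypothetical norm over a plain slice, for illustration only.
    pub fn norm(&self, v: &[f64]) -> f64 {
        match *self {
            // Supremum of the absolute values.
            Lp::Infinity => v.iter().fold(0.0_f64, |m, x| m.max(x.abs())),
            // Integer exponents: powi gives the compiler a small known power.
            Lp::Integer(p) => v.iter()
                .map(|x| x.abs().powi(p))
                .sum::<f64>()
                .powf(1.0 / p as f64),
            // General float exponents fall back to powf.
            Lp::Float(p) => v.iter()
                .map(|x| x.abs().powf(p))
                .sum::<f64>()
                .powf(p.recip()),
        }
    }
}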
I just checked the … I'm going to go with the enum approach. It seems like the best of both - and as you say I like that we can add more special cases later without breaking the API.
Yeah, for now there are bigger fish to fry. Better leave the micro-optimizations for later :) (Though, in the future, we could perhaps also get rid of `Euclidean` by special-casing ….)
I'm seeing something a little odd with the benchmarks after updating: …
Most of these numbers look fine - the only questionable one is the L1 norm. We would expect it to be very close to the … This isn't a huge concern to me and I'm still happy to merge. I thought I'd post it here in case you (@Andlon) can think of any reasons for this?
I can't explain why the difference remains even though you used the exact same code. Perhaps try clearing your … Also, by updating, do you mean updating your nightly version...? It's possible there's a regression on nightly. I guess the way to find out would be to go backwards through different nightly versions (or something like bisection), but I think maybe your time is better spent elsewhere, to be honest.
I meant updating the PR. I'll try clearing the cache - if it still doesn't make a difference I won't worry about it too much. Benchmarking can be a little temperamental.
This PR attempts to resolve #81. New features include:
- `MatrixNorm` and `VectorNorm` traits.

Some issues to work through:
- `Matrix` - can specialize the implementation if we use a default function for this. We will ignore this for now, but make a note for later.

And some of the work left to do:
- A `norm` function added to the `BaseMatrix` and `Vector` types so the user can specify which norm to use more easily.

Here are some benchmarks (without any Lp optimizations): …