
New metric traits #101

Merged · 17 commits merged into master from new-metric-trait · Dec 23, 2016
Conversation

AtheMathmo
Owner

@AtheMathmo AtheMathmo commented Dec 12, 2016

This PR attempts to resolve #81. New features include:

  • Can define multiple norms using the MatrixNorm and VectorNorm traits.
  • Automatically get the induced metric (fact 3) from the norm (see the sketch below).
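
To make the induced-metric point concrete, here is a rough sketch of the trait shapes being described - a hypothetical, simplified version over plain f64 slices rather than rulinalg's actual Vector/BaseMatrix types, so the real signatures in this PR may differ:

/// A norm on vectors (simplified to `&[f64]` for illustration).
pub trait VectorNorm {
    fn norm(&self, v: &[f64]) -> f64;
}

/// A metric on pairs of vectors.
pub trait VectorMetric {
    fn metric(&self, x: &[f64], y: &[f64]) -> f64;
}

/// Blanket impl: any norm automatically induces the metric d(x, y) = ||x - y||.
impl<N: VectorNorm> VectorMetric for N {
    fn metric(&self, x: &[f64], y: &[f64]) -> f64 {
        let diff: Vec<f64> = x.iter().zip(y).map(|(a, b)| a - b).collect();
        self.norm(&diff)
    }
}

/// A unit struct implementing the Euclidean norm.
pub struct Euclidean;

impl VectorNorm for Euclidean {
    fn norm(&self, v: &[f64]) -> f64 {
        v.iter().map(|x| x * x).sum::<f64>().sqrt()
    }
}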

Some issues to work through:

  • Should we keep a default Euclidean implementation? No.
  • We take a performance hit for the Euclidean norm on Matrix - we can specialize the implementation if we use a default function for this. We will ignore this for now, but make a note for later.
  • We have a separate set of traits for vector and matrix types.

And some of the work left to do:

  • Add Lp norms.
  • Add documentation.
  • Add unit tests.
  • Add a generic norm function to the BaseMatrix and Vector types so the user can specify which norm to use more easily.

Here are some benchmarks (without any Lp optimizations):

running 12 tests
test linalg::norm::lp_1_mat_100_50                 ... bench:       9,036 ns/iter (+/- 993)
test linalg::norm::lp_1_mat_10_50                  ... bench:         913 ns/iter (+/- 94)
test linalg::norm::sum_abs_mat_100_50              ... bench:       9,164 ns/iter (+/- 1,990)
test linalg::norm::sum_abs_mat_10_50               ... bench:         909 ns/iter (+/- 87)
test linalg::norm::lp_2_mat_100_50                 ... bench:      10,038 ns/iter (+/- 1,204)
test linalg::norm::lp_2_mat_10_50                  ... bench:       1,056 ns/iter (+/- 226)
test linalg::norm::euclidean_mat_100_50            ... bench:       3,266 ns/iter (+/- 392)
test linalg::norm::euclidean_mat_10_50             ... bench:         303 ns/iter (+/- 26)
test linalg::norm::lp_3_mat_100_50                 ... bench:     364,378 ns/iter (+/- 25,537)
test linalg::norm::lp_3_mat_10_50                  ... bench:      35,908 ns/iter (+/- 3,867)
test linalg::norm::sum_abs_powi_3_mat_100_50       ... bench:       9,436 ns/iter (+/- 1,171)
test linalg::norm::sum_abs_powi_3_mat_10_50        ... bench:         913 ns/iter (+/- 279)

@Andlon
Collaborator

Andlon commented Dec 12, 2016

Looks interesting from a cursory look!

> Add a generic norm function to the BaseMatrix and Vector types so the user can specify which norm to use more easily.

Do you mean something like mat.norm::<Euclidean>()? I think this is a very good idea! Then we're explicit about which norm is being used. I can't keep track of how many times I've had to look up functions like .norm() in other libraries/languages to make sure they compute what I think they do. This should greatly increase readability.

@AtheMathmo
Owner Author

The function would have to look more like mat.norm(Euclidean). The reason is that I figured some norms need to store data about themselves, LpNorm for example. Because of that, all of the trait functions take &self. I think we can still keep the syntax fairly clean this way though - for example, using tuple structs so that the syntax would be mat.norm(Lp(2.0)).

Do you think it is worth maintaining a separate function for the Euclidean norm? I'm thinking that it probably isn't right now, but I am bothered by the incurred performance hit. To clarify: we lose performance because the Euclidean norm acts on a BaseMatrix, so we can't assume the whole data block is contiguous.
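
For illustration, the call sites under discussion would then look something like this (hypothetical snippet using the names from the comments above; the final API may differ):

let e  = mat.norm(Euclidean);   // unit struct: the norm stores no data
let l2 = mat.norm(Lp(2.0));     // tuple struct: carries p at runtime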

@Andlon
Collaborator

Andlon commented Dec 12, 2016

I think unless the performance hit is very, very substantial (and I suspect computing norms is rarely the bottleneck of your computation in any case?), my gut feeling is that we should focus on making the API as easy-to-use and consistent as possible for now. Later, when e.g. specialization arrives, we can hopefully recover any lost performance. I think rulinalg is still so young that we can sacrifice some peak performance for the sake of a better API, provided that these things can be remedied as Rust gains more features.

Also, note that you will run into potentially fairly hefty performance penalties by specifying p-norms at runtime: for example, if p is a float, then Lp(2.0) would have to perform a large number of fractional exponentiation operations instead of simple elementwise multiplication x * x. That is, assuming that the compiler isn't extremely smart (I doubt it will recognize that exponentiating x to the power of 2.0 in floating-point is equivalent to x * x?). I'm not sure how to remedy this, however...
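
To illustrate the concern, here is a sketch (not code from this PR) contrasting the runtime-p inner loop with the p = 2 special case:

// General Lp loop: a powf call per element is expensive.
fn lp_runtime(v: &[f64], p: f64) -> f64 {
    v.iter().map(|x| x.abs().powf(p)).sum::<f64>().powf(p.recip())
}

// The p = 2 case reduces to multiplications the compiler can vectorize.
fn euclidean(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}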

@AtheMathmo
Owner Author

AtheMathmo commented Dec 12, 2016

The performance hit will be due to poor vectorization. Instead of vectorizing the dot operation over the whole matrix Vec, we can only vectorize in chunks over each row. This shouldn't be too significant.

As for the Lp performance penalty, I think this is acceptable as we explicitly provide (somewhat) more efficient implementations with the Euclidean norm.

Edit: also, the norm is fairly performance critical. We tend to use norms in machine learning to check convergence - which means once per iteration, normally on fairly large matrices.

@AtheMathmo
Owner Author

I have the following left to do:

  • Unit tests for the Euclidean and Lp norms.
  • Documentation for the norm module. I specifically want to describe the assumptions made by the traits and by their users elsewhere in rulinalg. This should help if somebody wants to implement a custom metric or norm.

@AtheMathmo AtheMathmo changed the title from "[WIP] New metric traits" to "New metric traits" Dec 17, 2016
@AtheMathmo
Owner Author

I think this is ready for a proper review now. I'll try to go over it all again but I'm fairly happy with it.

There are things I'd hope to change if new Rust features land, but otherwise I think this is a good start.

@AtheMathmo AtheMathmo mentioned this pull request Dec 17, 2016
@AtheMathmo AtheMathmo added this to the 0.4 milestone Dec 18, 2016
@AtheMathmo
Owner Author

I've had another look through and I am happy to merge this now. I thought I'd ping @Andlon in case you want to take one more look through. If you don't have time then don't worry! I'm just trying to rush through some of these so we can get the 0.4 release out.

@Andlon
Collaborator

Andlon commented Dec 20, 2016 via email

@AtheMathmo
Owner Author

I can happily wait a couple of days. In the meantime I'm working on refactoring the impl_ops and slice modules. Thank you!

Collaborator

@Andlon Andlon left a comment

Found some time to look at it during a transfer at Schiphol airport. Overall, I really like the changes: they give a flexible way both for rulinalg itself to provide norm/metric implementations and for contributors to provide their own. Cool!

My feedback is mostly nitpicks, and doesn't really need to impede the merge. I'll let you decide how you want to use it.

I just had one thought though: Would it make sense to provide runtime specializations for self.0 == 2.0 and self.0 == 1.0 in Lp? This might provide some serious speedup for the most common norms. This is also not so important, and if this is even desirable it can be implemented in a later PR.
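
Such a runtime check could look something like this sketch (hypothetical; Lp here is the tuple struct Lp(f64) discussed earlier, not the PR's actual implementation):

struct Lp(f64);

impl Lp {
    fn norm(&self, v: &[f64]) -> f64 {
        if self.0 == 2.0 {
            // Special-cased p = 2: plain multiplication, no powf.
            v.iter().map(|x| x * x).sum::<f64>().sqrt()
        } else if self.0 == 1.0 {
            // Special-cased p = 1: just a sum of absolute values.
            v.iter().map(|x| x.abs()).sum()
        } else {
            // General case: powf per element.
            v.iter().map(|x| x.abs().powf(self.0)).sum::<f64>().powf(self.0.recip())
        }
    }
}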

//!
//! To define your own norm you need to implement the `MatrixNorm`
//! and/or the `VectorNorm` trait for your own struct. When you have
//! defined a norm you get the _induced metric_ for free.
Collaborator

Mathematically the induced metric is very clear, but perhaps just spend one additional sentence on explaining what this means in Rust terms? I.e. that anything that implements MatrixNorm/VectorNorm also gets an automatic MatrixMetric/VectorMetric?

Owner Author

I agree! Will add this before merging.

/// # p = infinity
///
/// In the special case where `p` is positive infinity,
/// the Lp norm becomes a supremum over the absolute values.
Collaborator

Since math formulas are very hard to present in GitHub Markdown, perhaps we could link to an appropriate Wikipedia section which discusses this? E.g. this one.

Owner Author

Yes this is probably a good idea, thanks.

for x in v {
    s = s + x.abs().powf(self.0);
}
s.powf(self.0.recip())
Collaborator

For now I think this is perfectly fine, but in the future we may want to investigate whether we should implement compensated summation such as Kahan summation to improve accuracy due to rounding errors. I believe norm computations are very easily afflicted by such issues.

While this would require more floating point operations, it may not necessarily decrease performance much, since it's possible that one is cache-bound in the first place.

Owner Author

This might be worth doing - we could add a utils function for Kahan summation.

I've never used it myself before and I'm inclined to delay this to a later PR (my only reason for not doing so would be the fear that it would be forgotten). I think it is at least worth exploring whether this gives us any accuracy improvements.

Collaborator

Yes, I think we should not worry about this in this PR - I just mentioned it as something we might want to keep in the back of our heads for the future. Perhaps we could create a long-term issue that discusses compensated summation throughout rulinalg? Numerical drift is a rather pervasive problem in any mathematical algorithm that requires extensive summation (including matrix multiplication), and we might want to investigate in which scenarios we should improve accuracy via e.g. compensated summation.
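
For reference, a minimal sketch of the kind of compensated-summation utils helper being floated here - standard Kahan summation, not something that exists in this PR:

/// Kahan (compensated) summation: `c` accumulates the low-order bits
/// that plain summation would discard.
fn kahan_sum<I: IntoIterator<Item = f64>>(values: I) -> f64 {
    let mut sum = 0.0;
    let mut c = 0.0;
    for x in values {
        let y = x - c;
        let t = sum + y;
        c = (t - sum) - y;
        sum = t;
    }
    sum
}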

#[test]
fn test_euclidean_vector_norm() {
    let v = Vector::new(vec![3.0, 4.0]);
    assert!((VectorNorm::norm(&Euclidean, &v) - 5.0).abs() < 1e-30);
Collaborator

Note that while this arbitrary tolerance of 1e-30 works on your machine (and Travis), it may give different results on different architectures (and as such fail on some other contributor's machine). I don't have a very good drop-in remedy for this, other than loosening it to something like 1e-14 or so, which has its own problems.

Owner Author

Yes, I agree - a looser tolerance is probably the lesser of two evils here.

fn test_lp_vector_supremum() {
    let v = Vector::new(vec![-5.0, 3.0]);

    let sup = VectorNorm::norm(&Lp(f64::INFINITY), &v);
Collaborator

Perhaps also introduce static methods Lp::supremum() and Lp::infimum()? Or max/min - I haven't thought this through very much.

Collaborator

Eh, forget about infimum/min! (But might still be useful for supremum)

Owner Author

Yes this is also a good idea! I was toying with having Lp be an enum:

pub enum Lp<T> {
    Supremum,
    Zero,
    Normal(T),
    // .. Other special cases
}

Or something like that but decided against it for now. Do you think this would be a worthwhile change?

Either way, adding these helper functions may push me into splitting each norm into its own sub-module. (Which is not a bad thing.)

@AtheMathmo
Owner Author

Thank you for the review. There are definitely some things you pointed out that I would like to change before merging.

As for the runtime handling of special cases - I think this could be worthwhile. I'll write up some benchmarks before this gets merged and we can get an idea of how big the difference is. If it is a big performance hit then I think adding these special case handlers is a good idea.

Note that the Lp enum idea I suggested in an earlier comment can also alleviate some of this issue. For example, having an Integer(i32) variant would let us use f64::powi, which would be faster.
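
A sketch of how that variant dispatch might look (hypothetical, f64-only for brevity; the real implementation would be generic):

enum Lp {
    Infinity,
    Integer(i32),
    Float(f64),
}

fn lp_norm(v: &[f64], p: &Lp) -> f64 {
    match *p {
        // Supremum: no exponentiation at all.
        Lp::Infinity => v.iter().fold(0.0_f64, |m, x| m.max(x.abs())),
        // Integer p: powi is much cheaper than a fractional powf.
        Lp::Integer(n) => v.iter()
            .map(|x| x.abs().powi(n))
            .sum::<f64>()
            .powf((n as f64).recip()),
        // General float p: falls back to powf.
        Lp::Float(p) => v.iter()
            .map(|x| x.abs().powf(p))
            .sum::<f64>()
            .powf(p.recip()),
    }
}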

@AtheMathmo
Owner Author

AtheMathmo commented Dec 20, 2016

I added some benchmarks, here are the results:

running 12 tests
test linalg::norm::lp_1_mat_100_50                 ... bench:       9,036 ns/iter (+/- 993)
test linalg::norm::lp_1_mat_10_50                  ... bench:         913 ns/iter (+/- 94)
test linalg::norm::sum_abs_mat_100_50              ... bench:       9,164 ns/iter (+/- 1,990)
test linalg::norm::sum_abs_mat_10_50               ... bench:         909 ns/iter (+/- 87)
test linalg::norm::lp_2_mat_100_50                 ... bench:      10,038 ns/iter (+/- 1,204)
test linalg::norm::lp_2_mat_10_50                  ... bench:       1,056 ns/iter (+/- 226)
test linalg::norm::euclidean_mat_100_50            ... bench:       3,266 ns/iter (+/- 392)
test linalg::norm::euclidean_mat_10_50             ... bench:         303 ns/iter (+/- 26)
test linalg::norm::lp_3_mat_100_50                 ... bench:     364,378 ns/iter (+/- 25,537)
test linalg::norm::lp_3_mat_10_50                  ... bench:      35,908 ns/iter (+/- 3,867)
test linalg::norm::sum_abs_powi_3_mat_100_50       ... bench:       9,436 ns/iter (+/- 1,171)
test linalg::norm::sum_abs_powi_3_mat_10_50        ... bench:         913 ns/iter (+/- 279)

The summary:

  • We don't need to specialize for Lp(1.0).
  • Specializing for p = 2 gives us a roughly 3x speedup.
  • Specializing for integers (at compile time) gives us a huge speedup.

I'll copy these benchmarks into the PR description. I think that with these results it is worth using the enum approach for the Lp norm. Something like:

/// The Lp norm.
///
/// Usual stuff...
///
/// We use an enum here to allow performance specializations.
pub enum Lp<T: Float> {
    /// The L-infinity norm (supremum)
    Infinity,
    /// The Lp norm where p is an integer
    Integer(i32),
    /// The Lp norm where p is a float
    Float(T)
}

How does this look to you @Andlon ? (Take your time to respond)

@Andlon
Collaborator

Andlon commented Dec 21, 2016

The benchmarks are very interesting! I definitely agree that this warrants a solution which gives the compiler more information to work with, and I think your enum approach seems like a reasonable way to resolve this.

Did you by any chance check the powi approach vs Euclidean?

Considering that p = 3 with powi takes about 3 times as much time as Euclidean (which of course is cheaper since it's p = 2), I wonder if the absolute value might be responsible for a considerable part of this extra runtime. That might warrant specializing even-valued p to avoid the redundant abs calls.

In any case, I think the PR looks really good (especially with the enum, if you want to go for that), so we can leave micro-optimizations for later work if you'd rather work on more pressing matters. At the very least, the enum approach opens the door to optimization work without further breaking changes to the API.

@AtheMathmo
Owner Author

I just checked the powi approach with p = 2 and got numbers very similar to p = 3. I was hoping it would be slower than Euclidean, as we put work into vectorizing the Euclidean approach. I'm not sure why it doesn't show a significant improvement over p = 3, but I won't trouble myself with it...

I'm going to go with the enum approach. It seems like the best of both - and as you say I like that we can add more special cases later without breaking the API.

@Andlon
Collaborator

Andlon commented Dec 21, 2016

Yeah, for now there are bigger fish to fry. Better leave the micro-optimizations for later :)

(Though, in the future, we could perhaps also get rid of Euclidean by special-casing p == 2 in Lp, I think. It's just a little unfortunate to have to maintain two pieces of functionality which amount to the same thing, especially if we want to introduce more elaborate summation etc. at a later point)

@AtheMathmo
Owner Author

I'm seeing something a little odd with the benchmarks after updating:

running 14 tests
test linalg::norm::lp_1_mat_100_50                 ... bench:      24,530 ns/iter (+/- 3,244)
test linalg::norm::lp_1_mat_10_50                  ... bench:       2,271 ns/iter (+/- 198)
test linalg::norm::sum_abs_mat_100_50              ... bench:       9,035 ns/iter (+/- 103)
test linalg::norm::sum_abs_mat_10_50               ... bench:       1,495 ns/iter (+/- 131)
test linalg::norm::euclidean_mat_100_50            ... bench:       3,222 ns/iter (+/- 72)
test linalg::norm::euclidean_mat_10_50             ... bench:         310 ns/iter (+/- 26)
test linalg::norm::lp_2_mat_100_50                 ... bench:      27,139 ns/iter (+/- 422)
test linalg::norm::lp_2_mat_10_50                  ... bench:       2,720 ns/iter (+/- 151)
test linalg::norm::lp_float_2_mat_100_50           ... bench:      60,295 ns/iter (+/- 1,848)
test linalg::norm::lp_float_2_mat_10_50            ... bench:       6,109 ns/iter (+/- 339)
test linalg::norm::lp_3_mat_100_50                 ... bench:      25,657 ns/iter (+/- 2,252)
test linalg::norm::lp_3_mat_10_50                  ... bench:       2,587 ns/iter (+/- 226)
test linalg::norm::lp_float_3_mat_100_50           ... bench:     367,841 ns/iter (+/- 32,373)
test linalg::norm::lp_float_3_mat_10_50            ... bench:      36,359 ns/iter (+/- 3,169)

Most of these numbers look fine - the only questionable one is the L1 norm. We would expect it to be very close to the sum_abs benchmarks. In fact, I even tried modifying the L1 norm so that it is exactly the same, but the difference remains. All I can think of is that the sum_abs test works on f64 instead of generics, and maybe this allows some vectorization?

This isn't a huge concern to me and I'm still happy to merge. I thought I'd post it here in case you (@Andlon) can think of any reason for this?

@Andlon
Collaborator

Andlon commented Dec 22, 2016

I can't explain why the difference remains even though you used the exact same code. Perhaps try clearing your target directory? Might be some cache corruption issue?

Also, by updating, do you mean updating your nightly version...? It's possible there's a regression on nightly. I guess the way to find out would be to go backwards through different nightly versions (or do something like bisection), but I think maybe your time is better spent elsewhere, to be honest.

@AtheMathmo
Owner Author

I meant updating the PR. I'll try clearing the cache - if it doesn't make a difference I won't worry about it too much. Benchmarking can be a little temperamental.

@AtheMathmo AtheMathmo merged commit 015b6dc into master Dec 23, 2016
@AtheMathmo AtheMathmo deleted the new-metric-trait branch December 24, 2016 11:18
Successfully merging this pull request may close these issues: Updates to Metric trait.