Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Alternative nuget with decimal as internal representation #285

Closed
angularsen opened this issue Sep 8, 2017 · 14 comments
Closed

Comments

@angularsen
Copy link
Owner

angularsen commented Sep 8, 2017

Proposal

  • New nugets: UnitsNet.Decimal, UnitsNet.Decimal.Signed, UnitsNet.Serialization.JsonNet.Decimal, UnitsNet.Serialization.JsonNet.Decimal.Signed
  • Change internal representation from double to decimal

Why

Decimal has some benefits, such as fixed precision (predictable rounding) and high precision.
However, duplicating all types would nearly double the nuget size, so we propose to create a new nuget using decimal as internal representation as well as the output type.

Example:

double meters = Length.FromMeters(10).Meters; // currently
decimal meters = Length.FromMeters(10).Meters; // new nuget

Notes

UnitsNet.WindowsRuntimeComponent nuget is left out due to decimal not supported by type constraints.

@0xferit
Copy link
Contributor

0xferit commented Nov 9, 2017

I think we can use generics to support decimal internal representation without doubling source-code size.

Perhaps with a usage like this:

double meters = Length<double>.FromMeters(10).Meters; // double internal representation
decimal meters = Length<decimal>.FromMeters(10).Meters; // decimal internal representation

And actually, if I'm not mistaken, C# is smart enough to infer types so we can omit generic type arguments and use like this:

double meters = Length.FromMeters(10).Meters; // double internal representation
decimal meters = Length.FromMeters(10).Meters; // decimal internal representation

@angularsen
Copy link
Owner Author

angularsen commented Nov 9, 2017

We could go this route, but I don't want to force usage to include the <double> part all over the place. Unfortunately there are many cases when it can't infer the type. Often when numbers are coded, they are integers and that would have to be explicitly specified as 1d or 1m. Some people don't even know about that possibility, and will use <double> to work around it instead and get a poor first impression. The nullable overloads are also particularly bad at type inference.

Even if we accepted this, it is not trivial to work with generic types internally. You can't readily multiply or add two generic numbers together. You either need to check the type and handle the calculation separately (which does not fix the size problem), or use some arithmetic abstraction code such as this. But.. it adds overhead and feels complicated. I'm not a fan, but it might be a way do it and maybe it works better than my prejudice tells me.

I've used Math.NET and it has the same thing with matrices, that can be either double, decimal, long and possibly more. They solved it by having both a Matrix<double> type as well as a separate "double" namespace with Matrix, Vector and other types written assuming double.

It worked quite well, but the same approach would cost us almost N times the nuget size.

So.. the way I see it we have these options:

  1. Two nugets - but this causes its own problems, such as when people depend on library A that depends on UnitsNetDouble and they also depend on library B that depends on UnitsNetDecimal.
  2. One nuget - twice the fun, if you think bytes are fun
  3. One nuget - generics and clever arithmetic

@angularsen
Copy link
Owner Author

Also, there is a 4th option to trim down the size. If possible, and I'm not yet sure if it is, we could split the nuget into a core/main nuget following the 80/20 rule (the top most used units, that satisfy 80% of users while only including about 20% of the units) plus the core building blocks: UnitSystem for parsing/ToString of unit abbreviations and UnitConverter for dynamic conversion.

Then we add all the other units as extensible nugets, that add additional quantities and/or units on top; such as electrical engineering, chemistry and other domains.

I still don't know how we deal with units that overlap domains and how to build a truly extensible unit system, because the biggest win is strongly typed quantities/units and the only way I see we can achieve that is by using extension methods.
Finally, there are tons of units that are simply prefixes of a base unit. We could get rid of all those combinations by instead doing something like Mass m = 1.Mega().Tonne() or 1.Tonne(Prefix.Mega), but.. ugh... I really don't like that syntax and it's a lot less discoverable than having all the units listed up by intellisense, but it's one way to really cut down on size.

@TrevorVonSeggern
Copy link
Contributor

After some research and a few console application projects, I don't think it's possible to have a generics with clever arithmetic without a breaking code change that also has good performance. If the quantity files are generic, then the user would have to specify the generic type every time it's consumed. I originally tried to solve this by having all the quantities be generic, and then create concrete types with the default(double), and other(decimal, int, etc) generics defined in the type itself. However this isn't possible (to the best of my knowledge) with structs, only classes (by inheritance).

@TrevorVonSeggern
Copy link
Contributor

I created an example project of how I used the generic arithmetic here. (based on the codeproject link above)

@angularsen
Copy link
Owner Author

Thanks for looking into this and sharing the details. Yes, it is annoying that structs do not support inheritance. Do you see a way to accomplish your generic arithmetic if we changed to using classes instead? It could be a tradeoff to consider.

@TrevorVonSeggern
Copy link
Contributor

Yes I definitely think it's possible. I would have to create separate namespaces for each type, or name each concrete type differently (lengthDouble, lengthDecimal, etc).

The only downside to using classes here instead of structs that I see is that classes are a reference type, so there is another call in memory to get the location of the class on the heap, and memory locality might be poor. This would only be a problem for software that requires really optimized, efficient unit conversions. I don't think that will be an issue for the majority of programs that use this package.

@0xferit
Copy link
Contributor

0xferit commented Dec 18, 2017

Thanks for looking into this @TrevorVonSeggern . I believe using classes for quantities and units will be helpful in this and various other issues.

Therefore I'll cast my vote for generics with classes.

@angularsen
Copy link
Owner Author

We shouldn't take lightly on the performance and memory implications of moving from struct to class, but I agree we should definitely look into what benefits it gives us and what drawbacks it introduces. UnitsNet is not a high performance library, neither is it a high-precision library. In my mind, at least, it is a convenience library. I know @eriove has been using this on applications with memory and performance constraints and would probably have a lot to say on the case.

@eriove
Copy link
Contributor

eriove commented Dec 18, 2017

We've been using the quantities in UnitsNet instead of ordinary doubles in all our code. The difference to suing doubles is very small at the moment, and is out weighted by compile time dimension analysis and easier communication with our distributed team. References types just wouldn't work. I haven't done any measurements on decimal so I'm not sure how that would hold up.

To me having one library with decimal and one with double as a backing type seems like the best solution.

@TrevorVonSeggern
Copy link
Contributor

I think it's a little pointless to debate performance before the code is actually measured. I'll see what I can do to get the ball rolling and then collect data on speed and memory usage to see if it makes a meaningful difference.

angularsen added a commit that referenced this issue Feb 11, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
angularsen added a commit that referenced this issue Feb 11, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
angularsen added a commit that referenced this issue Feb 11, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
angularsen added a commit that referenced this issue Feb 11, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
@TrevorVonSeggern
Copy link
Contributor

Sorry for the radio silence. I've taken a first pass at it. It's definitely a little more complicated than I originally thought it would be without introducing a major version change. I've forked the repo and pushed those changes if anybody is curious. It's not ready yet, but it's looking promising so far.

@angularsen
Copy link
Owner Author

Note that there has been some discussions in #372 on reducing code size, such as removing a LOT of seemingly redundant number extension methods - or moving them out of the main nuget altogether.

Also in #371 we discuss adding a quantity interface to serve as a base type, and since interfaces introduces boxing of structs you lose the (potential) performance benefit of structs in those scenarios where you need to work with the the interface instead of each specific quantity type.

angularsen added a commit that referenced this issue Feb 17, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
angularsen added a commit that referenced this issue Feb 17, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
angularsen added a commit that referenced this issue Mar 5, 2018
Serialize with constructed Unit and _value instead of base unit and base unit value.
By using _value we serialize using the original value type (decimal vs double),
since UnitsNet public API is primarily using double.
Cast to QuantityValueDecimal for types that use decimal internally.

NOTE: This mix of double and decimal representations is a bit messy right
now and needs a better fix. We already discuss this topic in #285 about either
moving everything to decimal or to better support multiple numeric types
(float, double, long, decimal etc).
@angularsen
Copy link
Owner Author

I no longer think this is a good idea. Instead, it seems more likely that we change from struct to class and unlock the capabilities of generics and inheritance to support different numeric types without having to create N types. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants