Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement TryFrom for float to integer types #47857

Closed
wants to merge 1 commit into from
Closed

Implement TryFrom for float to integer types #47857

wants to merge 1 commit into from

Conversation

udoprog
Copy link
Contributor

@udoprog udoprog commented Jan 29, 2018

This implements TryFrom for f32, and f64 to all integer types.

This is accomplished by casting the floating point number, into the target type and back, and then comparing if the resulting value is identical to the converted value.

I'm not sure this implementation is sound, in particular issues like #10184 seem to indicate that these kinds of casts are not always sound. A review by someone who knows more about floating point numbers, and what kind of guarantees the casts give us would be highly appreciated.

If there is a better way to determine if a given floating point number can be converted into an integer, I'd also love to know.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @BurntSushi (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jan 29, 2018

IIUC this would mean all rounding would result in an error? e.g., i32::try_from(0.1) will return an error instead of Ok(0)? Is that really the semantics we want? It's lossless, but it's also exceedingly unlikely that any non-trivial float calculation will result in an exact integer unless the magnitude is so large that all floats are integers (which can't even happen for fM -> {u,i}N where N < M). So I don't really see a use case for it. I'd rather see rounding, and only return an error for finite values that don't fit into the target type (i.e., rounding error would exceed 0.5 ULP), infinities, and NaN.

@hanna-kruppe
Copy link
Contributor

Also for the record the current implementation indeed isn't sound bc of #10184 but let's worry about the implementation once we agree on what should even be implemented.

@udoprog
Copy link
Contributor Author

udoprog commented Jan 29, 2018

IIUC this would mean all rounding would result in an error?
Yes, the motivation behind this was to only check that the destination type could contain the value.

Rounding and other destructive operations seems like something you could comfortably do outside of conversion (e.g. v.round().try_into()?). Performing a checked non-destructive float-to-integer cast is currently hard (if possible?).

For the kind of use cases I have in mind, consider this GitHub search (round then cast):
https://github.com/search?l=Rust&p=2&q=%22round%28%29+as%22&type=Code&utf8=%E2%9C%93

At the same time I agree that the exact meaning of "conversion" isn't strongly defined. So TryFrom could end up doing something else which is deemed more natural. Maybe there's inspiration from other languages?

I know at least I would still have a need for these kinds of conversion methods, even if they take some other form.

@petrochenkov
Copy link
Contributor

I didn't understand why lossless From<integer> was implemented for floats and now I understand even less why lossless TryFrom<integer> should be implemented.
The use for these semantics seems very fringe and probably shouldn't hide under such generic traits as From and TryFrom.

@hanna-kruppe
Copy link
Contributor

@udoprog Note that float->int conversions don't round in the same sense as round. The former truncates or rounds down, the latter rounds up when the fractional part is > .5 (and to the even integer if it's = .5). So the checked version of as casts would be .trunc().try_from()?. I don't see anyone getting that right on the first try, I for one had to look it up myself (and didn't even realize this subtlety when I wrote the first post -- it contains errors and misleading terminology because of that).

Regardless, I still don't see the use case for non-rounding ("non-destructive") casts. Your link shows use cases for rounding casts, though rounding differently than the default cast. It's true that the floats that get casted there should be integers already and thus those snippets could use the proposed TryFrom impl. However, one usually uses checked conversions when there's a realistic possibility of information loss -- in these cases, it's trivial to see that the casted value has no fractional part by just looking at the code, so the sole motivation for the checked cast would be erroring on infinities and NaN, which would also be served by a checked cast that, well, checks just for that and permits rounding.


But @petrochenkov has a good point. Do we really want, for example, a function that takes a TryFrom<usize> parameter to accept a float value silently?

@udoprog
Copy link
Contributor Author

udoprog commented Jan 30, 2018

@rkruppe Yeah. I was using your example. But as you noted, it works the same regardless of operation (round, floor, ceil, ...). This is also the reasoning I use to not include some version of it in the conversion. In the proposed change any of them can be used to perform the round-trip check.

Regardless, I still don't see the use case for non-rounding ("non-destructive") casts. Your link shows use cases for rounding casts, though rounding differently than the default cast. It's true that the floats that get casted there should be integers already and thus those snippets could use the proposed TryFrom impl. However, one usually uses checked conversions when there's a realistic possibility of information loss -- in these cases, it's trivial to see that the casted value has no fractional part by just looking at the code, so the sole motivation for the checked cast would be erroring on infinities and NaN, which would also be served by a checked cast that, well, checks just for that and permits rounding.

Infinity, NaN, software bugs which cause variables to take on undesirable values. Variable sources which are not statically determined. I have a hard time making blanket statements about the examples, but I can imagine at least some of them might actually want to treat out-of-bound values as an error.

I noticed a Jula Lang contributor commented on one of the bug reports I was digging through, and they appear to support this kind of conversion the way it is proposed here through Base.convert (their test is a sign-dependent bounds check and trunc(x) == x): https://docs.julialang.org/en/stable/stdlib/base/#Base.convert


@petrochenkov If you are referring to lossless_float_conv, that feature appears to be the other way around of what is proposed in this PR (u8 -> f32, u16 -> f32, ...) vs. (f32 -> u8, ...). But if you can elaborate on why you oppose this that would be great!

@rkruppe

But @petrochenkov has a good point. Do we really want, for example, a function that takes a TryFrom parameter to accept a float value silently.

Silently, as in the compiler not complaining? I must admit this is the argument that made me doubt the validity of this PR the most. But ultimately lossless_float_conv being implemented would seem to validate it (somewhat) unless that feature is considered a mistake? I.e. we already have some float <-> int conversions happening through the From trait.

EDIT: A similar issue was discussed for lossless_float_conv in #29129, comment by @alexcrichton:

The libs team discussed this during triage yesterday, and the decision was to move forward with this. Our thinking was along the lines that if you're generically taking Into then you're already willing to accept multiple types, so the "weirdness" of accepting an integer of an appropriate size is actually expected.

Would a similar reasoning apply to this PR today?

@dbarella
Copy link

I'm interested in staying in the loop on this discussion -- I've been working on a similar change for the num package, and cuviper@ referred me here. I'd love to contribute what I can! The result of this discussion could affect how we end up writing the float-to-int conversion there -- hopefully simplifying the implementation, depending on how things go here.

@hanna-kruppe
Copy link
Contributor

But as you noted, it works the same regardless of operation (round, floor, ceil, ...). This is also the reasoning I use to not include some version of it in the conversion. In the proposed change any of them can be used to perform the round-trip check.

The question is whether there's any legitimate use case for doing this conversion without first rounding to a nearby integer. If there isn't (and I believe there isn't), it might be an attractive nuisance to have this half-working method -- people might use it on a toy example and not realize they generally need to add a rounding step manually to get Ok results. In that case it seems wiser to offer checked versions of one or multiple "rounding" casts.

Infinity, NaN, software bugs which cause variables to take on undesirable values. Variable sources which are not statically determined. I have a hard time making blanket statements about the examples, but I can imagine at least some of them might actually want to treat out-of-bound values as an error.

No debate about Infinities, NaN and out of bounds values. I'm not proposing TryFrom pretend it can meaningfully convert those to integers. I'm talking solely about finite values that would fit into the target type perfectly well after rounding (and thus, the conversion makes perfect sense logically, it just loses a tightly-bounded amount of precision, same as virtually all operations on floats do).

I noticed a Jula Lang contributor commented on one of the bug reports I was digging through, and they appear to support this kind of conversion the way it is proposed here through Base.convert (their test is a sign-dependent bounds check and trunc(x) == x): https://docs.julialang.org/en/stable/stdlib/base/#Base.convert

Interesting. I looked through a couple pages of github code search, my query wasn't very good so most results were unrelated but I did see some uses of this functionality. 🤔 Some were difficult to make sense of without reading all the surrounding code.

I did see n .* (n+1) / 2 where n was an array size (has type float due to the division, but mathematically is always an integer), and I don't consider that a particularly convincing example since it should really be all integer arithmetic -- for n larger than approx. 2^(53/2) you lose precision by going through float.

EDIT: A similar issue was discussed for lossless_float_conv in #29129, comment by @alexcrichton:
Would a similar reasoning apply to this PR today?

Interesting argument, but I'm not quite convinced. If TryFrom rounds, then you can pass 1.3 somewhere and have it treat it as 1 (which is not the case when you pass any of the integer types that implement TryFrom). If it doesn't round, then most values a program might organically generate are invalid (as I argued before, almost all results of non-trivial float calculations are not exactly integers) and thus the caller would probably have to be aware of this mismatch call one of the rounding methods anyway -- and thus it is not a large burden for them to also do the conversion to integer manually.

Basically, this sort of conversion seems more appropriate for monomorphic code rather than for generic APIs.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 31, 2018
@udoprog
Copy link
Contributor Author

udoprog commented Jan 31, 2018

To try to keep things in check, there were two things being discussed:

  • If TryFrom for conversions to floats should round.
  • If it's OK for a generic method accepting TryFrom to take both integers and floats (see below).

To try to summarize the first point, these are the questions I'm currently considering. It might be useful as a focal point:

  • Should From/TryFrom conversions be lossless?
  • Is rounding expected during a default conversion?
  • Is rounding (even by a minuscule amount) a lossy or a lossless operation?

EDIT: A similar issue was discussed for lossless_float_conv in #29129, comment by @alexcrichton:
Would a similar reasoning apply to this PR today?

Interesting argument, but I'm not quite convinced. [..]

It seems you perceived I was using at as argument against rounding. It was actually a response to this:

But @petrochenkov has a good point. Do we really want, for example, a function that takes a TryFrom parameter to accept a float value silently?

So, are we OK with generic functions using TryFrom to accept both integers and floats?

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jan 31, 2018

Because there are multiple possible ways to have fallible float -> int conversions, whereas there's only one obvious infallible int -> float conversion, I don't believe these two questions can be decoupled entirely. In particular, whether "it's OK for a generic method accepting TryFrom to take both integers and floats" depends on what it means for a function to accept a float and convert it to an integer with TryFrom. That's why I gave two different rationales depending on whether the conversion would round or not in this part:

Interesting argument, but I'm not quite convinced. [..]

Edit: Not to say the answer to the second question is necessarily different depending on whether TryFrom rounds (in fact I previously argued it isn't), but rounding influences why one might say yes or no to the second question.

@udoprog
Copy link
Contributor Author

udoprog commented Jan 31, 2018

Ok! I agree. But that does seem to indicate that we should decouple them, since your stance on the second question (If it's OK for a generic method accepting TryFrom to take both integers and floats) depends on the outcome of the first (If TryFrom for conversions to floats should round)?

@hanna-kruppe
Copy link
Contributor

There's a dependency but that doesn't necessarily mean it's best to settle the first question entirely before we turn to the second question, since if the answer to the second question is "no" and we don't implement TryFrom at all, the previous discussion was for naught. Well, it could inform the design of infallible conversions that don't use TryFrom but something else, but that discussion is somewhat different because then we could have multiple different conversions (e.g., rounding and not-rounding) side-by-side.

@udoprog
Copy link
Contributor Author

udoprog commented Jan 31, 2018

So this is me trying to wrap my head around the problem. Instead of trying to justify that there are float-based operations that would fall outside of exact conversion, I'll try to discuss whether it's sensible to pick a default policy to begin with, and what TryFrom/From means.

Which rounding policy should be used?

Some GitHub searching results in:

trunc() as, 132 occurrences.
round() as, 172 occurrences.
ceil() as, 371 occurrences.
floor() as, 260 occurrences.

Note that since casting floats in Rust is unsound (#10184), the cases which cast (and trunc) might not constitute an active choice. More “do something which makes this an integer”. This limited dataset leads me to believe that picking one policy runs the risk of making a large fraction of the community unhappy about the choice.

Making the conversion fail on inexact values can be a hint to the user that they should pick. This would be unfortunate to do at runtime, which would speak for instead having specialized conversion methods for floats (f32::to_i32_exact through a trait like ExactFloatConversion).

Outside of casting (which generally truncs), other strongly typed languages seem to force you to actively pick your policy as well.

From/TryFrom has been lossless so far

Many (all?) of the std trait implementations are lossless. Rounding a float is primarily a way to make members of one type fit into another. A similar but more extreme example of this would be to clampen or project an integer which is to wide to fit into a narrower sibling. The existing conversion methods do not attempt to “coerce” types in this manner.

It’s unfortunate that this is not stated clearly, since we are now having to deal with what users would expect out of the conversion traits, which is hard to quantify. But float conversions might also be so special that “lossless conversion” isn’t meaningful enough to warrant inclusion through TryFrom.

My personal observation is that the existing fallible conversion methods fail when information would otherwise be lost.

Forward?

If this change was just the introduction of a couple more TryFrom impls it would have been more straightforward. But it seems like it could be RFC-level in scope? How do we move forward?

@hanna-kruppe
Copy link
Contributor

But float conversions might also be so special that “lossless conversion” isn’t meaningful enough to warrant inclusion through TryFrom.

Yeah I think this summarizes a large part of my position towards this. You give good reasons that the TryFrom impl should not round, and if it did round it would not be obvious which choice would be the most generally useful, so I'm strongly leaning towards not providing these impls at all.

Maybe one step before a full RFC would be soliciting input from more people, on internals.rust-lang.org or in an issue?

@udoprog
Copy link
Contributor Author

udoprog commented Feb 5, 2018

@sunfishcode
Copy link
Member

A tricky case to consider here is -0.0. Converting it to integer and back will produce +0.0, which will compare equal to -0.0, making the TryFrom implementation here succeed, even though the conversion was lossy. I don't yet have an opinion about what to do about this.

@BatmanAoD BatmanAoD added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2018
@BatmanAoD
Copy link
Member

Per discussion in IRC and advice from @Mark-Simulacrum, I am tagging this as waiting-on-author.

@udoprog Please keep your eye on that rust-lang internals thread; if and when a consensus is reached and you've implemented the desired functionality, ask for a re-review.

@arazabishov
Copy link
Member

@udoprog ping from the release team :) It seems that discussion on rust-lang internals stopped a while ago, any plans regarding this PR?

@udoprog
Copy link
Contributor Author

udoprog commented Feb 23, 2018

@arazabishov I'd propose closing this.

I might attempt to be resurrect this as a number of specialized methods for f32/f64 in the future. But there doesn't appear to be enough support for any one way to implement this through TryFrom.

@arazabishov
Copy link
Member

@udoprog closing this PR for now then.

@alecmocatta
Copy link
Contributor

I can certainly see a use case for this. I for one, spurred on by clippy, already use either From or TryFrom in place of as for almost all conversions. This avoids lossy casts masking bugs in my code.

Casting between floats and integers is the last bastion of as usage in my code. Unfortunately, it's also potentially UB as noted by OP.

TryFrom for floats would enable me to replace my_float.floor() as usize with a defined, safe, lossless conversion: my_float.floor().try_into().unwrap().


My personal design preferences:

TryFrom should not round. It should guarantee at least PartialEq::eq(T::try_from(U::try_from(x)?)?, x), and perhaps even be fully lossless, probably depending on whether -0.0 is specified to succeed or not. I have no strong feelings on that, though I do feel -0.0 shouldn't succeed if converting to an unsigned integer.

It should be implemented between f32 and f64 as well.

@Andlon
Copy link

Andlon commented Nov 16, 2021

I'm flabbergasted that this was closed. It seems obvious to me that TryFrom should never automatically do any kind of rounding, as this is too important of a decision to be "implicit". It seems to me that the conversion fulfills exactly the spirit of TryFrom: We try to convert, if such a conversion can be made, otherwise we fail. The intersection of representable numbers between floating-point numbers and integers is quite large (f64 can represent all of u32, for example).

I work with numerics, and there are absolutely many use cases for fallible exact conversion from integer to float or vice versa. Let me give my most recent (real-world) motivating example:

We're parsing a matrix stored in some format. We expect the matrix to be an integer matrix, but the format doesn't clearly distinguish between integers and floating-point values (perhaps even stored as 1.0, 2.0 and so on), so it might need to be parsed as floating-point values, then possibly converted to integers later. This is a textbook example for TryFrom.

I urge you to reconsider this issue: in my mind the right thing to do here truly is unambiguous: if an integer can be exactly represented as a floating-point value, then convert, otherwise fail. And vice versa for float -> int.

@DanteMarshal
Copy link

Any progress on this ? Any new issues to continue the discussion on ?

I also have a use-case for exact matching floats to integers in my project. I had to define a whole new Trait for this to work in my crate, and implement it for floats and numbers as below in a macro :

impl TryConvert<TFlt> for TNum {
    fn try_convert(self) -> Result<TFlt, ConvertionError> {
        let ret = self as TFlt;
        if self == (ret as TNum) {
            Ok(ret)
        } else {
            Err(ConvertionError::out_of_precision::<TNum, TFlt>(self, ret))
        }
    }
}

The documentation for try_from on other numeric types states :

Try to create the target number type from a source number type. This returns an error if the source value is outside of the range of the target type.

We can argue that "The range of the target type" for an integer is an exact integer, which means the target should not accept non-integers values even if they reside within the MIN..MAX range of the type, which includes something like 0.5f32 or even 8.00001f64.
If the user wants to round the float to an integer or even just apply the rounding only if they're closer than a specific threshold, then they should either use the round method or implement their own conversion method since that is not only more specific but also depends on an external parameter, namely the threshold at which they want to round the float.
A clearly better expected behavior for TryFrom is the case of an exact match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.