
1e100 is considered convertible to int64 and the result is invalid #93

Closed
alaviss opened this issue Dec 4, 2021 · 10 comments · Fixed by #157

Comments

@alaviss (Contributor) commented Dec 4, 2021

Example

This was extracted from tcompiletime_range_checks.nim:45, where the declaration should have produced an error but didn't.

echo int64(1e100)

Current Output

0

Expected Output

A compile-time error

Possible Solution

There are two parts to this issue:

First is this conditional:

elif src.kind in nkFloatLit..nkFloat64Lit and
    (classify(src.floatVal) in {fcNan, fcNegInf, fcInf} or
      src.floatVal.int64 notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp)):
  result = convNotInRange

Since this value won't fit in int64, the conversion becomes low(int64), which of course is a valid int64 value (though note that the C standard leaves this conversion undefined). I think we should either do the comparison in floats or use int128, which leads to the second part:

of tyFloat..tyFloat64:
  result = newIntNodeT(toInt128(getFloat(a)), n, idgen, g)

toInt128(1e100) produces 0 here, which is invalid, and also means we can't use int128 as-is to perform our checks.
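The zero result is not arbitrary: the float64 closest to 1e100 is an integer of the form m * 2^280 (a 53-bit mantissa scaled by a large power of two), so its low 128 bits are all zero, and a conversion that only keeps 128 bits yields 0. This can be checked in Python, whose floats are IEEE-754 float64 and whose integers are arbitrary-precision:

```python
import math

x = 1e100
m, e = math.frexp(x)             # x == m * 2**e with 0.5 <= m < 1
print(e)                         # 333: the value is about 2**332.2
# The exact integer value of the float has at least e - 53 trailing zero
# bits (here well over 128), so truncation to the low 128 bits gives 0.
assert int(x) % (1 << 128) == 0
```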

Tagging @krux02 here, as he is the author of this code. I'm not sure what the best approach is, though I think we should add a range check for numbers that won't fit in int128 (as a fallback in case the prior check fails).
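For illustration, the shape of the flawed predicate can be modeled in Python (a toy sketch with hypothetical names; it bakes in the x86 behavior of returning INT64_MIN for out-of-range conversions, which the C standard leaves undefined):

```python
import math

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def c_float_to_int64(x: float) -> int:
    # Models what x86's cvttsd2si does for an out-of-range input: it
    # returns INT64_MIN. ISO C calls this conversion undefined behavior.
    i = math.trunc(x)
    return i if INT64_MIN <= i <= INT64_MAX else INT64_MIN

def flawed_range_check(x: float) -> bool:
    # The buggy shape of the check: convert first, range-check second.
    # The converted value is always a valid int64, so it never fires.
    return INT64_MIN <= c_float_to_int64(x) <= INT64_MAX

print(flawed_range_check(1e100))  # True, even though 1e100 does not fit
```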

Additional Information

  • Found while testing nimskull on ARM machines
$ nim -v
Nim Compiler Version 1.6.0 [Linux: amd64]
Copyright (c) 2006-2021 by Andreas Rumpf

Source hash: ffb7dc7223ff0eb6768add1e065e406af3397ed6
Source date: 2021-12-03

active boot switches: -d:release

@krux02 (Contributor) commented Dec 5, 2021

diff --git a/compiler/int128.nim b/compiler/int128.nim
index ee88be97c..efe890d13 100644
--- a/compiler/int128.nim
+++ b/compiler/int128.nim
@@ -3,7 +3,7 @@
 ## hold all from ``low(BiggestInt)`` to ``high(BiggestUInt)``, This
 ## type is for that purpose.
 
-from math import trunc
+from math import trunc, pow
 
 type
   Int128* = object
@@ -541,7 +541,14 @@ proc ldexp(x: float64, exp: cint): float64 {.importc: "ldexp", header: "<math.h>
 
 template bitor(a,b,c: Int128): Int128 = bitor(bitor(a,b), c)
 
+proc inInt128Range*(arg: float64): bool =
+  const lowerBound = -pow(2'f64, 127) 
+  const upperBound = +pow(2'f64, 127) # 2^127-1 can't be represented with float64 precision
+  lowerBound <= arg and arg < upperBound
+
 proc toInt128*(arg: float64): Int128 =
+  assert(inInt128Range(arg), "out of range")
+
   let isNegative = arg < 0
   let v0 = ldexp(abs(arg), -100)
   let w0 = uint64(trunc(v0))
diff --git a/compiler/semexprs.nim b/compiler/semexprs.nim
index e3defe03a..9e82902e8 100644
--- a/compiler/semexprs.nim
+++ b/compiler/semexprs.nim
@@ -156,10 +156,11 @@ proc checkConvertible(c: PContext, targetTyp: PType, src: PNode): TConvStatus =
       if src.kind in nkCharLit..nkUInt64Lit and
           src.getInt notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp):
         result = convNotInRange
-      elif src.kind in nkFloatLit..nkFloat64Lit and
-          (classify(src.floatVal) in {fcNan, fcNegInf, fcInf} or
-            src.floatVal.int64 notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp)):
-        result = convNotInRange
+      elif src.kind in nkFloatLit..nkFloat64Lit:
+        if not src.floatVal.inInt128Range:
+          result = convNotInRange
+        elif src.floatVal.toInt128 notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp):
+          result = convNotInRange
     elif targetBaseTyp.kind in tyFloat..tyFloat64:
       if src.kind in nkFloatLit..nkFloat64Lit and
           not floatRangeCheck(src.floatVal, targetTyp):
diff --git a/todo.org b/todo.org
index 51576fa2a..28a152335 100644

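A side note on the strict `<` upper bound in inInt128Range above: 2^127 - 1 has no exact float64 representation and rounds up to exactly 2^127, so a `<=` comparison against it would wrongly accept an out-of-range value. A quick Python check (Python floats are IEEE-754 float64):

```python
# 2**127 is exactly representable in float64 (it is a power of two);
# 2**127 - 1 is not, and rounds up to the very same float:
assert float(2**127 - 1) == 2.0**127

upper = 2.0**127
x = 2.0**127                     # does not fit in Int128 (max is 2**127 - 1)
assert x <= float(2**127 - 1)    # a <= bound would wrongly accept it...
assert not (x < upper)           # ...the patch's strict < correctly rejects it
```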
@krux02 (Contributor) commented Dec 9, 2021

Here is more inconsistency:

proc foo1(arg: float64): string =
  let x = int64(arg)
  $x

proc foo2(arg: float64): string {.compileTime.} =
  let x = int64(arg)
  $x

echo foo1(1e100)
echo foo2(1e100)
echo int64(1e100)

output:

-9223372036854775808
-9223372036854775808
0

@zacharycarter (Contributor) commented

I'm currently working on replacing the compiler's int128 implementation with signed and unsigned int128 implementations. I know we want to revisit the design of the compiler overall, but for now I think providing a better implementation will make these casts easier to understand and simpler to spec. Once I have finished the implementations and the existing tests are passing, I will add new ones for the float64 to int128 conversion.

@krux02 (Contributor) commented Jan 13, 2022

@zacharycarter I've written the int128 type and of course I am biased here, but I don't think the root cause of this bug is the int128 type or its implementation. I've worked a lot on this type, I know how it works, and I've written a lot of integer handling for different sizes in the VM. There is certainly some shared knowledge between this type and the VM type; reimplementing int128 as a signed type (sign + absolute value) would throw all of that away, and therefore I don't like the idea.

So I would really like to understand your problems with the int128 type as it is.

Yes, there are bugs, but to my knowledge they are not because of the int128 type; they are because of a problematic specification of Nim and developers pulling the language in different directions.

My suggestion for this fix would be the following:

  • Make damn sure that compile time and runtime behave exactly the same.

This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime and at compile time (in the VM), and that can be implemented in semfold as well. Yes, this means less error detection at semfold.

@alaviss (Contributor, Author) commented Jan 13, 2022

> This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime and at compile time (in the VM), and that can be implemented in semfold as well.

Note that this is not guaranteed. Nim does not have float range checks, so the behavior varies wildly between architectures. We should look into defining runtime checks for float conversions as well.

@zacharycarter (Contributor) commented

> @zacharycarter I've written the int128 type and of course I am biased here, but I don't think the root cause of this bug is the int128 type or its implementation. I've worked a lot on this type, I know how it works, and I've written a lot of integer handling for different sizes in the VM. There is certainly some shared knowledge between this type and the VM type; reimplementing int128 as a signed type (sign + absolute value) would throw all of that away, and therefore I don't like the idea.
>
> So I would really like to understand your problems with the int128 type as it is.

@krux02 - thanks for the reply my dude! I can't say I necessarily have a problem with the implementation; I just don't understand parts of it, and since I needed some frame of reference to test my findings against, I thought the best way would be to port an implementation I knew was correct and could test against.

> Yes, there are bugs, but to my knowledge they are not because of the int128 type; they are because of a problematic specification of Nim and developers pulling the language in different directions.

@krux02 - good to know! I can stop wasting my time here then xD

> My suggestion for this fix would be the following:
>
>   • Make damn sure that compile time and runtime behave exactly the same.
>
> This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime and at compile time (in the VM), and that can be implemented in semfold as well. Yes, this means less error detection at semfold.

Okay, I see - thanks for the explanation. I don't really know anything about the compiler backend other than uni textbook knowledge of compilers, and I've barely done anything with Nim's, but I'll look into it.

> This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime and at compile time (in the VM), and that can be implemented in semfold as well.
>
> Note that this is not guaranteed. Nim does not have float range checks, so the behavior varies wildly between architectures. We should look into defining runtime checks for float conversions as well.

Okay, first I will check out what Arne is suggesting regarding semfold, and then I'll look into this.

@saem (Collaborator) commented Jan 14, 2022

For those of us peripherally following along, it seems there are two lessons here:

  1. literal typing/conversion is broken, we should track width and detect narrowing
  2. VM math needs tighter specification, see below

The direction is that VM math operations in the compiler, especially floating point ones, need to be agnostic of the compiler host platform.

Instead they should work per some canonical understanding of the target backend, or a single agreed-upon understanding.

That way, when I compile code on some funky CPU, all compile-time evaluation is identical to that of some more popular CPU/microarchitecture for the same code and parameters.

This especially matters for floating point numbers, for very wide types that are rarely natively supported (128-bit integers), and for targets that don't have real int support (JS).

Does that sound about right?

@zacharycarter (Contributor) commented

> Does that sound about right?

It does but I'm still not sure what to do here in order to make any progress. I think I'm in slightly over my head 😅

I might try to find something simpler to work on - lately I haven't had much time to work on anything and I don't want to commit to something I barely understand to begin with.

@krux02 (Contributor) commented Jan 15, 2022

> Okay, I see - thanks for the explanation. I don't really know anything about the compiler backend other than uni textbook knowledge of compilers, and I've barely done anything with Nim's, but I'll look into it.

Well I can give you a brief overview of how Nim works here. There are three ways to compute an expression in Nim:

  1. compile to C/C++/JS and execute at runtime
  2. compile to bytecode and execute at compile time (the VM)
  3. semfold

I think options 1 and 2 are pretty much clear - not in detail, but generally enough for this conversation. Semfold, though, is a different thing. It is just a compiler pass that does AST substitutions for simple expressions that can be computed at compile time. This type of code execution is usually implemented in compilers (such as C compilers) as an optimization pass, so that (1+5) becomes a literal 6. I think C++ template metaprogramming is based on something like this. But in Nim, semfold is completely redundant with the VM (and with the semfold equivalent in the backend compiler), except for the part where semfold generates errors for out-of-range values.
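A constant-folding pass of this shape can be sketched in Python (a toy analogue; semfold itself operates on Nim's typed AST and does much more):

```python
import ast

def fold(expr: str) -> str:
    """Toy constant folder: bottom-up substitution of computable
    subtrees, so that e.g. (1 + 5) * x becomes 6 * x."""
    tree = ast.parse(expr, mode="eval")

    class Folder(ast.NodeTransformer):
        def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
            self.generic_visit(node)  # fold children first
            if isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
                ops = {ast.Add: lambda a, b: a + b,
                       ast.Sub: lambda a, b: a - b,
                       ast.Mult: lambda a, b: a * b}
                op = ops.get(type(node.op))
                if op is not None:
                    return ast.copy_location(
                        ast.Constant(op(node.left.value, node.right.value)), node)
            return node

    return ast.unparse(Folder().visit(tree))

print(fold("(1 + 5) * x"))  # 6 * x
```

The point of doing this in a compiler (rather than just running the code) is exactly the part krux02 mentions: the fold site is also where out-of-range literals can be diagnosed at compile time.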

Currently I don't know what to do, or whether my suggestion is still a good one. After all, maybe it is good if the conversion of one googol (yes, 1e100 has a name) to int64 raised an error at runtime and in the VM as well.

@alaviss (Contributor, Author) commented Jan 15, 2022

Right now the only defined rules in Nim are:

  • intX <-> intY is range-checked.
  • intX -> uintY is not range-checked and is a bit-reinterpretation iirc.
  • uintX -> intY is range-checked.
  • floatX -> intY is undefined if floatX, with the part after the `.` dropped, cannot be represented by intY (inherited from C).
  • intX -> floatY produces the nearest value representable by the target floating point type (inherited from C). The resulting float might be larger or smaller than intX itself.

I think the right thing to do here is to make floatX -> intY defined by enforcing range check for it across runtime and compile time.
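A range-checked floatX -> intY conversion along those lines could look like the following sketch (a hypothetical Python helper, not Nim's implementation; note the strict `<` upper bound, since 2^63 - 1 is not representable in float64):

```python
import math

def range_checked_int64(x: float) -> int:
    """A defined float64 -> int64 conversion: truncate toward zero, but
    raise on NaN, infinities, and out-of-range values instead of leaving
    the result undefined."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError(f"cannot convert {x} to int64")
    # 2**63 - 1 is not representable in float64, so compare with a strict
    # `<` against 2.0**63; the lower bound -2**63 is exactly representable.
    if not (-2.0**63 <= x < 2.0**63):
        raise ValueError(f"{x} out of range for int64")
    return math.trunc(x)

print(range_checked_int64(1.9))   # 1
# range_checked_int64(1e100) would raise ValueError instead of wrapping
```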

bors bot added a commit that referenced this issue Jan 28, 2022
157: use improved range check of 128 for float64 r=alaviss a=krux02

 * Add a range check function to test if a float64 is in range of the `int128` type.
 * Use that range check function for `semexpr.checkConvertible`
 * fixes #93

While this fixes the original issue mentioned above, it does not fix or implement the discussions in that issue. They need to be ported to their own issue.

Co-authored-by: Arne Döring <arne.doering@gmx.net>
bors bot closed this as completed in 1cc5367 via #157 on Jan 28, 2022