RFC: `int` semantics #1

cometkim · 2024-11-18T16:36:42Z

cometkim · 2024-11-18T16:44:29Z

This is still TBD, but the suggested actions to the compiler are:

Fix the assumption broken by FFIs. We need to explicitly do fromNumber when referencing external sources, like this:
```
@val external extNumber: int = "extNumber"

let fromExt = extNumber // here
```
Change the implementation of multiply from Math.imul to fromNumber
Add Overflow_value built-in exception to resolve ambiguity of abs(min_value)

Any early feedback and additional suggestions are welcome :)

cristianoc · 2024-11-19T06:01:31Z

Have you measured the perf implications of changing multiplication from the current one?
Also x+y needs similar changes if it wants to satisfy the definition.

shulhi · 2024-11-19T06:03:02Z

I don't think it's a good idea to implicitly convert infinity to min/max value of int. Infinity has a different semantic compared to the min/max bound of int.

cometkim · 2024-11-19T07:02:31Z

Infinity has a different semantic compared to the min/max bound of int.

So does 0, and more problematic. The original resolution I mentioned in the issue was about stdlib, so there will be more suggestion about it into the "API Consideration" section. Something like, "consider using option on the conversion API like Float.toInt"

cometkim · 2024-11-19T16:13:37Z

Have you measured the perf implications of changing multiplication from the current one?

It depends on the environment.

In my measurements, in Node.js, inlined mul (multiply & truncate) is slightly faster than Math.imul. Both are so fast that it doesn't seem to mean much, but on an amd64 Linux machine, inlined mul is 8% faster.

I ran it on quickjs to see the impact of the optimization. The results are 30% faster for inlined mul. This implies that optimizations by simple inlining are more achievable than by JIT.

On the other hand, in Bun, Math.imul and mul are identical, but when I run on a non-inlined function, it's 45% faster than others.

suite

import { summary, run, bench } from './node_modules/mitata/src/main.mjs';

const min_value = -(2 ** 31);
const max_value = 2 ** 31 - 1;

function* randomIntegers(n) {
  for (let i = 0; i < n; i++) {
    yield Math.floor(Math.random() * 2 ** 32) + min_value;
  }
}

const length = 1000;
const x = [...randomIntegers(length)];
const y = [...randomIntegers(length)];

summary(() => {
  bench('Math.imul', () => {
    for (let i = 0; i < length; i++) {
      void Math.imul(x[i], y[i]);
    }
  });

  // function mul(x, y) {
  //   return x * y | 0;
  // }
  // bench('mul', () => {
  //   for (let i = 0; i < length; i++) {
  //     mul(x[i], y[i]);
  //   }
  // });

  bench('mul  (inline)', () => {
    for (let i = 0; i < length; i++) {
      void (x[i] * y[i] | 0);
    }
  });
});

await run();

cristianoc · 2024-11-19T17:07:04Z

Have you measured the perf implications of changing multiplication from the current one?

It depends on the environment.

In my measurements, in Node.js, inlined mul (multiply & truncate) is slightly faster than Math.imul. Both are so fast that it doesn't seem to mean much, but on an amd64 Linux machine, inlined mul is 8% faster.

I ran it on quickjs to see the impact of the optimization. The results are 30% faster for inlined mul. This implies that optimizations by simple inlining are more achievable than by JIT.

On the other hand, in Bun, Math.imul and mul are identical, but when I run on a non-inlined function, it's 45% faster than others.

suite

This is great though I now don't understand what "Change the implementation of multiply from Math.imul to fromNumber" means. Concretely the compilation of (x,y) => x*y should change from what generated code to what generated code?

cometkim · 2024-11-20T05:04:02Z

Have you measured the perf implications of changing multiplication from the current one?

It depends on the environment.
In my measurements, in Node.js, inlined mul (multiply & truncate) is slightly faster than Math.imul. Both are so fast that it doesn't seem to mean much, but on an amd64 Linux machine, inlined mul is 8% faster.
I ran it on quickjs to see the impact of the optimization. The results are 30% faster for inlined mul. This implies that optimizations by simple inlining are more achievable than by JIT.
On the other hand, in Bun, Math.imul and mul are identical, but when I run on a non-inlined function, it's 45% faster than others.
suite

This is great though I now don't understand what "Change the implementation of multiply from Math.imul to fromNumber" means. Concretely the compilation of (x,y) => x*y should change from what generated code to what generated code?

let f = (a, b) => a * b

currently emit this code

function f(a, b) {
  return Math.imul(a, b);
}

I'm requesting to change it to this code

function f(a, b) {
  return a * b | 0;
}

just like we do for add

cometkim · 2024-11-20T05:06:40Z

Since both a and b is type-checked, I expect fromNumber(x) outputs x | 0.

Runtime dispatch is required when it is outside the compiler's capabilities.

cristianoc · 2024-11-20T05:44:54Z

OK I think the entire RFC can be written in a few lines if one defines operations using | 0 and only comments and explains on the deviations from this.
For FFI, +-Infinity is treated specially, for division/mod one has div0 check, for abs there's an exception.
Then one can discuss the reason for these special treatments and comment on differences w.r.t. how things are done in master.

cristianoc · 2024-11-20T08:55:15Z

For example, in the case of float to int conversion, one can have the baseline (num) => num | 0 and only discuss the deviations from this baseline:
1 NaN: currently no deviation suggested. That would mean NaN goes to 0. I doubt that's the intention and it should to go None instead. (Or an exception, but let's ignore the distinction here).
2 Infinity: current suggestion is to go to max/min int32. That is one possibility. The other possibility is to use None. What comes to mind is +Infinity + 1, whether it should go to min_int (because max_int+1) or to None/exception.

cometkim · 2024-11-20T16:06:22Z

Oh sorry, I had to focus on explaining the current behavior as it is before making changes 😞

cometkim · 2024-11-20T16:18:10Z

1 NaN: currently no deviation suggested. That would mean NaN goes to 0. I doubt that's the intention and it should to go None instead. (Or an exception, but let's ignore the distinction here).

IMO we can leave it as the responsibility of stdlib.

2 Infinity: current suggestion is to go to max/min int32. That is one possibility. The other possibility is to use None. What comes to mind is +Infinity + 1, whether it should go to min_int (because max_int+1) or to None/exception.

As @glennsl originally pointed out, Infinity and NaN are not included in int semantics, which would invalidate reasons for having int type. So does None

My intention in the fromNumber definition was not to try representing the Infinity value, but to select a more useful "garbage value" in the integer range. If that doesn't work, then we should return to 0.

cometkim · 2024-11-20T16:20:21Z

I added the Overflow_value to the abs definition because I assumed there would be no compatibility issues since it is a primitive that does not currently exist. If it doesn't work universally, it can also be left out.

cristianoc · 2024-11-20T16:46:22Z

1 NaN: currently no deviation suggested. That would mean NaN goes to 0. I doubt that's the intention and it should to go None instead. (Or an exception, but let's ignore the distinction here).

IMO we can leave it as the responsibility of stdlib.

2 Infinity: current suggestion is to go to max/min int32. That is one possibility. The other possibility is to use None. What comes to mind is +Infinity + 1, whether it should go to min_int (because max_int+1) or to None/exception.

As @glennsl originally pointed out, Infinity and NaN are not included in int semantics, which would invalidate reasons for having int type. So does None

My intention in the fromNumber definition was not to try representing the Infinity value, but to select a more useful "garbage value" in the integer range. If that doesn't work, then we should return to 0.

Actually I was referring to the stdlib function to convert float to int.
And asking the question of: on what values it should return a result (and what result), otherwise return None or an exception.

Then similar questions for all the operators.

cometkim · 2024-11-20T17:12:25Z

Actually I was referring to the stdlib function to convert float to int.
And asking the question of: on what values it should return a result (and what result), otherwise return None or an exception.

Then I think stdlib should always use option for any kind of float to int conversions, but it seems not possible without making breaking changes to existing interfaces of Core/Belt/Js, or we have to add another migration requirement since v12. I'm curious to hear @zth's thoughts on this.

zth · 2024-11-21T14:24:07Z

The way we've dealt with these things before is to have 2 or more variants in the stdlib. In this case, we could for example have toIntOpt return an option if it's NaN, and toInt raise if it's NaN.

cristianoc · 2024-11-21T14:32:47Z

The way we've dealt with these things before is to have 2 or more variants in the stdlib. In this case, we could for example have toIntOpt return an option if it's NaN, and toInt raise if it's NaN.

That makes sense.
@cometkim what do you think is better for +/- infinity: return None/exception or max/min int?

cometkim · 2024-11-21T14:37:38Z

Using result for detail errors for each case is too much?

cometkim · 2024-11-21T16:06:07Z

For example, we can have Safe version APIs as a module

type errorKind = [#Overflow_value]

exception ConversionError(errorKind)

let toInt: float => int // raise ConversionError

module Safe = {
  let toInt: float => result<int, errorKind>
}

We could also create both Safe and Unsafe modules and provide a flag to open one by default.

cometkim · 2024-11-21T16:19:01Z

what do you think is better for +/- infinity: return None/exception or max/min int?

If there is an alternative API, then assuming conversion Infinity to int is done at least via the API, and users do not use primitives directly unless they are intended.

I prefer using result type in this case too. If you think that is overkill, None is fine. That is also the choice of many existing libraries like tablecloth.

cristianoc · 2024-11-22T10:42:42Z

what do you think is better for +/- infinity: return None/exception or max/min int?

If there is an alternative API, then assuming conversion Infinity to int is done at least via the API, and users do not use primitives directly unless they are intended.

I prefer using result type in this case too. If you think that is overkill, None is fine. That is also the choice of many existing libraries like tablecloth.

I guess in this case, it f there are 2-3 kinds of errors, result makes sense to document what they are.
I would consider as payload a variant, whose cases have identical names to the corresponding exceptions that are raised by the other conversion function that raises instead of returning Error.

cristianoc · 2024-11-22T10:46:12Z

what do you think is better for +/- infinity: return None/exception or max/min int?

If there is an alternative API, then assuming conversion Infinity to int is done at least via the API, and users do not use primitives directly unless they are intended.

I prefer using result type in this case too. If you think that is overkill, None is fine. That is also the choice of many existing libraries like tablecloth.

I guess in this case, it f there are 2-3 kinds of errors, result makes sense to document what they are.
I would consider as payload a variant, whose cases have identical names to the corresponding exceptions that are raised by the other conversion function that raises instead of returning Error.

So if one uses the version returning result one would do eg

| Error(NaN) => ...

and when converting to using the function that raises:

| exception NaN => ...

cometkim · 2024-11-24T09:23:15Z

what do you think is better for +/- infinity: return None/exception or max/min int?

If there is an alternative API, then assuming conversion Infinity to int is done at least via the API, and users do not use primitives directly unless they are intended.
I prefer using result type in this case too. If you think that is overkill, None is fine. That is also the choice of many existing libraries like tablecloth.

I guess in this case, it f there are 2-3 kinds of errors, result makes sense to document what they are. I would consider as payload a variant, whose cases have identical names to the corresponding exceptions that are raised by the other conversion function that raises instead of returning Error.

One minor problem is that if we introduce this API style for float to int conversion, it will be expected for other types of conversion like string to int.

Then we'll need more error definitions per conversion. We need to propose a module convention to split these errors.

draft RFC for int semantics

4bd7186

cometkim self-assigned this Nov 18, 2024

assign a PR number

6b58c97

add restriction to avoid Math.imul

5534179

add a future consideration

a3aa732

cometkim mentioned this pull request Nov 22, 2024

Upgrade exponentiation to unified operator rescript-lang/rescript#7153

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: `int` semantics #1

RFC: `int` semantics #1

cometkim commented Nov 18, 2024 •

edited

Loading

cometkim commented Nov 18, 2024

cristianoc commented Nov 19, 2024

shulhi commented Nov 19, 2024

cometkim commented Nov 19, 2024

cometkim commented Nov 19, 2024 •

edited

Loading

cristianoc commented Nov 19, 2024

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024 •

edited

Loading

cristianoc commented Nov 20, 2024

cristianoc commented Nov 20, 2024 •

edited

Loading

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024

cristianoc commented Nov 20, 2024

cometkim commented Nov 20, 2024

zth commented Nov 21, 2024

cristianoc commented Nov 21, 2024

cometkim commented Nov 21, 2024

cometkim commented Nov 21, 2024

cometkim commented Nov 21, 2024

cristianoc commented Nov 22, 2024

cristianoc commented Nov 22, 2024

cometkim commented Nov 24, 2024

RFC: int semantics #1

Are you sure you want to change the base?

RFC: int semantics #1

Conversation

cometkim commented Nov 18, 2024 • edited Loading

cometkim commented Nov 18, 2024

cristianoc commented Nov 19, 2024

shulhi commented Nov 19, 2024

cometkim commented Nov 19, 2024

cometkim commented Nov 19, 2024 • edited Loading

cristianoc commented Nov 19, 2024

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024 • edited Loading

cristianoc commented Nov 20, 2024

cristianoc commented Nov 20, 2024 • edited Loading

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024

cometkim commented Nov 20, 2024

cristianoc commented Nov 20, 2024

cometkim commented Nov 20, 2024

zth commented Nov 21, 2024

cristianoc commented Nov 21, 2024

cometkim commented Nov 21, 2024

cometkim commented Nov 21, 2024

cometkim commented Nov 21, 2024

cristianoc commented Nov 22, 2024

cristianoc commented Nov 22, 2024

cometkim commented Nov 24, 2024

RFC: `int` semantics #1

RFC: `int` semantics #1

cometkim commented Nov 18, 2024 •

edited

Loading

cometkim commented Nov 19, 2024 •

edited

Loading

cometkim commented Nov 20, 2024 •

edited

Loading

cristianoc commented Nov 20, 2024 •

edited

Loading