
std/int128s and type Duration = distinct int128 #399

Open · timotheecour opened this issue Jul 16, 2021 · 10 comments

@timotheecour commented Jul 16, 2021

proposal

add a std/int128s module containing 2 types: int128 and uint128

The API would be similar to std/jsbigints (https://nim-lang.github.io/Nim/jsbigints.html)
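To give a feel for the intended ergonomics, usage might look something like this (purely illustrative; it assumes i128 constructors analogous to jsbigints' big, and nothing here is a final API):

import std/int128s  # hypothetical module proposed by this RFC

let a = i128("170141183460469231731687303715884105727")  # 2^127 - 1
let b = i128(1)
echo a - b   # arithmetic with wraparound semantics, like the other int types
echo a < b   # comparisons work as for native ints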

example use case 1: type Duration = distinct int128

one perfect use case is representing Duration. Duration currently uses a suboptimal design:

type Duration* = object
  seconds: int64
  nanosecond: NanosecondRange

which makes every operation more costly than needed and admits invalid bit representations in the underlying object. Using type Duration = distinct int128 would be both more natural and more efficient.
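For illustration, a minimal sketch of what that could look like, assuming a std/int128s that defines the usual arithmetic and comparison operators for int128 (everything below is hypothetical, not a final design):

import std/int128s  # hypothetical module

type Duration = distinct int128  # total nanoseconds, one flat value

# operators can simply be borrowed from the base type
proc `+`(a, b: Duration): Duration {.borrow.}
proc `-`(a, b: Duration): Duration {.borrow.}
proc `<`(a, b: Duration): bool {.borrow.}
proc `==`(a, b: Duration): bool {.borrow.}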

I consider this RFC a prerequisite for #383 (APIs should have typesafe, self-documenting time units, eg proc sleep(t: Duration), allowing sleep(1.millisecond) etc).

Note: whether Duration can be redefined as type Duration = distinct int128 is TBD, but even if backward compatibility prevents it, we can still define std/durations with type TimeUnit = distinct int128 and use it throughout the stdlib to represent durations in APIs (ie, for #383); std/times can then provide conversions TimeUnit <=> Duration.

example use case 2: compiler/int128 could reuse std/int128s

(and benefit from better performance when the native type is available)

why not use std/bigints instead?

std/bigints is also tracked and should eventually make its way into the stdlib, but it is no replacement for int128, for several reasons:

  • int128 will always be more performant than std/bigints for representing 128-bit ints
  • std/bigints involves many more dependencies and design tradeoffs compared to int128, which has a straightforward implementation
  • int128 is natively supported by most C compilers

implementation

most C compilers support this natively.

  • for the C backend with a C compiler that supports __int128, wrap the native operators (+, / etc) via importc procs; support can be detected via #ifdef __SIZEOF_INT128__ (refs https://stackoverflow.com/questions/16088282/is-there-a-128-bit-integer-in-gcc/54815033#54815033); the check can be done at compile time without a custom flag (simpler than what we did with NIM_EmulateOverflowChecks); see the sketch after this list
  • for other C compilers, fall back to the Nim implementation (which we already have in compiler/int128.nim and which can be moved to lib/std/private/int128s)
  • for the js backend, also fall back to that same Nim implementation, or to a wrapper around std/jsbigints or around BigInt directly, whichever is simpler/faster, while maintaining wraparound semantics, high, low etc
  • for the VM, use vmops
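To make the first two bullets concrete, the native/fallback split might look roughly like this (a minimal sketch; the define name is hypothetical and stands in for however the __SIZEOF_INT128__ check gets surfaced to Nim):

when defined(nimHasNativeInt128):  # hypothetical define, true when __SIZEOF_INT128__ exists
  type
    int128* {.importc: "__int128_t", nodecl.} = object
    uint128* {.importc: "__uint128_t", nodecl.} = object
else:
  # pure-Nim fallback, along the lines of compiler/int128.nim
  type
    int128* = object
      lo: uint64
      hi: int64
    uint128* = object
      lo, hi: uint64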

Note: I've already implemented a lot of this; on the client side it's just as easy to use as a native type (eg int64).

importc for operators

std/jsbigints can write this:

func `+`*(x, y: JsBigInt): JsBigInt {.importjs: "(# $1 #)".} 

but that doesn't work with importc (it requires importcpp, which requires the cpp backend).
What I did to circumvent this is to use an auxiliary .h file containing macros which I can then importc, and it all works (sketched below); but in future work this should be improved so that we can use the much simpler and more direct way:

func `+`*(x, y: int128): int128 {.pattern: "# $1 #".} 

refs #315 (comment) which shows what should be done (that RFC needs to be updated to define pattern instead of improving importjs)
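For concreteness, the auxiliary-header workaround mentioned above might look roughly like this (file and macro names are hypothetical, not the actual implementation): a small C header defines function-like macros, eg #define nimAddInt128(a, b) ((a) + (b)), which Nim then binds with plain importc:

func `+`*(x, y: int128): int128 {.importc: "nimAddInt128", header: "nim_int128_ops.h".}
func `-`*(x, y: int128): int128 {.importc: "nimSubInt128", header: "nim_int128_ops.h".}

This works because a function-like C macro is call-compatible with the function call that importc emits.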

current unknown: what should be the type kind?

  • right now I'm using:
type int128* {.importc:"__int128_t", nodecl.} = object

this works (in particular it has value semantics), but it's unclear whether the RHS should be object or something else; it doesn't follow what the other integral types do (tyInt64 etc).

  • for comparison, in the js backend this is what's used instead of object (a C-backend analogue is sketched after this list):
type JsBigIntImpl {.importjs: "bigint".} = int
type JsBigInt* = distinct JsBigIntImpl
  • another possibility would be to bake the type into the compiler as was done with tyInt64 etc, by introducing tyInt128 and tyUInt128, but the obvious downside is that this would increase compiler complexity
  • yet another possibility is to introduce tyInteger (all ints) or tyNumber (all numbers) so that one type kind can also represent floats in extended precision (float80), quadruple floats (float128, aka long double), or half floats (float16), or at least all new such types (excluding tyInt32 etc)
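For comparison, the JsBigInt distinct-wrapper shape transplanted to the C backend would presumably look like this (a sketch only; whether the impl's RHS should be object or something int-like is exactly the open question above):

type Int128Impl {.importc: "__int128_t", nodecl.} = object
type int128* = distinct Int128Impl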

bigint vs int128 performance

Optimization story: Switching from GMP to gcc's __int128 reduced run time by 95%

having an arbitrary-precision library just for a 128-bit type is just too wasteful, and the overhead is just too big. A fixed-width library like Boost.Multiprecision or calccrypto/uint128_t will be much smaller and faster – phuclv Jan 12 at 9:24

links

# example constructors, built on the stint package:
func i128*(n: SomeInteger): Int128 {.inline.} = n.stint(128)
func i128*(s: string): Int128 {.inline.} = s.parse(Int128)
@konsumlamm

The only nit I have: do we really have to suffix every module name with s? Sometimes, like in this case, that just doesn't make any sense; std/int128 would be way more natural imo.

@timotheecour commented Jul 16, 2021

module names where a symbol of the same name exists cause issues though, and the s also reminds you that there is not just int128 but also uint128.

module-symbol name clash example

(hence nim-lang/Nim#15590)

# in std/int128:

type int128* = object
  data: array[2, uint64] # fallback impl

proc high*[T: int128](_: typedesc[T]): int128 = int128(data: [uint64.high, uint64.high])
  # value not correct but that's beside the point
  # (also besides the point is whether high should be defined)
# main.nim:
import std/int128
echo int128.high

would give: Error: type mismatch: got <proc (_: typedesc[T: int128]): int128>
Of course you could use import std/int128 as foo, but that defeats the purpose.

in other cases it could be even more error-prone, silently giving unintended results.

using std/int128s solves all such problems and is already a widely used convention.

@Varriount

My only opinion on this is that I'd rather just have this type present in system.nim (or at least, imported implicitly).

@Araq commented Jul 16, 2021

My only opinion on this is that I'd rather just have this type present in system.nim (or at least, imported implicitly).

No...

@rockcavera

I've always been in favor of having int128/uint128 as built-in types, but I have nothing against them being a std package.

I also wrote a 128-bit integer Nimble package, nint128. I decided to write it because I believe more optimizations are possible for a single fixed-width integer type than for arbitrary-precision integers. In it I tried to write everything in pure Nim, with some optimizations using __int128 or intrinsics, when possible and when it actually benefits the code.

I'm currently evolving the package to be fully pure Nim, with the user able to choose which operators should use __int128 (the C extension for 128-bit integers supported by GCC and Clang) or VCC's 128-bit intrinsic functions.

@Varriount commented Jul 23, 2021

My only opinion on this is that I'd rather just have this type present in system.nim (or at least, imported implicitly).

No...

Why? I understand why things like, say, sequtils aren't in system.nim - it puts increased load on the compiler, and makes resolving name clashes harder - but I don't see why basic data types must be excluded.

@Araq commented Jul 23, 2021

"Basic data types" don't mean anything to me, we lived for 10 years without int128 but once added it's "basic"? Would a BigInt also be "basic" then? Instead we should make system.nim smaller, not bigger. It's pointless to optimize for the case "what can you write without an import statement" as already no real program out there can be written without an import.

@c-blake commented Jul 26, 2021

I am in favor of a "high resolution, value-dense numeric" time type. I am also in favor of some int128s in the stdlib (for hashes, random numbers, and various other circumstances where it can clean things up and a nimble dep is a burden -- it need not be in system to do so).

However, I think Time/Duration is a bad example for int128 because 64 bits is plenty. Since there is a history of inadequate time address-space allocation (y2k, 2038, etc.), I will elaborate a bit.

64 bits gets you 2**63/86400e9/365.25 = 292.27 years of nanoseconds. That is 1970 +- 292, i.e. year 1678 .. year 2262, or "from Newton to Star Trek" with a Unix epoch. Meanwhile, if you need a bigger time range than +-292 years, it is virtually certain you no longer care about nanosecond resolution. After all, such a time either predates our ability to measure so accurately or lies in a future where we will likely measure much more accurately. No one is comparing system or file times etc. across 3 centuries wanting {-, <, <=} in arithmetic exact to the nanosecond.

Meanwhile, the 53 mantissa bits of an IEEE double get you 2**53/86400e6/365.25 = +-285 years of microseconds. It is also virtually certain that in "big range" contexts you want float64 arithmetic and already convert to float64 anyway. A float64 also gracefully loses precision as range increases and gains a little as range decreases - e.g. 2021-1970 = 51 years => only ~0.2 usec error. (This may matter if you convert to float64 before subtracting two high-res times, for example, but honestly not a whole lot happens in a few hundred nanoseconds.)
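A quick sanity check of those figures in plain Nim (just reproducing the arithmetic above):

import std/math

echo 2.0 ^ 63 / 86_400e9 / 365.25  # ~292.27 years of int64 nanoseconds
echo 2.0 ^ 53 / 86_400e6 / 365.25  # ~285.4 years of 53-bit microseconds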

Actual change to time type representation should be discussed over in #383, though. I comment here because my points pertain more to the example.

@JohnAD commented Aug 27, 2021

Having 128-bit uint support would drastically improve the performance of any 128-bit decimal library that conforms to the IEEE standard, since such a library does a lot of 113-bit arithmetic.

(This does remind me that I need to re-arrange my life somehow so that I can resume work on the decimal library.)

But whether it is me or someone else, it would be a huge performance boost on machines that actually have CPU/APU/GPU 128-bit math support, which many modern ones do.

@juancarlospaco commented Sep 1, 2021

Stripe on int128/uint128: https://twitter.com/zenazn/status/1433134722809430019
