initRand now uses strict monotonic counter to guarantee uniqueness #18149

timotheecour · 2021-06-02T05:52:18Z

note

~~the CI failure seems to suggest getMonoTime is in fact not monotonic on some OS? that's worrisome~~ (EDIT: that's expected; getCpuTicks would be strictly monotonic, at least on modern cpus)

future work

- std/tempfiles should still not call initRand() on each call to randomPathName; it should instead use a threadvar rand state, maybe (the code doesn't look re-entrant)
- docs should clarify whether getMonoTime can return the same timestamp in 2 different threads; if so, initRand should mix getMonoTime() with getThreadId() to guarantee uniqueness (EDIT: even within the same thread there is no such guarantee because it's not strictly monotonic; but getCpuTicks would be strictly monotonic, at least on modern cpus)
- investigate std/monotimes not strictly monotonic on Linux_amd64 and windows #18158 (marking checkbox because issue has a tracker)
- make initRand() work with --experimental:vmopsDanger
- proc mach_absolute_time(): int64 {.importc, header: "<mach/mach.h>".} in std/monotimes has the wrong signature, differing from the one in system/timers (see https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time)
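For readers following the "not strictly monotonic" point in the list above: a clock that may return the same tick twice can be turned into a strictly increasing per-thread counter by remembering the last value handed out. This is only an illustrative sketch, not the code from this PR; `strictTicks` and `lastTick` are hypothetical names, and it says nothing about uniqueness across threads.

```nim
import std/monotimes

# Last value handed out on this thread (threadvars start zero-initialized).
var lastTick {.threadvar.}: int64

proc strictTicks(): int64 =
  ## Hypothetical helper, not part of std/monotimes.
  let now = getMonoTime().ticks
  # If the clock did not advance since the previous call, bump by one instead.
  result = if now > lastTick: now else: lastTick + 1
  lastTick = result

let a = strictTicks()
let b = strictTicks()
doAssert b > a  # holds even when getMonoTime() returned the same tick twice
```

The trade-off is that the returned value can drift slightly ahead of the real clock when calls arrive faster than the clock resolution, which is fine for a seed but not for timing.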

lib/std/monotimes.nim

tests/stdlib/tmonotimes.nim

Varriount · 2021-06-03T00:46:22Z

Just curious: why use the current time, rather than pulling from /dev/random or /dev/urandom (*nix), or CryptGenRandom (Windows)?

timotheecour · 2021-06-03T00:52:26Z

> Just curious: why use the current time, rather than pulling from /dev/random or /dev/urandom (*nix), or CryptGenRandom (Windows)?

well we now have std/sysrand instead of having to call those directly, so it's an option, but IIRC performance can be a concern, definitely worth trying though; also, we need a vmops for whichever option is used in the end so that initRand() can work at CT (minor, easy point). That said, a strict monotonic counter is always a good tool to have, regardless of availability of sysrand

tests/stdlib/tmonotimes.nim

Varriount · 2021-06-03T08:46:24Z

Well then, I propose using the system's source of randomness as a seed. If someone needs, for some odd reason, to initialize thousands of state machines using a faster method, they can do so manually.

Using a monotonic clock, even a strict one, doesn't necessarily guarantee unique values for each call. A random source of data may return the same value consecutively, but that can't be known ahead of time (assuming the system implementation is sound).

timotheecour · 2021-06-03T20:33:25Z

PTAL, addressed comment; this PR is an improvement over status quo.

> Well then, I propose using the system's source of randomness as a seed. If someone needs, for some odd reason, to initialize thousands of state machines using a faster method, they can do so manually.

the benchmark below shows that using urandom from std/sysrand is 90x slower than getCpuTicks (more details later, this should guarantee uniqueness) and 20x slower than getMonoTime (from this PR) so urandom is not a good default IMO; user can always do that by calling initRand(getSeedFromUrandom)

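```nim
import std/sysrand
import std/monotimes
import timn/exp/cputicks # cf upcoming PR; provides getCpuTicks (not in the stdlib)

# Each algo folds its result into `c` so that the call cannot be optimized away.
# Note: `c` can overflow; benchmark builds (-d:danger) disable that check.

template algo1(buf, c) =
  # candidate 1: OS randomness via std/sysrand
  let ok = urandom(buf)
  doAssert ok
  c += int(buf[0]) # use the random byte itself, not the buffer's address

template algo2(buf, c) =
  # candidate 2: CPU tick counter
  let t = getCpuTicks()
  c += cast[int](t)

template algo3(buf, c) =
  # candidate 3: monotonic clock (this PR)
  let t = getMonoTime()
  c += cast[int](t.ticks)

template mainAux(algo) =
  let n = 10000
  var buf: array[8, byte]
  var c = 0
  let t1 = getCpuTicks()
  for i in 0..<n:
    algo(buf, c)
  let t2 = getCpuTicks()
  echo (astToStr(algo), c, t2 - t1) # (name, checksum, elapsed ticks)

proc main() =
  for i in 0..<10:
    echo()
    mainAux(algo1)
    mainAux(algo2)
    mainAux(algo3)
main()
```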
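The `initRand(getSeedFromUrandom)` escape hatch mentioned above is not an existing API. A minimal sketch of what a caller could write today with std/sysrand follows; `seedFromUrandom` is a hypothetical name, and the zero-seed guard is there only because older versions of std/random rejected a seed of 0.

```nim
import std/[random, sysrand]

proc seedFromUrandom(): int64 =
  ## Hypothetical helper: builds a 64-bit seed from the OS randomness source.
  var buf: array[8, byte]
  doAssert urandom(buf), "system randomness source unavailable"
  for b in buf:
    result = (result shl 8) or int64(b)  # pack the 8 bytes into one int64
  if result == 0:
    result = 1  # defensive: avoid a zero seed

var rng = initRand(seedFromUrandom())
echo rng.rand(100)
```

This pays the urandom cost once per `Rand` state, so the slowdown measured in the benchmark only matters when many states are created.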

Varriount · 2021-06-04T00:39:44Z

> the benchmark below shows that using urandom from std/sysrand is 90x slower than getCpuTicks (more details later, this should guarantee uniqueness) and 20x slower than getMonoTime (from this PR) so urandom is not a good default IMO; user can always do that by calling initRand(getSeedFromUrandom)

So this seems to be an argument of performance vs correctness. Either we use a source of cryptographic randomness as the seed, or a high-resolution monotonic timer. Is this accurate?

timotheecour · 2021-06-04T00:42:41Z

> So this seems to be an argument of performance vs correctness. Either we use a source of cryptographic randomness as the seed, or a high-resolution monotonic timer. Is this accurate?

yes, but I have a PR in the works that will give both performance and correctness; until then this PR is an improvement over status quo

Varriount · 2021-06-04T00:44:48Z

If there's a PR in the works for a proper fix, there's little point in a PR for an improper one, unless there's something time-critical going on.

timotheecour · 2021-06-04T00:54:12Z

> If there's a PR in the works for a proper fix, there's little point in a PR for an improper one, unless there's something time-critical going on.

with this logic, nothing ever gets done (it's not the 1st time). This PR is good enough to close #17898 given the clock resolution and the cost of initRand, and it improves several other things if you read the PR content. The PR I have in the works could potentially be controversial, who knows (and the other things in this PR are still useful regardless); there's no point in blocking on it when reusing the existing getMonoTime already improves status quo.

tests/stdlib/tmonotimes.nim

timotheecour · 2021-06-06T08:58:38Z

PTAL

Varriount · 2021-06-06T09:23:57Z

This does not fix #17898, it just makes it less likely.

Multiple decisions need to be made here:

- To what degree should initRand prevent two calls in rapid succession from using the same seed?
- Should randomPathName only generate names for paths that don't exist?

In my opinion, it's up to the caller of randomPathName to check that the path doesn't already exist. initRand, because it isn't typically called very often, can get away with sacrificing performance for uniqueness, however no specific guarantee should be made regarding how unique.
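The "caller checks that the path doesn't already exist" position can be sketched as a retry loop. This is an illustration under assumptions, not how std/tempfiles is written; `unusedPath` is a hypothetical helper, and the check-then-use pattern is still racy if another process creates the same path in between.

```nim
import std/[os, random]

proc unusedPath(dir: string, rng: var Rand, maxTries = 100): string =
  ## Hypothetical helper: draws candidate names until one does not exist yet.
  for _ in 0 ..< maxTries:
    let candidate = dir / ("tmp_" & $rng.rand(int32.high.int))
    if not fileExists(candidate) and not dirExists(candidate):
      return candidate
  raise newException(IOError, "no unused path found in " & dir)

var rng = initRand(12345)  # fixed seed, for illustration only
echo unusedPath(getTempDir(), rng)
```

With such a loop, two callers that happen to share a seed would collide on the first candidate but diverge as soon as one of them has created its file.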

Varriount · 2021-06-11T06:11:10Z

I would also love to know why randomPathName is initializing a new random state each time. If you're going to do that, you might as well just use the current time anyway.

Araq · 2021-06-11T06:34:51Z

> I would also love to know why randomPathName is initializing a new random state each time. If you're going to do that, you might as well just use the current time anyway.

I agree completely, this is not good.

timotheecour · 2021-07-01T07:55:13Z

3 possible avenues:

1. use atomics as done in std/oids, which solves a very similar problem; this can be augmented with thread-id + skipRandomNumbers to be robust to multiple threads returning the same value
2. use a {.threadvar.} Rand state
3. expose a getCpuTicks API based on the RDTSC instruction wherever it's available (refs "add getCpuTicks based on RDTSC instruction for highest-performance counters" timotheecour/Nim#773, PR in the works), which is useful for many things (including as a replacement for monotimes / cpuTime, since it provides much higher resolution and lower overhead than those). This one is useful regardless of its use in this context. It provides guaranteed monotonicity within 1 thread (and by extension across multiple threads, since we can mix in the thread id; note that there's also a variant of the RDTSC instruction, RDTSCP, that returns the CPU id in addition to the tick). Portability is a potential concern, but for benchmarking it's not a problem as it's an optional tool.
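The {.threadvar.} Rand state idea listed above could look roughly like this. It is a sketch only: `randomName`, `nameRng` and `nameRngReady` are hypothetical names, and the real randomPathName in std/tempfiles may differ.

```nim
import std/random

# One generator per thread, seeded lazily on first use.
var nameRng {.threadvar.}: Rand
var nameRngReady {.threadvar.}: bool

proc randomName(length = 8): string =
  ## Hypothetical helper: reuses the per-thread state instead of calling
  ## initRand() on every invocation.
  if not nameRngReady:
    nameRng = initRand()
    nameRngReady = true
  const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
  result = newString(length)
  for i in 0 ..< length:
    result[i] = nameRng.sample(letters)

echo randomName()
echo randomName()
```

Because successive calls advance the same state, two calls in a row no longer depend on the clock having moved between them; only the first call per thread depends on the seed.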

ringabout · 2021-08-22T04:06:16Z

This PR may not fix #17898 completely, but it could be an improvement for the random module.


timotheecour · 2021-08-23T23:18:12Z

@Araq PTAL:

- EDIT: depends on "fix RFCs/411: add std/cputicks" #18743 (see the corresponding RFC, "std/cputicks: high resolution low overhead strictly monotonic cpu counter", RFCs#411). It introduces a new module std/cputicks providing cpu instruction-level granularity, with much higher precision and lower overhead than either times.cpuTime or std/monotimes (and fewer module import dependencies), and in particular strictly monotonic instead of merely monotonic counters. This is what should be used for profiling (I'm using this in "execution traces (eg for code coverage, debugging, introspection, profiling)" #15827, in not-yet-pushed commits which add full profiling capability) and for micro benchmarks (as opposed to looping N times over some code, which can skew results in various ways, affect the cache, register allocation, etc). The APIs are available in all backends (c, js, vm)
- initRand now works in VM
- initRand now uses strict monotonic counter
- improve monotimes (eg for js etc)

which also addresses the previously raised concerns

example

- on OSX, getCpuTicks has 4.5X less overhead than getMonoTime() and 71X less overhead than cpuTime(), see https://gist.github.com/timotheecour/e5d85be09b141b4bf9f877e1c8731025 (-d:case1); on other OSes, the gap is even larger
- on OSes other than OSX, getMonoTime() is not strictly monotonic and can't be used in a meaningful way to measure code shorter than a certain number of instructions (see -d:case2)
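The gist's -d:case2 is not reproduced here. As a separate minimal sketch, the following counts how often two back-to-back getMonoTime() calls return the same tick, which is the situation "monotonic but not strictly monotonic" refers to; `countRepeats` is a hypothetical name and the result depends on the OS and clock resolution.

```nim
import std/monotimes

proc countRepeats(n: int): int =
  ## Hypothetical helper: number of consecutive calls that saw an equal tick.
  var prev = getMonoTime().ticks
  for _ in 0 ..< n:
    let cur = getMonoTime().ticks
    doAssert cur >= prev  # monotonic: never goes backwards
    if cur == prev:
      inc result          # not strict: the clock did not advance
    prev = cur

echo "repeated timestamps: ", countRepeats(1_000_000)
```

A non-zero count means anything shorter than one clock tick measures as zero elapsed time, and two seeds taken from the clock in that window are identical.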

future work

- revisit "fix #17898 (randomPathName called twice in a row can return the same string on windows)" #18729 by using the new std/cputicks instead

links

arnetheduck · 2021-08-24T06:53:48Z

lib/pure/random.nim

```diff
-    else:
-      let now = times.getTime()
-      result = initRand(convert(Seconds, Nanoseconds, now.toUnix) + now.nanosecond)
+    result = initRand(getCpuTicks())
```


this constrains the random module to work only on platforms where high-resolution tick counters are available. Given that cputicks depends on non-standard behavior, it severely limits the platforms where Nim can be used. How to init rand is not a performance-critical operation; in fact, it would be trivial to continue using the standardised C API for this without any significant loss. random is not a cryptographic random source: it's a best-effort proposition that no code actually requiring randomness should rely upon, and the granularity does not change the utility of the module for any use case for which its use is appropriate.

timotheecour · 2021-08-24

it doesn't; fallback code can always be added to getCpuTicks to return something similar to what std/monotimes returns. rdtsc is available on all x86 processors since the pentium, and other platforms that nim supports have equivalent instructions which can be wrapped by getCpuTicks, see the google/benchmark code here https://github.com/google/benchmark/blob/v1.1.0/src/cycleclock.h#L116 which handles more platforms than nim supports.
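A rough sketch of the fallback idea described above, for illustration only. This is not the std/cputicks implementation from #18743; `ticksOrFallback` is a hypothetical name, and the sketch assumes the C backend with GCC or Clang, whose `<x86intrin.h>` provides the `__rdtsc` intrinsic.

```nim
import std/monotimes

when (defined(amd64) or defined(i386)) and (defined(gcc) or defined(clang)) and
    not defined(js) and not defined(nimscript):
  # x86 with GCC/Clang: wrap the compiler's `__rdtsc` intrinsic.
  proc rdtsc(): uint64 {.importc: "__rdtsc", header: "<x86intrin.h>".}
  proc ticksOrFallback(): int64 = cast[int64](rdtsc())
else:
  # Anything else: fall back to the monotonic clock (coarser, not strict).
  proc ticksOrFallback(): int64 = getMonoTime().ticks

echo ticksOrFallback()
```

The two branches return values in different units (CPU cycles vs nanosecond ticks), so a caller can treat the result as an opaque counter or seed but not as a duration.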

ringabout · 2021-08-25

> and other platforms that nim supports have equivalent instructions which can be wrapped by getCpuTicks

Do you mean it will be implemented in the future? Then I think random.nim should be changed only after they are implemented. Anyway, let's wait for #18743.

timotheecour · 2021-08-25

see also nim-lang/RFCs#414, which would allow testing for the presence of __rdtsc programmatically, at CT

stale · 2022-09-08T23:52:02Z

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

ringabout · 2022-09-23T02:51:22Z

It was done differently by #18744

timotheecour mentioned this pull request Jun 2, 2021

[std/times]getTime now uses high resolution API on windows #17901

Merged

timotheecour commented Jun 2, 2021

lib/std/monotimes.nim (outdated, resolved)

timotheecour force-pushed the pr_use_getMonoTime branch from 26470a9 to 2669a47 Compare June 2, 2021 06:05

timotheecour added the TODO: followup needed remove tag once fixed or tracked elsewhere label Jun 2, 2021

ringabout reviewed Jun 2, 2021

tests/stdlib/tmonotimes.nim (outdated, resolved)

timotheecour mentioned this pull request Jun 2, 2021

std/monotimes not strictly monotonic on Linux_amd64 and windows #18158

Closed

timotheecour force-pushed the pr_use_getMonoTime branch from 2669a47 to dbf6abc Compare June 2, 2021 21:33

timotheecour mentioned this pull request Jun 2, 2021

std/iterutils timotheecour/Nim#746

Open

timotheecour marked this pull request as ready for review June 2, 2021 22:23

timotheecour added the Ready For Review (please take another look): ready for next review round label Jun 2, 2021

Araq reviewed Jun 3, 2021

tests/stdlib/tmonotimes.nim (outdated, resolved)

timotheecour force-pushed the pr_use_getMonoTime branch from 1ec8222 to d3d1d96 Compare June 3, 2021 19:21

Araq reviewed Jun 4, 2021

tests/stdlib/tmonotimes.nim (resolved)

timotheecour force-pushed the pr_use_getMonoTime branch from d3d1d96 to 692b452 Compare June 6, 2021 08:55

ringabout changed the title ~~fix #17898 initRand now uses monotonic time to guarantee uniqueness~~ initRand now uses monotonic time to guarantee uniqueness Aug 22, 2021

ringabout approved these changes Aug 22, 2021


ringabout mentioned this pull request Aug 22, 2021

fix #17898(randomPathName called twice in a row can return the same string on windows) #18729

Merged

timotheecour added 7 commits August 23, 2021 13:45

refs nim-lang#17898

b8d04c2

_

2476060

fix nim-lang#17898 initRand now uses monotonic time to guarantee uniqueness

e7c3591

workaround refs nim-lang#18158

2fc0f9c

improve tests

7b314b7

fixup [skip ci]

2d80e5c

address comment

9df2889

timotheecour force-pushed the pr_use_getMonoTime branch from 692b452 to b2da97b Compare August 23, 2021 22:53

timotheecour changed the title ~~initRand now uses monotonic time to guarantee uniqueness~~ initRand now uses strict monotonic counter to guarantee uniqueness Aug 23, 2021

timotheecour added 3 commits August 23, 2021 17:06

add std/cputicks

169f90c

changelog

fbeff78

add test for initRand in VM

fa6bf6c

timotheecour force-pushed the pr_use_getMonoTime branch from 4fa63ea to fa6bf6c Compare August 24, 2021 00:08

timotheecour mentioned this pull request Aug 24, 2021

std/cputicks: high resolution low overhead strictly monotonic cpu counter nim-lang/RFCs#411

Open

fixup

de27fc6

arnetheduck reviewed Aug 24, 2021


timotheecour mentioned this pull request Aug 24, 2021

fix RFCs/411: add std/cputicks #18743

Closed

demotomohiro mentioned this pull request Aug 24, 2021

Fix initrand to avoid random number sequences overlapping #18744

Merged

stale bot added the stale Staled PR/issues; remove the label after fixing them label Sep 8, 2022

ringabout closed this Sep 23, 2022
