Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initRand now uses strict monotonic counter to guarantee uniqueness #18149

Closed
wants to merge 11 commits into from

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented Jun 2, 2021

note

the CI failure seems to suggest getMonoTime is in fact not monotonic on some OS? that's worrysome (EDIT: that's expected; getCpuTicks would be strict monotonic at least on modern cpus)

future work

  • std/tempfiles should still not call initRand() on each call to randomPathName but should instead use a threadvar rand state, maybe (code doesn't look re-entrant)
  • docs should clarify whether getMonoTime can return same timestamp in 2 different threads; if so, then initRand should mix getMonoTime() with getThreadId() to guarantee uniqueness (EDIT: even within same thread there is no such guarantee because it's not strict monotonic; but getCpuTicks would be strict monotonic at least on modern cpus)
  • investigate std/monotimes not strictly monotonic on Linux_amd64 and windows #18158 (marking checkbox because issue has a tracker)
  • make initRand() work with --experimental:vmopsDanger
  • proc mach_absolute_time(): int64 {.importc, header: "<mach/mach.h>".} in std/monotimes has wrong signature, differing from one in system/timers (see https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time)

lib/std/monotimes.nim Outdated Show resolved Hide resolved
@timotheecour timotheecour added the TODO: followup needed remove tag once fixed or tracked elsewhere label Jun 2, 2021
@timotheecour timotheecour marked this pull request as ready for review June 2, 2021 22:23
@timotheecour timotheecour added the Ready For Review (please take another look): ready for next review round label Jun 2, 2021
@Varriount
Copy link
Contributor

Just curious: why use the current time, rather than pulling from /dev/random or /dev/urandom (*nix), or CryptGenRandom (Windows)?

@timotheecour
Copy link
Member Author

Just curious: why use the current time, rather than pulling from /dev/random or /dev/urandom (*nix), or CryptGenRandom (Windows)?

well we now have std/sysrand instead of having to call those directly, so it's an option, but IIRC performance can be a concern, definitely worth trying though; also, we need a vmops for whichever option is used in the end so that initRand() can work at CT (minor, easy point). That said, a strict monotonic counter is always a good tool to have, regardless of availability of sysrand

tests/stdlib/tmonotimes.nim Outdated Show resolved Hide resolved
@Varriount
Copy link
Contributor

Varriount commented Jun 3, 2021

Well then, I propose using the system's source of randomness as a seed. If someone needs, for some odd reason, to initialize thousands of state machines using a faster method, they can do so manually.

Using a monotonic clock, even a strict one, doesn't necessarily guarantee unique values for each call. A random source of data may return the same value consecutively, but that can't be known ahead of time (assuming the system implementation is sound).

@timotheecour
Copy link
Member Author

timotheecour commented Jun 3, 2021

PTAL, addressed comment; this PR is an improvement over status quo.

Well then, I propose using the system's source of randomness as a seed. If someone needs, for some odd reason, to initialize thousands of state machines using a faster method, they can do so manually.

the benchmark below shows that using urandom from std/sysrand is 90x slower than getCpuTicks (more details later, this should guarantee uniqueness) and 20x slower than getMonoTime (from this PR) so urandom is not a good default IMO; user can always do that by calling initRand(getSeedFromUrandom)

import std/sysrand
import std/monotimes
import timn/exp/cputicks # cf upcoming PR

template algo1(buf, c) =
  let ok = urandom(buf)
  doAssert ok
  c += cast[int](buf[0].addr)

template algo2(buf, c) =
  let t = getCpuTicks()
  c += cast[int](t)

template algo3(buf, c) =
  let t = getMonoTime()
  c += cast[int](t.ticks)

template mainAux(algo)=
  let n = 10000
  var buf: array[8, byte]
  var c = 0
  let t1 = getCpuTicks()
  for i in 0..<n:
    algo(buf, c)
  let t2 = getCpuTicks()
  echo (astToStr(algo), c, t2 - t1)

proc main()=
  for i in 0..<10:
    echo()
    mainAux(algo1)
    mainAux(algo2)
    mainAux(algo3)
main()

@Varriount
Copy link
Contributor

the benchmark below shows that using urandom from std/sysrand is 90x slower than getCpuTicks (more details later, this should guarantee uniqueness) and 20x slower than getMonoTime (from this PR) so urandom is not a good default IMO; user can always do that by calling initRand(getSeedFromUrandom)

So this seems to be an argument of performance vs correctness. Either we use a source of cryptographic randomness as the seed, or a high-resolution monotonic timer. Is this accurate?

@timotheecour
Copy link
Member Author

So this seems to be an argument of performance vs correctness. Either we use a source of cryptographic randomness as the seed, or a high-resolution monotonic timer. Is this accurate?

yes, but i have a PR in the work that will give both performance and correctness; until then this PR is an improvement over status quo

@Varriount
Copy link
Contributor

Varriount commented Jun 4, 2021

If there's a PR in the works for a proper fix, there's little point in a PR for an improper one, unless there's something time-critical going on.

@timotheecour
Copy link
Member Author

timotheecour commented Jun 4, 2021

If there's a PR in the works for a proper fix, there's little point in a PR for an improper one, unless there's something time-critical going on.

with this logic, nothing ever gets done (it's not the 1st time). This PR is good enough to close #17898 given the clock resolution and cost of initRand, and improves several other things if you read the PR content. The PR I have in the work could potentially be controversial, who knows (and the other things in this PR are still useful regardless); there's no point in blocking on it when reusing existing getMonoTime already improves status quo.

@timotheecour
Copy link
Member Author

PTAL

@Varriount
Copy link
Contributor

This does not fix #17898, it just makes it less likely.

Multiple decisions need to be made here:

  • To what degree should initRand prevent two calls in rapid succession from using the same seed?
  • Should randomPathName only generate names for paths that don't exist?

In my opinion, it's up to the caller of randomPathName to check that the path doesn't already exist. initRand, because it isn't typically called very often, can get away with sacrificing performance for uniqueness, however no specific guarantee should be made regarding how unique.

@Varriount
Copy link
Contributor

I would also love to know why randomPathName is initializing a new random state each time. If you're going to do that, you might as well just use the current time anyway.

@Araq
Copy link
Member

Araq commented Jun 11, 2021

I would also love to know why randomPathName is initializing a new random state each time. If you're going to do that, you might as well just use the current time anyway.

I agree completely, this is not good.

@timotheecour
Copy link
Member Author

timotheecour commented Jul 1, 2021

3 possible venues:

  • use atomics as done in std/oids which solves a very similar problem; can be augmented with thread-id + skipRandomNumbers to be robust to multiple threads returning same value
  • use a {.threadvar.} Rand state
  • expose API getCpuTicks from RDTSC instruction wherever it's available (refs add getCpuTicks based on RDTSC instruction for highest-performance counters  timotheecour/Nim#773, PR in the works), which is useful for many things (including as replacement for monotimes / cpuTime since it provides much higher resolution and lower overhead that those). This one is useful regardless of its use in this context. It provides guaranteed monotonicity within 1 thread (and by extension multiple threads since we can mix-in threadid; note that there's also an RDTSC instruction that returns the CPU id in addition to the tick). Portability is a potential concern but for benchmarking it's not a problem as it's an optional tool.

@ringabout ringabout changed the title fix #17898 initRand now uses monotonic time to guarantee uniqueness initRand now uses monotonic time to guarantee uniqueness Aug 22, 2021
@ringabout
Copy link
Member

This PR may not fix #17898 completely, but it could be a improvement for random module.

@timotheecour
Copy link
Member Author

timotheecour commented Aug 23, 2021

@Araq PTAL:

which also addresses the previously raised concerns

example

  • on OSX, getCpuTicks has 4.5X less overhead than getMonoTime() and 71X less overhead than cpuTime(), see https://gist.github.com/timotheecour/e5d85be09b141b4bf9f877e1c8731025 (-d:case1); in other OS's, the gap is even larger
  • on OS's other than OSX, getMonoTime() is not strictly monotonic and can't be used in a meaningful way to measure code under a certain number of instructions (see -d:case2)

future work

links

@timotheecour timotheecour changed the title initRand now uses monotonic time to guarantee uniqueness initRand now uses strict monotonic counter to guarantee uniqueness Aug 23, 2021
else:
let now = times.getTime()
result = initRand(convert(Seconds, Nanoseconds, now.toUnix) + now.nanosecond)
result = initRand(getCpuTicks())
Copy link
Contributor

@arnetheduck arnetheduck Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this constrains the random module to work only on platforms where high-resolution tick counters are available - given that cputicks depends on non-standard behavior, it severly limits the platforms where Nim can be used - how to init rand is not a performance-critical operation - in fact, it would be trivial to continue using standardised C API for this without any significant loss - random is not a cryptographic random source, it's a best-effort proposition upon which no code that actually requires randomness should rely upon - the granularity doesn't not change the utility of the module for any use cases for which its use is appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't; fallback code can always be added to getCpuTicks to return something similar to what std/monotimes returns. rdtsc is available on all x86 processors since the pentium, and other platforms that nim supports have equivalent instructions which can be wrapped by getCpuTicks, see google/benchmark code here https://github.com/google/benchmark/blob/v1.1.0/src/cycleclock.h#L116 which handles more platform than nim supports.

Copy link
Member

@ringabout ringabout Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and other platforms that nim supports have equivalent instructions which can be wrapped by getCpuTicks

Do you mean it will be implemented in the future? Then change the random.nim after they are implement I think. Anyway let's wait for #18743.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see also nim-lang/RFCs#414 which would allow testing from presence of __rdtsc programmatically, at CT

@stale
Copy link

stale bot commented Sep 8, 2022

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Sep 8, 2022
@ringabout
Copy link
Member

It was done differently by #18744

@ringabout ringabout closed this Sep 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready For Review (please take another look): ready for next review round stale Staled PR/issues; remove the label after fixing them TODO: followup needed remove tag once fixed or tracked elsewhere
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants