Uniform distribution in helpers.arrayElements results #1765
Comments
As you already found out, this is a no-go 🙅
Is this also affecting other parts of faker, so that we need to change something underlying? Or is this only the precision passed inside the
If possible, we should define a test to cover the probability with a max delta-error 🤔
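A rough sketch of what such a probability test could look like. Everything here is an assumption for illustration: `sampleIndices` is a generic partial Fisher-Yates shuffle rather than faker's actual implementation, `mulberry32` is a small seeded PRNG used only to keep the run deterministic, and the 20% delta is an arbitrary tolerance.

```typescript
// Small seeded PRNG (mulberry32) so the check is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical stand-in for arrayElements: picks `count` distinct indices
// out of `length` with a partial Fisher-Yates shuffle.
function sampleIndices(length: number, count: number, rand: () => number): number[] {
  const pool = Array.from({ length }, (_, i) => i);
  const min = length - count;
  for (let i = length - 1; i >= min; i--) {
    const j = Math.floor((i + 1) * rand());
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(min);
}

// Largest relative deviation of any index's hit count from the expected count.
function maxRelativeError(
  length: number,
  count: number,
  trials: number,
  rand: () => number
): number {
  const hits = new Array<number>(length).fill(0);
  for (let t = 0; t < trials; t++) {
    for (const index of sampleIndices(length, count, rand)) {
      hits[index]++;
    }
  }
  const expected = (trials * count) / length;
  return Math.max(...hits.map((h) => Math.abs(h - expected) / expected));
}

// Expected count per index is 20000 * 5 / 100 = 1000.
const observedError = maxRelativeError(100, 5, 20000, mulberry32(42));
console.log(`max relative error: ${observedError}`);
```

A real test would assert that the error stays below the chosen delta; the number of trials controls how tight that delta can be without flaky failures.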
The problem can be fixed by passing in a different precision from
The float imprecision should already be covered by #1675
We should probably rewrite this line to make it more readable: faker/src/modules/helpers/index.ts, line 598 in 5f5be20
Note: I deliberately kept the precision of 0.01 to avoid too much churn in the snapshot tests when I implemented #1675. But yes, once that is merged, that line should be rewritten to take advantage of the higher-precision floats now available.
@mjomble I originally wanted to let you write a fix for this, but I then spent so much time trying to understand the issue/code that I ended up writing the PR myself. Sorry about that. This will be fixed in #1770. While working on this issue I also found #1771. If you would like to contribute that, then please leave a comment there.
Totally fine by me 😄 Thanks for the fix!
Clear and concise description of the problem
I want faker.helpers.arrayElements to always return each element with the same probability.
However, in certain situations, some indexes are picked far more often than others.
For example, when sampling 10 elements from an array with 1000 elements, indexes ending in 9 are picked ~20 times more often than indexes ending in 1. And when sampling 9 or fewer elements, indexes ending in 1 seem never to be picked at all.
Suggested solution
The root cause of the problem is that arrayElements picks array indices using faker.number.float() with the default precision of 0.01.
This works fine with an array of 100 elements, but as the length grows, anomalies begin to appear.
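To illustrate why the array length matters, here is a minimal sketch that does not call faker at all. It assumes a draw with precision 0.01 and max 0.99 can be modelled as r = k / 100 for an integer k from 0 to 99, and that an index is then computed as floor(length * r). The function name `reachableIndices` is made up for this example.

```typescript
// Hypothetical model, not faker's code: a draw with `steps` possible values
// is treated as r = k / steps for an integer k in [0, steps).
function reachableIndices(length: number, steps: number): Set<number> {
  const indices = new Set<number>();
  for (let k = 0; k < steps; k++) {
    // Same as Math.floor(length * r) with r = k / steps, written with an
    // integer product to sidestep float rounding in r.
    indices.add(Math.floor((length * k) / steps));
  }
  return indices;
}

const reachableSmall = reachableIndices(100, 100);
const reachableLarge = reachableIndices(1000, 100);

console.log(reachableSmall.size, reachableLarge.size); // 100 100
console.log(Array.from(reachableLarge).slice(0, 5)); // [ 0, 10, 20, 30, 40 ]
```

Under this model, 100 possible draws can never produce more than 100 distinct indices, so a single draw over 1000 elements only ever lands on multiples of 10.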
The simplest solution would be to replace this.faker.number.float({ max: 0.99 }) with Math.random(). This would, however, break some deterministic test cases.
Alternative
We could also use something like this.faker.number.float({ max: 0.999999999, precision: 0.000000001 }), but I'm not sure what the best number of digits is.
The precision could also conceivably be derived from the length of the given array.
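One way the precision might be derived from the array length, as a sketch only. The heuristic (a power-of-ten step count at least 100 times the length) and the names `stepsForLength` and `drawsPerIndex` are assumptions for illustration, not anything faker provides; the resulting precision would be 1 / steps.

```typescript
// Hypothetical heuristic: smallest power of ten that is at least
// `margin` times the array length.
function stepsForLength(length: number, margin = 100): number {
  let steps = 1;
  while (steps < length * margin) {
    steps *= 10;
  }
  return steps;
}

// Counts how many of the `steps` possible draws (modelled as k / steps)
// map to each index via floor(length * k / steps).
function drawsPerIndex(length: number, steps: number): number[] {
  const counts = new Array<number>(length).fill(0);
  for (let k = 0; k < steps; k++) {
    counts[Math.floor((length * k) / steps)]++;
  }
  return counts;
}

const derivedSteps = stepsForLength(1000);
const derivedCounts = drawsPerIndex(1000, derivedSteps);

// Every index is backed by the same number of possible draws.
console.log(derivedSteps, Math.min(...derivedCounts), Math.max(...derivedCounts)); // 100000 100 100
```

When the length does not divide the step count evenly, the per-index counts differ by at most one draw, so a larger margin means a smaller relative bias.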
If I get some suggestions from maintainers, I may be able to submit a PR.
Additional context
Code to reproduce issue:
Some outputs:
Both should be much closer to 1000, which is the case when the same code is run on a fixed version of arrayElements: