Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Probability Distribution and Statistical Functions -- Uniform Distribution Module #272

Merged
merged 193 commits into from
Oct 6, 2021

Conversation

Jim-215-Fisher
Copy link
Contributor

@Jim-215-Fisher Jim-215-Fisher commented Dec 20, 2020

This is part of probability distribution and statistical functions for uniform distribution. In addition to original real variable functions, integer and complex variables related functions were added. The files uploaded are:

src/stdlib_stats_distribution_uniform.fypp
src/CMakeLists.txt
src/Makefile.manual
src/tests/stats/test_distribution_uniform.f90
src/tests/CMakeLists.txt
src/tests/Makefile.manual
doc/specs/index.md
doc/specs/stdlib_stats_distribution_uniform.md

@Jim-215-Fisher Jim-215-Fisher changed the title Distribution uniform Probability Distribution and Statistical Functions -- uniform distribution Dec 20, 2020
@Jim-215-Fisher Jim-215-Fisher changed the title Probability Distribution and Statistical Functions -- uniform distribution Probability Distribution and Statistical Functions -- Uniform Distribution Module Dec 20, 2020
! Based on the paper by Frederic Goualard, "Generating Random Floating-
! Point Numbers By Dividing Integers: a Case Study", Proceedings of
! ICCS 2020, June 20202, Amsterdam, Netherlands
!
Copy link

@peteroupc peteroupc Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Goualard mentions dividing integers by a Mersenne number to make a floating-point number to make a point that doing so can lead to non-uniformity issues, and not necessarily to show that libraries should generate random floating-point numbers this way. He makes the same point with dividing integers by a power of 2.

Also, there are ways to implement uniform floating-point number generation that covers all numbers the real type covers in [0, 1], but in general they are far from trivial which is why I don't demand it here. Examples include the Rademacher Floating-Point Library by @christophe-conrads as well as an algorithm I give.

Copy link
Contributor Author

@Jim-215-Fisher Jim-215-Fisher Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite understand your point. Yes, dividing Mersenne integer leads to non-uniformity. Multiplying the inverse of the same number reduces the issue significantly.

Copy link

@peteroupc peteroupc Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is indeed to reduce non-uniformity, especially since this code is intended to form part of the standard library and thus be used by thousands of applications with different requirements, and while some of them may not care about the statistical quality of pseudorandom numbers, others may. In this sense, of the methods that involve multiplying or dividing an integer by a constant to create a binary64, the "best" one is to multiply a binary64 integer in [1, 2^53) (which is representable in binary64) by the binary64 constant 1/2^53. This is "better" than multiplying an integer by a Mersenne number or its inverse, because it "better" ensures the binary64 values are evenly spaced in [0, 1] as the result of the multiplication is representable in binary64. This difference is illustrated if we multiply 5201 by either 1/2^53 or 1/2^53-1 (in exact rational arithmetic), then convert the result to binary64 and back. (Notice that I put "best" and "better" in quotes here, in part because, in my opinion, floating-point random numbers should be avoided whenever random integers or random rational numbers suffice.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on Goualard paper, the best one is a binary64 integer in [1, 2^53) multiplying a float of 1/(2^53-1) as shown in Fig 7. in the paper. This is exactly what I implemented in the module. Dividing binary64 integer by 2^53 leads to more non-uniformity as GNU Fortran 9.2.0 does.

@milancurcic
Copy link
Member

@jvdp1 @Jim-215-Fisher Can this PR go forward or are there still some changes needed?

@jvdp1
Copy link
Member

jvdp1 commented Sep 26, 2021

@jvdp1 @Jim-215-Fisher Can this PR go forward or are there still some changes needed?

Links for FORD should be checked before merging. I'll try to check ASAP

@Jim-215-Fisher
Copy link
Contributor Author

All corrections have been applied except links for FORD.

@jvdp1 @Jim-215-Fisher Can this PR go forward or are there still some changes needed?

@jvdp1
Copy link
Member

jvdp1 commented Oct 2, 2021

All corrections have been applied except links for FORD.

@jvdp1 @Jim-215-Fisher Can this PR go forward or are there still some changes needed?

@Jim-215-Fisher I opened a PR with some changes regarding links, meta-info, and specs in your Github directory Jim-215-Fisher#25 . When this is merged I will approve this PR.

@gareth-nx gareth-nx mentioned this pull request Oct 3, 2021
@Jim-215-Fisher
Copy link
Contributor Author

All corrections have been applied except links for FORD.

@jvdp1 @Jim-215-Fisher Can this PR go forward or are there still some changes needed?

@Jim-215-Fisher I opened a PR with some changes regarding links, meta-info, and specs in your Github directory Jim-215-Fisher#25 . When this is merged I will approve this PR.

@jvdp1 The PR has been applied and merged. All checks have passed.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Jim-215-Fisher for this PR. LGTM.

@milancurcic @awvwgk @ivan-pi Could you have a last look at this PR, and merge it if OK?

@gareth-nx @epagone Could you approve this PR, please, if you are happy with it as is?

Copy link

@epagone epagone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much!

@jvdp1
Copy link
Member

jvdp1 commented Oct 4, 2021

Thank you @Jim-215-Fisher . It seems that all comments and requests were answered. If no other comments before tomorrow, then I will merge it.

@jvdp1
Copy link
Member

jvdp1 commented Oct 6, 2021

Thank you @Jim-215-Fisher for this PR. I'll merge it.

@jvdp1 jvdp1 merged commit 089f325 into fortran-lang:master Oct 6, 2021
@Jim-215-Fisher Jim-215-Fisher deleted the Distribution-Uniform branch October 7, 2021 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reviewers needed This patch requires extra eyes topic: statistics Statistical functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants