Introduce generators that respect function domains #348
6838d3b to 7c236b8
Rebased to build on #349.
107b4ab to a16e073
dc1a1da to 385e850
@beetrees @quaternic you both suggested this at different points; would you mind reviewing this? I still have to wire up tests, but I think the generator itself is good (see the above plots). The interesting parts are
crates/libm-test/src/gen/domain.rs
Outdated
/// Number of values near an interesting point to check.
const AROUND: usize = 100;

/// Number of tests to run.
const NTESTS: usize = {
    if cfg!(optimizations_enabled) {
        if crate::emulated()
            || !cfg!(target_pointer_width = "64")
            || cfg!(all(target_arch = "x86_64", target_vendor = "apple"))
        {
            // Tests are pretty slow on non-64-bit targets, x86 MacOS, and targets that run
            // in QEMU.
            100_000
        } else {
            5_000_000
        }
    } else {
        // Without optimizations just run a quick check
        800
    }
};

/// Some functions have infinite asymptotes, limit how many we check.
const MAX_ASYMPTOTES: usize = 10;
I need to refactor this all a bit considering I just copied and pasted `NTESTS` from the random tests. Still brainstorming how to do this (input appreciated) but I'm thinking maybe something like:
- Have `N` number of tests by default, which is 5M in the existing code (but probably needs to be reduced based on the below)
- `f32` unary functions run `N` random tests
- `f64` and `f128` unary functions run `N * 4` random tests
- If the function takes two inputs, multiply the number of tests by 4. If it takes three inputs, multiply by 8.
- If a domain test exists for the function, run `N` domain-based tests and reduce the number of random tests by a factor of 100
- If an exhaustive test for `f32` (not yet implemented) or high-iteration test (`f64`/`f128` or multi-input functions, also not implemented) should be run, still run the "interesting points" portion of domain-based tests but replace the `logspace` tests with whatever fits. I should probably split this into two generators rather than chaining...

Basically in all cases, either the exhaustive tests or the logspace are going to consume the bulk of time. But we still want to run "interesting points" and the random tests in hopes that they will find errors earlier than waiting for the whole exhaustive check to run. Also random tests should cover some signaling NaNs.
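As a rough illustration of the counting rules proposed in this comment, the sketch below computes random and domain-based test counts from a base `N`. All names here (`FloatTy`, `FnInfo`, `TestCounts`, `plan_counts`) are hypothetical and not part of the PR, and combining the float-width and input-count multipliers multiplicatively is an assumption about how the rules are meant to interact.

```rust
/// Hypothetical float width tag; not a type from the PR.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FloatTy {
    F32,
    F64,
    F128,
}

/// Hypothetical description of the function under test.
struct FnInfo {
    float_ty: FloatTy,
    num_inputs: u32,
    has_domain_test: bool,
}

/// Number of tests of each kind to run.
#[derive(Debug, PartialEq)]
struct TestCounts {
    random: u64,
    domain: u64,
}

/// Apply the rules from the comment to a base count `n`.
fn plan_counts(n: u64, info: &FnInfo) -> TestCounts {
    // `f32` runs N random tests; `f64` and `f128` run N * 4.
    let width_factor = match info.float_ty {
        FloatTy::F32 => 1,
        FloatTy::F64 | FloatTy::F128 => 4,
    };
    // Two inputs multiply by 4, three inputs by 8.
    let input_factor = match info.num_inputs {
        1 => 1,
        2 => 4,
        3 => 8,
        other => panic!("unsupported number of inputs: {other}"),
    };
    // Assumption: the two factors multiply together.
    let mut random = n * width_factor * input_factor;
    let mut domain = 0;
    if info.has_domain_test {
        // Run N domain-based tests and cut random tests by a factor of 100.
        domain = n;
        random /= 100;
    }
    TestCounts { random, domain }
}
```

Under these assumptions, an `f32` unary function with a domain test and `N = 5_000_000` would run 5M domain-based tests and 50k random tests.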
Sounds like a reasonable plan to start with, especially checking the "interesting" cases first will make development easier. Ultimately the total number of test cases that get run is mainly a function of how much CI time we're willing to spend running them; it might be worth deciding on a rough estimate for that and then tuning the total test count to fit within it. Also, all the "magic" somewhat-arbitrary constants (`AROUND`, `NTESTS`, `MAX_ASYMPTOTES` etc.) for how many of each type of test to run should probably be centralised in a single file somewhere to make it easier to keep track of what tests are being run.
I now have this as part of #364.
86fe390 to 668ea32
pub const ALL_LEN: usize = 240;

/// All non-infinite non-NaN values of `f8` excluding `-0`.
pub const ALL: [Self; Self::ALL_LEN] = [
Could a compile time for loop be used to generate this array instead of listing all the values out manually? Something like:
pub const ALL: [Self; Self::ALL_LEN] = {
    let mut all = [Self(0); Self::ALL_LEN];
    let mut i = 0;
    let mut next = 0b1_1110_111;
    while next >= 0b1_0000_000 {
        all[i] = Self(next);
        i += 1;
        next -= 1;
    }
    let mut next = 0b0_0000_000;
    while next <= 0b0_1110_111 {
        all[i] = Self(next);
        i += 1;
        next += 1;
    }
    assert!(i == Self::ALL_LEN);
    all
};
(Thought I replied before) it definitely could be simplified, I just like having the table available as a quick reference.
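For reference, the compile-time loop suggested above can be tried in isolation. The sketch below wraps it in a hypothetical `F8` newtype over `u8` (1 sign bit, 4 exponent bits, 3 mantissa bits); the real `f8` type in the PR may be laid out or named differently. Note that the loop as suggested stops at the bit pattern `0b1_0000_000`, which is how it reaches 240 entries.

```rust
/// Hypothetical stand-in for the PR's `f8`: sign (1 bit), exponent (4 bits),
/// mantissa (3 bits), stored as raw bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct F8(pub u8);

impl F8 {
    pub const ALL_LEN: usize = 240;

    /// Finite bit patterns, built at compile time and ordered by value.
    pub const ALL: [Self; Self::ALL_LEN] = {
        let mut all = [Self(0); Self::ALL_LEN];
        let mut i = 0;

        // Sign bit set: walk from the largest-magnitude finite pattern
        // down to `0b1_0000_000`.
        let mut next: u8 = 0b1_1110_111;
        while next >= 0b1_0000_000 {
            all[i] = Self(next);
            i += 1;
            next -= 1;
        }

        // Sign bit clear: walk from zero up to the largest finite pattern.
        let mut next: u8 = 0b0_0000_000;
        while next <= 0b0_1110_111 {
            all[i] = Self(next);
            i += 1;
            next += 1;
        }

        // Fails the build if the loops and `ALL_LEN` ever disagree.
        assert!(i == Self::ALL_LEN);
        all
    };
}
```

The `assert!` runs during constant evaluation, so a mismatch between the loop bounds and `ALL_LEN` would surface as a compile error rather than a test failure.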
668ea32 to 39e3127
94fc25e to 13f4c6c
4e77dcc to 48f8d74
crates/libm-test/src/domain.rs
Outdated
/// The start of this domain, saturating at negative infinity.
pub fn range_start(&self) -> F {
    match self.start {
        Bound::Included(v) | Bound::Excluded(v) => v,
Suggested change:
- Bound::Included(v) | Bound::Excluded(v) => v,
+ Bound::Included(v) => v,
+ Bound::Excluded(v) => v.next_up(),
This is used as the inclusive bound by `get_test_cases`.
I did this intentionally since it seems the bound is probably worth checking even if it is excluded, but yeah this isn't technically correct. Any idea how to express this better?
How about adding `count_up()`/`count_down()` calls for `domain.range_start()` and `domain.range_end()` to `edge_cases::get_test_cases()` so that the values around each edge of the domain are thoroughly tested?
Good call, added this. (Also removed the `is_infinite` checks within `count_up`/`count_down` since they don't serve much purpose and allow this to be cleaner).
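To make the two options in this thread concrete, here is a minimal sketch for `f64` only. The `Domain` struct, the hand-written `next_up`/`next_down` helpers, and the `Vec`-returning `count_up` are simplified assumptions for illustration; the PR's versions are generic over the float type and may differ in signature and behavior.

```rust
use std::ops::Bound;

/// Next representable value above `x` (hand-rolled for illustration).
fn next_up(x: f64) -> f64 {
    if x.is_nan() || x == f64::INFINITY {
        return x;
    }
    if x == 0.0 {
        // Covers both +0.0 and -0.0: step to the smallest positive subnormal.
        return f64::from_bits(1);
    }
    let bits = x.to_bits();
    // For positive values the bit pattern grows with the value; for negative
    // values it shrinks as the value moves toward zero.
    if x > 0.0 { f64::from_bits(bits + 1) } else { f64::from_bits(bits - 1) }
}

/// Next representable value below `x`.
fn next_down(x: f64) -> f64 {
    -next_up(-x)
}

/// Hypothetical simplified domain with optional bounds on each side.
struct Domain {
    start: Bound<f64>,
    end: Bound<f64>,
}

impl Domain {
    /// First value inside the domain, saturating at negative infinity.
    fn range_start(&self) -> f64 {
        match self.start {
            Bound::Included(v) => v,
            Bound::Excluded(v) => next_up(v),
            Bound::Unbounded => f64::NEG_INFINITY,
        }
    }

    /// Last value inside the domain, saturating at positive infinity.
    fn range_end(&self) -> f64 {
        match self.end {
            Bound::Included(v) => v,
            Bound::Excluded(v) => next_down(v),
            Bound::Unbounded => f64::INFINITY,
        }
    }
}

/// `n` consecutive representable values starting at `start` and stepping up.
fn count_up(start: f64, n: usize) -> Vec<f64> {
    let mut out = Vec::with_capacity(n);
    let mut x = start;
    for _ in 0..n {
        out.push(x);
        x = next_up(x);
    }
    out
}
```

With this shape, an excluded bound is never returned as part of the range, while a generator can still probe values on either side of it by stepping from `range_start()` or `range_end()`.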
8c0b796 to f54e3fa
Since these add new API but do not affect runtime, we can enable it for all tests that run with nightly.
Add a constant for negative pi and provide a standalone const `from_bits`, which can be combined with what we already had in `hex_float`. Also provide another default method to reduce what needs to be provided by the macro.
Introduce `f8`, which is an 8-bit float compliant with IEEE-754. This type is useful for testing since it is easily possible to enumerate all values.
Create a type representing a function's domain and a test that does a logarithmic sweep of points within the domain.
Introduce a generator that will test various points of interest including zeros, infinities, and NaNs.
For visualization, add a simple script for generating scatter plots and a binary (via examples) to plot the inputs given various domains.
f54e3fa to f4d97cd
I think this is in a pretty reasonable state so I am going to go ahead and merge. As always, further reviews are welcome.
Introduce a `Domain` type for encoding function input range. This trait is used in `gen::domain` to create sequences of values that are either (1) around interesting points of this domain, or (2) logarithmically spaced within the domain.

Compared to the random generators, this means that we don't waste time checking large quantities of different NaNs or out of bound inputs (e.g. negative numbers and NaNs take up more than half the float space, this would be wasted checking `sqrt` which is only defined for x >= 0). It also means we know that coverage is uniform across the entire domain.

Currently only unary operations are supported.
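One way to get roughly logarithmic spacing is to step evenly through the bit representations between two positive floats, since adjacent exponents are a fixed number of bit patterns apart. The sketch below does this for positive finite `f32` values; the function name `logspace` matches the term used in the review discussion, but this signature and implementation are assumptions and not necessarily what the PR does.

```rust
/// Return `steps` values from `start` to `end` (inclusive), evenly spaced in
/// bit representation. Assumes `0 < start <= end`, both finite, `steps >= 2`.
fn logspace(start: f32, end: f32, steps: u32) -> Vec<f32> {
    assert!(start > 0.0 && start <= end && end.is_finite());
    assert!(steps >= 2);

    // For positive floats, ordering by bits matches ordering by value.
    let lo = u64::from(start.to_bits());
    let hi = u64::from(end.to_bits());
    let last = u64::from(steps) - 1;

    (0..=last)
        .map(|i| {
            // Linear interpolation between the two bit patterns; widening to
            // u64 keeps the intermediate product from overflowing.
            let bits = lo + (hi - lo) * i / last;
            f32::from_bits(bits as u32)
        })
        .collect()
}
```

For example, `logspace(1.0, 16.0, 5)` lands exactly on the powers of two from 1 to 16, because each step advances the exponent field by one.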

This also includes an `f8` type that is just helpful for testing ULP ops since it is easily possible to list all values. I was going to remove this, but it turned out to be useful enough that I think I'll keep it around for future development.