Add edge bias feature to bias for testing edge cases. #515
base: main
Conversation
* Works by randomly changing some generated values to be one of the random edge cases. The first tests are done with 100% edge cases, and then the chance goes down over the number of test cases until it's 0%.
* The average amount of edge-case tests can be controlled by the edge_bias configuration option. It defaults to 0.25.
* On failure, the edge bias is persisted to the persistence file along with the seed, so it should be reproducible.
* Also fix the time::Duration test, because the edge bias system found a problem in it. (Might need more modifications.)
* Fix the float_samplers test subsequent_splits_always_match_bounds.
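To make the mixing mechanism concrete, here is a minimal sketch of the idea described above; the function and parameter names are illustrative assumptions, not the PR's actual code:

use rand::Rng;

// Illustrative sketch only: with probability `current_edge_bias`, substitute
// a randomly chosen edge case for the ordinary uniform draw.
fn next_value<R: Rng>(rng: &mut R, current_edge_bias: f64, edge_cases: &[u32]) -> u32 {
    if !edge_cases.is_empty() && rng.gen::<f64>() < current_edge_bias {
        edge_cases[rng.gen_range(0..edge_cases.len())] // e.g. 0, 1, u32::MAX
    } else {
        rng.gen() // plain uniform generation
    }
}

Early test cases would call this with current_edge_bias near 1.0; later cases use a value decaying toward 0.0.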
seed = line[0..comma_position].parse::<PersistedSeed>().ok();
// The result is safe to ignore to not spam the log in case old
// cases exist, in which case edge_bias will be 0 to match old
// behaviour.
a little confused here, wouldn't the `,` already be the token dictating whether or not there's an edge bias? also, `,` is a fairly arbitrary delimiter. if we're extending the seed parsing, maybe we can choose something more explicit so it doesn't conflict with future metadata someone may wish to add? how about `@e=` as the token: `@` is not base-16, so it's a clear delimiter that this is new metadata, and `e=` means "edge equals". thoughts?
It might take me a while to remember what I wrote over a year ago, so I'm sorry if I make any mistakes here.
I think the idea wasn't to indicate that edge bias was used, but to handle and store the seed for the edge bias separately, so that the old persistence files would be compatible. Right now I'm not entirely sure if that's necessary; it might need some more investigation/thinking.
pub(crate) fn from_base16(dst: &mut [u8], src: &str) -> Option<()> {
this function looks a bit awkward because the return value is easy to ignore given `dst`. what if we returned a result that contained the dst value instead? maybe something like:
pub(crate) fn from_base16<const N: usize>(
    src: &str,
) -> Result<[u8; N], &'static str> {
    if src.len() != N * 2 {
        return Err("Input length does not match expected buffer size.");
    }
    let mut buffer = [0u8; N];
    for (i, chunk) in src.as_bytes().chunks(2).enumerate() {
        let hex_str = std::str::from_utf8(chunk)
            .map_err(|_| "Input contains invalid UTF-8.")?;
        buffer[i] = u8::from_str_radix(hex_str, 16)
            .map_err(|_| "Failed to parse hex chunk as hexadecimal.")?;
    }
    Ok(buffer)
}
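If adopted, a call site would pin the array size through the const parameter, and since `Result` is `#[must_use]`, silently dropping a failure now triggers a compiler warning. A hypothetical usage (the variable names are illustrative):

let seed: [u8; 16] = from_base16::<16>(hex_str)
    .expect("persistence file contains a malformed seed");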
Honestly, I don't remember what I was thinking there... Sounds like a better approach.
if x < p {
    self.current_edge_bias = 1f32 + x * (p - 1f32) / p;
} else {
    self.current_edge_bias = (x - 1f32) * p / (p - 1f32);
}
I think we should add a bit more documentation. maybe something like:
"""
Separate the curve into two linear parts whose total area
is p. Cases of p <= 0 and p >= 1 are handled above.
- Before x reaches p: Bias decreases from 1.0 toward p.
- After x reaches p: Bias decreases from p toward 0.0.
The first region ensures that edge cases are mixed into tests. The second region ensures we are testing all cases uniformly.
"""
Sounds like a pretty good description.
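As a quick sanity check on that description, numerically integrating the two-piece curve shows its average over x in [0, 1) is p, i.e. edge_bias really is the expected fraction of edge-case tests. A standalone sketch, not the PR's test code:

fn bias(x: f32, p: f32) -> f32 {
    if x < p {
        1f32 + x * (p - 1f32) / p
    } else {
        (x - 1f32) * p / (p - 1f32)
    }
}

fn main() {
    let p = 0.25f32;
    let n = 1_000_000;
    let avg = (0..n)
        .map(|i| bias(i as f32 / n as f32, p) as f64)
        .sum::<f64>()
        / n as f64;
    println!("average bias = {avg:.4}"); // ≈ 0.2500, matching p
}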
static_map(any::<(u64, u32)>(), |(a, b)|
    // Duration::new panics if nanoseconds are over one billion (one second)
    // and overflowing into the seconds would overflow the second counter.
    Duration::new(a, b % 1_000_000_000)
I believe we can do a kind of rejection sampling here: since the distribution is uniform, we can just select for the values we want in our range.
let mut nanos = b;
while nanos >= 1_000_000_000 {
    nanos = runner.rng().gen();
}
Duration::new(a, nanos)
however, given we may have edge bias, that sort of throws this sampling off. maybe one option is: when we detect an edge value for `b` that is > 1_000_000_000, we pin to 1_000_000_000, since that is our edge. otherwise we perform uniform rejection sampling, with the assumption that any non-edge value was chosen randomly and uniformly, and any edge value has negligible probability of appearing by comparison. thoughts?

side note - you wrote "Duration::new panics if nanoseconds are over one billion", but you have 1_000_000_000 % 1_000_000_000 = 0, which does not include 1_000_000_000; i'm assuming you wanted to include 1_000_000_000, however.
This one I remember something about. This change is more to fix a broken test that was detected because of the edge bias feature than to actually test the edge bias feature. The test was always broken, but it would only cause an issue if `a` happened to randomly end up as u64::MAX, which is very unlikely without an edge bias feature. And it's a bit of a justification for the feature's existence.
I do know % 1_000_000_000 will have a slight bias towards lower numbers since the nanoseconds are u32, but in reality it shouldn't really matter. And yes, of course it will test a lot of Duration::new(a, 294967295) (where 294967295 is u32::MAX % 1_000_000_000), which is not very useful... but not very harmful either. Also, internally in Duration nanoseconds are u32 right now, but ::from_nanos for example returns u64, so I was worried about things like if someone changes the type to u64 without thinking, it would freeze the tests pretty badly if rejection sampling was used.
Actually, I think it already handles ranges well enough, too. Just that the original test didn't use them either. And actually I don't think it should use the ranges for the nanoseconds, since overflowing nanos into seconds is valid, as long as seconds are not u64::MAX (or close). Although right now I'm not exactly sure what the deeper meaning of the test's existence is...
you wrote "Duration::new panics if nanoseconds are over one billion" but you have 1_000_000_000 % 1_000_000_000 = 0 which does not include 1_000_000_000, i'm assuming you wanted to include 1_000_000_000 however.
Actually, you are correct. Although in this case the bug is also in the Rust documentation: https://doc.rust-lang.org/std/time/struct.Duration.html#method.new
It states:
If the number of nanoseconds is greater than 1 billion (the number of nanoseconds in a second), then it will carry over into the seconds provided.
§Panics
This constructor will panic if the carry from the nanoseconds overflows the seconds counter.
It should say "If the number of nanoseconds is greater than OR EQUAL TO 1 billion", which you can see with:
use std::time::Duration;

fn main() {
    println!("{:?}", Duration::new(u64::MAX, 1_000_000_000 - 1));
    println!("{:?}", Duration::new(u64::MAX, 1_000_000_000));
}
So using % 1_000_000_000 is correct, but the comment is incorrect.
So using % 1_000_000_000 is correct, but the comment is incorrect.
nice find!
And actually I don't think it should use the ranges for the nanoseconds, since overflowing nanos into seconds is valid, as long as seconds are not u64::MAX (or close).
ah i see, one option is we could simply use modulo if we have a value for seconds within some epsilon of u64::MAX. another option, not sure why i didn't think of this sooner, is making the range explicit: (any::<u64>(), 0..1_000_000_001u32), where 1_000_000_001 is exclusive. would either of those make sense?
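For what it's worth, the second option might look roughly like this as a proptest strategy; the guard clamping seconds away from u64::MAX is my addition to dodge the one remaining panicking combination, not something proposed above:

use proptest::prelude::*;
use std::time::Duration;

// Sketch: explicit range for the nanosecond component. 1_000_000_001 is
// exclusive, so 1_000_000_000 itself (the interesting edge) still appears.
fn arb_duration() -> impl Strategy<Value = Duration> {
    (any::<u64>(), 0u32..1_000_000_001u32).prop_map(|(secs, nanos)| {
        // Duration::new only panics when the nanosecond carry overflows the
        // seconds counter, i.e. secs == u64::MAX && nanos == 1_000_000_000.
        let secs = if nanos == 1_000_000_000 {
            secs.min(u64::MAX - 1)
        } else {
            secs
        };
        Duration::new(secs, nanos)
    })
}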
And actually I don't think it should use the ranges for the nanoseconds, since overflowing nanos into seconds is valid, as long as seconds are not u64::MAX (or close).
Reading back, what I wrote in the last comment doesn't make any sense. I was thinking it should be OK to test the overflowing thing, but of course it's not overflowing now, so it might as well use the range...
Or do something silly like
arbitrary!(Duration, SMapped<(u64, u32), Self>;
    static_map(any::<(u64, u32)>(), |(a, b)| {
        let c = if b as u64 / 1_000_000_000 > u64::MAX - a {
            b % 1_000_000_000
        } else {
            b
        };
        // Duration::new panics if nanoseconds are one billion (one second) or over
        // and overflowing into the seconds would overflow the second counter.
        Duration::new(a, c)
    }));
I think in the end that test is not super important anyway... Personally I would just leave it as a range or modulo.
I like the approach @ajantti, left some comments.
replaces #369