Better internal implementation for 'filtered choice' #1885

Zac-HD · 2019-03-20T12:49:32Z

As discussed in the shrinking guide, we have an internal trick that can make sampling from a filtered list much more efficient. It would be great to translate sampled_from(...).filter(...) into this form internally!

On the implementation side, the most elegant way to do this is probably to extend FilteredStrategy.do_draw with special cases for strategies that are choosing from a fixed number of elements, i.e.:

1. Choose a random element, and if it satisfies the filter, return it. Try this three times, because rejection sampling has low overhead.
2. [new bit] If the wrapped strategy is a sampled_from of reasonable size (e.g. <1k elements; this could extend to booleans() and just() in future):
   - We're going to calculate a list of allowed values, pick one of them, and write that element's index in the full list to the buffer (see shrinking guide for details).
   - Start by choosing an index k into the filtered list. We don't know how many allowed elements there are, so assume it's one less than the size of the unfiltered list (as we've already failed a random draw).
   - Create the filtered list until we get to the kth element. This is up to twice as fast on average as creating the full list - less if very few elements are allowed, but probably more when we're shrinking.
   - If there turn out to be too few allowed elements for index k, pick a new k which is a valid index. If there are no valid elements, fall out of this condition - it'll be noted and marked invalid as if rejection sampling failed and we didn't try this special case.
   - Write the index of the chosen element in the unfiltered list to the buffer, and return the element.
3. Give up, because if there is a valid element we can't find it.
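
To make the steps above concrete, here is a minimal standalone sketch. It does not use Hypothesis internals: `filtered_choice`, `GIVE_UP`, the `rng` argument and the plain-list `buffer` are all hypothetical stand-ins for illustration, and the thresholds (three rejection attempts, 1000 elements) are just the figures suggested above. As a simplification, only the final accepted index is recorded in the buffer, and `elements` is assumed to be non-empty.

```python
import random

# Hypothetical sentinel meaning "no allowed element was found".
GIVE_UP = object()


def filtered_choice(elements, predicate, rng, buffer,
                    max_rejections=3, max_special_size=1000):
    """Illustrative sketch of the proposed 'filtered choice' steps.

    `buffer` is a plain list standing in for the real byte buffer; the
    index of the chosen element in the *unfiltered* list is appended to it.
    """
    n = len(elements)  # assumed to be at least 1

    # Step 1: cheap rejection sampling, tried a few times.
    for _ in range(max_rejections):
        i = rng.randrange(n)
        if predicate(elements[i]):
            buffer.append(i)
            return elements[i]

    # Step 2: special case for reasonably small collections.
    if n <= max_special_size:
        # Guess that at most n - 1 elements are allowed, because at
        # least one random draw has already been rejected.
        k = rng.randrange(max(n - 1, 1))

        # Build the list of allowed indices lazily, stopping once the
        # kth allowed element (zero-based) has been found.
        allowed = []
        for i, value in enumerate(elements):
            if predicate(value):
                allowed.append(i)
                if len(allowed) > k:
                    break

        if allowed:
            # Too few allowed elements for k: pick a valid index instead.
            if len(allowed) <= k:
                k = rng.randrange(len(allowed))
            i = allowed[k]
            buffer.append(i)
            return elements[i]

    # Step 3: give up; the caller would mark the draw as invalid.
    return GIVE_UP
```

With a filter that allows a single element out of a hundred, rejection sampling will usually fail three times, but the special case still finds the element and records its position in the unfiltered list, which is what lets the index (rather than a sequence of rejected draws) be shrunk.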

Where this gets a bit hairy is in the interaction with LazyStrategy and delayed validation, so we'll probably need to fiddle with the delegation of the do_draw or filter methods a bit to have it work. That's why we have this write-up of the discussion from #1862.


Zac-HD · 2019-03-21T13:11:42Z

Per discussion on #1887, we should ensure that this is also implemented for unique collections via lists(..., unique=True).

Zac-HD added the enhancement (it's not broken, but we want it to be better) and test-case-reduction (about efficiently finding smaller failing examples) labels Mar 20, 2019

Zac-HD mentioned this issue Mar 21, 2019

Guidelines for well shrinking strategies #1887

Closed

Zac-HD self-assigned this Mar 23, 2019

Zac-HD mentioned this issue Mar 23, 2019

"Make our own luck" in sampled_from(...).filter(...) #1890

Closed

Zalathar mentioned this issue Mar 31, 2019

Add special-case filtering for SampledFromStrategy #1904

Merged

Zac-HD closed this as completed in #1904 Apr 3, 2019

pckroon mentioned this issue Sep 10, 2019

Implement better edge generation strategy pckroon/hypothesis-networkx#13

Merged
