Improve list shrinking #199
Conversation
Resolves #187. Because of the way the list shrinker was set up, it would try to shrink into an empty list every other step. This PR changes the behaviour to try shrinking into an empty list once, at the start of shrinking.
The list shrinker would only ever shrink a single item per step. For long lists this can take a while. This PR adds an additional shrinking strategy: throwing away the first or last half of the list. For long randomly generated lists there are decent odds one of these two attempts will succeed.
        Lazy.List.fromList [ dropFirstHalf listOfTrees, dropSecondHalf listOfTrees ]
            |> Lazy.force
    else
        Lazy.List.empty
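For context, a minimal sketch of what the two halving helpers used in the diff above could look like. Their definitions are not shown in this excerpt, so these bodies are an assumption:

-- Hypothetical definitions; the actual PR code may differ.
dropFirstHalf : List a -> List a
dropFirstHalf list =
    List.drop (List.length list // 2) list

dropSecondHalf : List a -> List a
dropSecondHalf list =
    List.take (List.length list // 2) list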
So what happens when we have a list of 1-4 elements? Say I claim that all lists of ints are sorted. Will it find [1, 0] as a minimal counterexample?
This logic comes in addition to and not instead of the existing 'remove one element' and 'shrink one element' strategies. It adds an extra high-reward strategy as the first thing to try, but always falls back to the more fine-grained strategies if that doesn't work. So the minimal counterexample you mention is still found (I checked it to be sure).
This conditional merely says shrinking by halving should not be attempted for lists smaller than 4 elements; shrinking by a single element is still attempted as before. The reason I added this conditional: once we're at list sizes smaller than 4, 'halving' might sometimes simply remove a single element, which is something the algorithm was already going to try in the 'remove one' phase anyway. To prevent us from trying the same shrink twice, this conditional was introduced.
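Assembled from the diff fragments in this thread, the guarded halving step plausibly looks like the following. This is a reconstruction, not the PR's actual code: the helper name and the exact threshold expression are assumptions, and LazyList and RoseTree are assumed to be in scope as in src/Fuzz.elm.

-- Hypothetical standalone version of the guarded halving step.
halvingShrinks : List (RoseTree a) -> LazyList (List (RoseTree a))
halvingShrinks listOfTrees =
    if List.length listOfTrees >= 4 then
        Lazy.List.fromList [ dropFirstHalf listOfTrees, dropSecondHalf listOfTrees ]
    else
        Lazy.List.empty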
Added a comment in the code to clarify this.
Nice! Can you add a test to make sure that minimal counterexample is still found? There should be some shrinking tests you can copy.
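For illustration, a deliberately failing property along the lines discussed might look like the following. This is a generic elm-test sketch, not the test that was actually added to the PR:

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- A deliberately false claim: shrinking any failing input should end at
-- the minimal counterexample [ 1, 0 ].
allIntListsAreSorted : Test
allIntListsAreSorted =
    fuzz (Fuzz.list Fuzz.int) "all lists of ints are sorted" <|
        \list ->
            list |> Expect.equal (List.sort list)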
Nice optimization. I've only got a few minor comments :)
benchmarks/Snippets.elm
Outdated
        Expect.pass

    _ ->
        Expect.fail "Failed"
Expect.notEqual []?
I like the comment.
src/Fuzz.elm
Outdated
            |> Lazy.force
    else
        -- For lists of three elements or fewer, halving and removing a single element is often going to be the same thing.
        -- To prevent us from attempting the same shrunken value twice, we're disabling the halving strategy for these small lists.
Could we move this comment up to the first line of the if-statement, so that you have it right there when you wonder what that magic number is doing?
I'd probably replace 4 with a slightly larger number (8 maybe) and put a comment like "the list halving shortcut is only useful for large lists" instead of the current comment, since it reads more easily. Alternatively, we could remove the if-statement altogether to make the code even easier to read. How much performance impact does the duplicate work have? Is it noticeable?
I think the idea is that if you halve a list of at most three elements, you could be left with an empty or singleton list. 4 is the smallest number that guarantees each half will not be one of those degenerate cases.
Yeah, sure, but that's fundamentally an optimization; the fuzzer will do (more) duplicate work if you leave that check out, but it'll still be correct. My reasoning is that if it's a performance optimization, it should look like one, so you can glance over it quickly if you just want to know what this function does.
I like this argument!
Thanks for all the feedback and comments! I'm enjoying some holiday this week, will address everything next week.
Looks good other than one stylistic change.
src/Fuzz.elm
Outdated
{- This extends the listShrinkRecurse algorithm with an attempt to shrink directly to the empty list. -}
case listShrinkRecurse listOfTrees of
    Rose root children ->
        Rose root (Lazy.List.cons (RoseTree.singleton []) children)
I like to avoid case expressions with only one case. I think this should work instead: let (Rose root children) = listShrinkRecurse listOfTrees in ...
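Spelled out in full, that suggestion would look something like this. The function name and signature are taken from the later review comment; the rest is an assumption, and Rose, RoseTree, and Lazy.List are assumed to be in scope as in the diff above:

listShrinkHelp : List (RoseTree a) -> RoseTree (List a)
listShrinkHelp listOfTrees =
    let
        -- Destructuring works here because RoseTree has a single constructor.
        (Rose root children) =
            listShrinkRecurse listOfTrees
    in
        Rose root (Lazy.List.cons (RoseTree.singleton []) children)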
Feels like there should be a mapping function for the children field here.

listShrinkHelp : List (RoseTree a) -> RoseTree (List a)
listShrinkHelp listOfTrees =
    listShrinkRecurse listOfTrees |> mapChildren (Lazy.List.cons (RoseTree.singleton []))

withChildren newChildren (Rose root oldChildren) =
    Rose root newChildren
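The mapping helper hinted at above could be defined as follows. This is a sketch; mapChildren is not shown anywhere in this thread, so its shape here is an assumption:

-- Apply a function to a rose tree's children, keeping the root.
mapChildren : (LazyList (RoseTree a) -> LazyList (RoseTree a)) -> RoseTree a -> RoseTree a
mapChildren f (Rose root children) =
    Rose root (f children)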
Good idea! I made the change.
I forgot to press submit yesterday :( Ah well, just a minor comment on listShrinkHelp.
Thanks for the feedback @drathier, I made the change. I kept
Cool, merging. Thanks for the reviews!
This PR aims to improve the performance of shrinking large lists. I've noticed fuzz tests occasionally hang when shrinking lists containing large fuzzed items. Especially when you're fuzzing nested list structures, shrinking can take a while because it shrinks one item at a time.
This PR contains two changes:
- Try shrinking into an empty list once, at the start of shrinking, instead of every other step.
- Add a halving strategy: attempt to throw away the first or last half of the list before falling back to single-element shrinking.
In the benchmarks (which I made marginally more realistic) these changes result in a more than 100x performance improvement for shrinking a list of ints.
It is debatable whether I'm picking 'the right halves' to try to throw away. Other possible strategies include removing all even or odd elements, or removing elements at random. My feeling, unsupported by evidence, is that when a test fails on a large list, 9 out of 10 times the problem will be relatively simple. That means there are decent odds the problem lies entirely within either half of the list. Further optimisations might be possible for the case where you're discovering a complex problem arising from many elements in the list working in concert, but waiting a bit longer for the shrinker in those scenarios seems ok to me for now.
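As an illustration of one such alternative strategy (not part of this PR), dropping every other element could look like this; the name is hypothetical and the % operator matches the Elm version used elsewhere in this codebase:

-- Hypothetical alternative shrinking step: drop the elements at even
-- indices, keeping the odd-indexed ones.
dropEvenIndexed : List a -> List a
dropEvenIndexed list =
    list
        |> List.indexedMap (\i x -> ( i, x ))
        |> List.filter (\( i, _ ) -> i % 2 == 1)
        |> List.map Tuple.second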
The same optimisation can be made for string shrinking, but this PR does not contain it because the string shrinker lives in elm-shrink.