Support composable matchers #393

myronmarston · 2013-12-19T07:39:32Z

This is still a bit WIP but it's much further along than the last WIP PR and I wanted to open the PR and start the code review process. I applied the things discussed with @xaviershay and @JonRowe from #388.

Still TODO:

The last item in particular is stumping me -- I need help on that one! See the pending spec I left in match_array_spec.rb for what I'm talking about.

Even though I have a lot of TODOs left, feedback is welcome whenever anyone wants to give it. I'll be sure to comment here when I everything I'm planning to do is complete.

/cc @xaviershay @JonRowe @soulcutter @samphippen @alindeman

Closes #280.

xaviershay · 2013-12-21T02:23:58Z

For the match_array problem, one option would be to brute force permutation and pick the "best" match.

https://coderpad.io/078048

Sorry don't have enough time over the next week to flesh out the idea.

myronmarston · 2013-12-21T08:34:51Z

Thanks, @xaviershay. I pushed a branch with that implementation (brute-force-match-array) -- see 78bdd38. Unfortunately, it's ridiculously slow, because it's an O(N!) algorithm...which is is pretty much the worse time complexity you can get. Check out this growth (from def7863):

    Top 6 slowest examples (61.67 seconds, 100.0% of total time):
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 10 items
        56.41 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 9 items
        4.66 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 8 items
        0.53014 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 7 items
        0.05838 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 6 items
        0.00974 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 5 items
        0.00212 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269

Compared to what we had before:

    Top 6 slowest examples (0.00324 seconds, 100.0% of total time):
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 10 items
        0.00067 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 9 items
        0.00062 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 8 items
        0.00056 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 6 items
        0.00054 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 7 items
        0.00047 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269
      Composing `match_array` with other matchers expect(...).to match_array([matcher, matcher]) works with arrays of 5 items
        0.00038 seconds ./spec/rspec/matchers/built_in/match_array_spec.rb:269

A minute to compare two arrays of 10 elements each is clearly not going to fly. I think I'm going to go back and see if I can get the stable marriage algorithm to work, unless you have any better ideas.

On a side note, even what we had before is slower than I would like: 6 ms to evaluate one matcher expression is going to be the primary bottle neck in a true unit test (most of my unit tests average ~2ms or so). I think that's simply the nature of having to do all the comparisons for match_array to be as flexible as it is, so I'm not sure we can do anything about it, but we should probably add note to the docs, telling people that it's slow compared to expect(array.sort).to eq(%w[ a b c d e ]), so in fast unit tests we recommend using that instead of people can.

myronmarston · 2013-12-21T08:36:39Z

Actually, I realized I just read that wrong: I said it was taking 6 ms to evaluate the matcher expression but it looks like it's actually .6 ms...which isn't so bad for the awesome flexibility match_array gives you.

myronmarston · 2013-12-21T08:39:52Z

And, FWIW, here's an implementation of the stable marriage algorithm in ruby:

http://rosettacode.org/wiki/Stable_marriage_problem#Ruby

xaviershay · 2013-12-21T10:23:56Z

I expected slow, but not that slow.

myronmarston · 2013-12-21T16:34:28Z

I expected slow, but not that slow.

Yep, it was definitely slower than I expected. Then again, factorial is the fastest growing function I can think of...

myronmarston · 2013-12-23T18:39:51Z

So I've made a bit of progress on the matching algorithm. I eventually decided the stable marriage thing was a dead end (I couldn't find a way to model our problem in terms of the requirements of the stable marriage algorithm), but I've come up with an alternate approach that I think will perform quite well:

Compare every expected item against every actual item to get, for a each item, a list of matches from the other list.
Put all all expected items in a remaining_expected_items list, and all actual items in a remaining_actual_items list. This represents the items that still have to be dealt with: they must either be paired, or put in the final extra_items list or put in the final missing_items list.
Resolve any definitive results:
- Any expected items for which there are no matches get moved from remaining_expected_items to extra_items.
- Any actual items for which there are no matches get moved from remaining_actual_items to missing_items.
- Any expected/actual item pairs that are reciprocally matched only to each other get removed from remaining_expected_items and remaining_actual_items.
From here, do a brute force trial & error/backtracking approach:
- Take any expected item. Choose one of its actual item matches, and try pairing them. Remove them from remaining_expected_items and remaining_actual_items, and update the matches lists for each remaining item to no longer include these paired items (since we are eliminating them as options for any other items to pair with).
- Resolve any definitive results that come out of this trial, as explained in support diffing #3.
- Iterate.
As we do this, we keep track of the best solution seen so far. This will give us the extra_items and missing_items used in the failure message. If we get a solution that pairs everything up, we can abort early and consider the matcher to pass.

I feel pretty good about this algorithm, for a few reasons:

It's easy to reason about: I don't have a formal proof of its correctness, but it's pretty easy to see that as long as it's implemented correctly, it should always work properly.
Besides giving us a pass/fail status, this also gives us the extra_items and missing_items needed for the failure message.
It performs quite well (even though it boils down to a brute force approach in the end!). In the common case of every expected item matching against 1 actual item, the brute force part won't even come into play. It'll perform the same as it does now (which is basically O(N^2)). The number of brute force iterations required is a function of how many expected items match against multiple actual items (rather than being a function of array length, as the first attempt was).

I've got some of the plumbing for this solution already in place in a local branch. The part I'm struggling a bit with is the brute force iteration: we need to have a way to try fixing a pair to see where that goes, w/o destroying what we have so far, so that we can later pop the brute force stack and try a different pairing. I think that doing this easily basically comes down to having good abstractions, but I'm having a hard time naming things because it's already so abstract. (Plus, I've had a lot of distractions every time I've tried to work on this).

@xaviershay -- what do you think of that solution? Would you want to pair on it some time?

xaviershay · 2013-12-23T20:28:48Z

Algorithm sounds good. I'm not going to be able to pair on it before the new year, have quite a bit of travel and family stuff over the next week.

myronmarston · 2013-12-26T06:34:17Z

OK, I've implemented the algorithm discussed above. It's working, and it's much faster than the first brute force approach that @xaviershay and I tried. However, it still has some perf issues -- not based on how big the arrays are but based on how many duplicates/multi-matches there are. Also, I'm not super happy with the algorithm implementation -- it could definitely use some refactoring.

Include it in `failure_message` and `failure_message_when_negated` so the matchers don't have to override all 3 if they are just customization the description part of it.

We can leverage the instance state of the matcher object.

{:a => 1, :b => 2} ...reads much better than: {:a=>1, :b=>2}

- Leverage instance variable state. - Leverage the description/failure messages provided by BaseMatcher. - Remove ternaries.

xaviershay · 2014-01-01T22:17:34Z

lib/rspec/matchers/composable.rb

@@ -39,6 +41,98 @@ def or(matcher)
      def ===(value)
        matches?(value)
      end
+
+    private unless defined?(::YARD)


haven't seen this before ... because YARD doesn't include private methods? Does it include protected?

➜ rspec-mocks git:(master) yard doc --help | grep private --private Show private methods. (default hides private) --no-private Hide objects with @private tag

Private methods are hidden in the docs by default. This makes a lot of sense: private methods are generally not part of your public API.

Here I wanted these methods to be private (as they are only ever intended to be called from within a matcher on self) but since this is a mixin intended for users to use in their custom matchers, I wanted these methods doc'd. This definitely felt like a hack.

Do you think it's better to just make the methods public? I'm on the fence.

Making them ruby public is fine - we've docced them as api private

Actually, they are doc'd as api public. The problem here is that there are two notions of visibility:

Ruby's notion: private vs protected vs public. This affects whether you can call a method with an explicit receiver or not.

API notion: private vs public: Things docs as api private are subject to change in minor or patch releases with no deprecation warnings or notices. They are "off limits" for end users to use directly if they don't want their code breaking when they upgrade. API public means we are committed to not breaking it in a minor or patch release as per SemVer.

Here we've got methods we'd like to be ruby private but API public.

xaviershay · 2014-01-01T22:25:57Z

it suggests that it would fail if the the yielded arg is not a string, but it doesn't check that

I'm not too worried about this. In my experience most people don't think about explicitly checking types of their objects, and would expect this to pass with a string-like object. (Though it is a bit weird in ruby, string is an array-like object.)

Should we have rspec-expectations overwrite the rspec-mocks argument matchers?

Is there existing code in either library that soft depends on the other like this? If yes, we should keep that direction the same. If no, does pushing some part of the code down into rspec-support make sense?

myronmarston · 2014-01-01T22:29:20Z

Is there existing code in either library that soft depends on the other like this?

rspec-mocks defines a couple matchers (e.g. have_received), but it's a very soft dependency: it simply implements the same protocol, and it can work on its own w/o rspec-expectations.

If no, does pushing some part of the code down into rspec-support make sense?

Maybe -- I'll think more about that.

xaviershay · 2014-01-01T22:54:26Z

I'm ok with a soft-dependency, just as long as we don't have one both ways :)

- The `RSpec::Matchers` API was getting very large with all the aliases, which makes it harder to keep all in your head. - It's better to start with fewer and then add more in future releases as there is demand. Once they are there, we can't remove them from a future minor/patch release because of server. - `RSpec::Matchers.alias_matcher` is provided so users can define new aliases.

It was going to be very hard to document how this worked, and what phrasing forms are supported. Instead, we simply allow users to define their own aliases using `alias_matcher`.

We want to keep the solution with the fewest unmatched items, not the most unmatched items.

myronmarston · 2014-01-02T02:33:38Z

@xaviershay -- I think the perf is still unsatisfactory -- see 09734a7. I'm going to try to see if I can optimize it a bit.

There are still perf problems :(.

When recursing during the brute force depth-first search, pass along only the indeterminate indexes, so that it runs on a smaller subproblem. For this to work, we had to switch to using a hash, so that it can handle index gaps. This provides mixed results: it makes some things faster, some things slower. I think it's an improvement overall.

It's surprising how much difference this makes!

This is much, much faster than the full matching algorithm, and in common cases (e.g. arrays of numbers or strings) works just fine. See the benchmarks for how much of a difference this makes!

myronmarston · 2014-01-02T07:59:08Z

@xaviershay -- I found some fairly easy optimizations to make that make the common cases blazing fast, and make the slower cases about 2x faster. It's still slower than I would like it for those slower cases, but I'm at a point where I think I'm OK with it for now and can move on to something else. Let me know what you think of my optimizations -- I may have missed something or there may be further low-hanging fruit that these optimizations suggest.

[ci skip]

xaviershay · 2014-01-02T22:57:34Z

LGTM

Support composable matchers

xaviershay · 2014-01-02T22:59:07Z

So we can ship 3 now right? :P

myronmarston · 2014-01-02T23:04:45Z

So we can ship 3 now right? :P

Depends on your definition of "now" :).

JonRowe · 2014-01-02T23:06:34Z

I'm going to work on the formatter stuff more this weekend, when thats done then we can ship RSpec 3 :P

myronmarston · 2014-01-02T23:08:19Z

I'm going to work on the formatter stuff more this weekend, when thats done then we can ship RSpec 3 :P

Well, we can ship beta2. Then I plan to do my core/mocks/expectations code audit and see what else (if anything) we want before RC and final.

JonRowe · 2014-01-02T23:20:25Z

Aye, mostly was just reminding peeps more stuff existed ;)

myronmarston mentioned this pull request Dec 23, 2013

Consider renaming (or aliasing) match_array to contain_exactly #398

Closed

myronmarston added 20 commits December 30, 2013 21:48

Update base matcher to better leverage description.

b030ec5

Include it in `failure_message` and `failure_message_when_negated` so the matchers don't have to override all 3 if they are just customization the description part of it.

Refactor include matcher a bit: pass around fewer args.

ec07bb1

We can leverage the instance state of the matcher object.

Improve hash formatting used in include matcher.

aae1357

{:a => 1, :b => 2} ...reads much better than: {:a=>1, :b=>2}

Refactor start_with and end_with matchers.

1cb0e96

- Leverage instance variable state. - Leverage the description/failure messages provided by BaseMatcher. - Remove ternaries.

Remove old rcov artifacts.

cabb200

Rename spec file.

55c8686

Refactor to use the new FuzzyMatcher from rspec-support.

622c4c4

Make change composable with other matchers.

ce44330

Make include matcher fully composable.

0d739bb

Make raise_error matcher fully composable.

1d64bc0

Make match_array fully composable.

1e13557

Make start_with and end_with fully composable.

bbac8aa

Make throw_symbol fully composable.

3602666

Make yield_with_args fully compose with other matchers.

2c19cb4

Make yield_successive_args matcher composable with other matchers.

74154e5

Provide a default description override for alias matchers.

95a9048

Setup infrastructure to specify matcher aliases.

fef04e8

Add some more matcher aliases.

2671393

Add aliases for be_xyz predicate matchers.

c22c052

Add aliases for be operator value.

e9d22d1

xaviershay reviewed Jan 1, 2014
View reviewed changes

myronmarston added 5 commits January 1, 2014 15:52

Improve docs based on feedback from @xaviershay.

dd5cdd4

Remove method_missing support for dynamic predicate matcher aliases.

ce06693

It was going to be very hard to document how this worked, and what phrasing forms are supported. Instead, we simply allow users to define their own aliases using `alias_matcher`.

Improve docs and readability of PairingsMaximizer.

47c543d

Fix comparison operator.

8d33cfd

We want to keep the solution with the fewest unmatched items, not the most unmatched items.

myronmarston added 8 commits January 1, 2014 18:34

Add match_array/contain_exactly benchmarks.

09734a7

There are still perf problems :(.

Refactor: extract a solution struct.

4419b12

Use correct spelling of "indeterminate".

96bfc0f

Update pairings maximizer spec based on current API.

3ffd27e

Add rubyprof script so I can see troubleshoot perf issues.

1167771

Use our own values_match? logic for better perf.

f8a5b03

It's surprising how much difference this makes!

Try a fast sorted matching algorithm first.

ae12b4a

This is much, much faster than the full matching algorithm, and in common cases (e.g. arrays of numbers or strings) works just fine. See the benchmarks for how much of a difference this makes!

Add changelog entry for alias_matcher.

b0fde19

[ci skip]

myronmarston mentioned this pull request Jan 2, 2014

Reconcile duplicatated (but slightly different) matchers between rspec-mocks and rspec-expectations rspec/rspec-mocks#513

Closed

myronmarston added a commit that referenced this pull request Jan 2, 2014

Merge pull request #393 from rspec/support-composable-matchers

7736337

Support composable matchers

myronmarston merged commit 7736337 into master Jan 2, 2014

myronmarston deleted the support-composable-matchers branch January 2, 2014 22:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support composable matchers #393

Support composable matchers #393

myronmarston commented Dec 19, 2013

xaviershay commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 21, 2013

xaviershay commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 23, 2013

xaviershay commented Dec 23, 2013

myronmarston commented Dec 26, 2013

xaviershay Jan 1, 2014

myronmarston Jan 1, 2014

xaviershay Jan 1, 2014

myronmarston Jan 1, 2014

xaviershay commented Jan 1, 2014

myronmarston commented Jan 1, 2014

xaviershay commented Jan 1, 2014

myronmarston commented Jan 2, 2014

myronmarston commented Jan 2, 2014

xaviershay commented Jan 2, 2014

xaviershay commented Jan 2, 2014

myronmarston commented Jan 2, 2014

JonRowe commented Jan 2, 2014

myronmarston commented Jan 2, 2014

JonRowe commented Jan 2, 2014

Support composable matchers #393

Support composable matchers #393

Conversation

myronmarston commented Dec 19, 2013

xaviershay commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 21, 2013

xaviershay commented Dec 21, 2013

myronmarston commented Dec 21, 2013

myronmarston commented Dec 23, 2013

xaviershay commented Dec 23, 2013

myronmarston commented Dec 26, 2013

xaviershay Jan 1, 2014

Choose a reason for hiding this comment

myronmarston Jan 1, 2014

Choose a reason for hiding this comment

xaviershay Jan 1, 2014

Choose a reason for hiding this comment

myronmarston Jan 1, 2014

Choose a reason for hiding this comment

xaviershay commented Jan 1, 2014

myronmarston commented Jan 1, 2014

xaviershay commented Jan 1, 2014

myronmarston commented Jan 2, 2014

myronmarston commented Jan 2, 2014

xaviershay commented Jan 2, 2014

xaviershay commented Jan 2, 2014

myronmarston commented Jan 2, 2014

JonRowe commented Jan 2, 2014

myronmarston commented Jan 2, 2014

JonRowe commented Jan 2, 2014