Update substring search to use the Two Way algorithm #26327

bluss · 2015-06-15T22:13:48Z

Update substring search to use the Two Way algorithm

To improve our substring search performance, revive the two way searcher
and adapt it to the Pattern API.

Fixes #25483, a performance bug: that particular case now completes faster
in optimized rust than in ruby (but they share the same order of magnitude).

Many thanks to @gereeter who helped me understand the reverse case
better and wrote the comment explaining next_back in the code.

I had quickcheck to fuzz test forward and reverse searching thoroughly.

The two way searcher implements both forward and reverse search,
but not double ended search. The forward and reverse parts of the two
way searcher are completely independent.

The two way searcher algorithm has very small, constant space overhead,
requiring no dynamic allocation. Our implementation is relatively fast,
especially due to the byteset addition to the algorithm, which speeds
up many no-match cases.

A bad case for the two way algorithm is:

let haystack = (0..10_000).map(|_| "dac").collect::<String>();
let needle = (0..100).map(|_| "bac").collect::<String>());

For this particular case, two way is not much faster than the naive
implementation it replaces.

rust-highfive · 2015-06-15T22:13:55Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @aturon (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

bluss · 2015-06-15T22:15:03Z

cc @BurntSushi, you know substring search very well, cc @Kimundi about the pattern API

bluss · 2015-06-15T22:22:16Z

I'll show some benchmark results. My workspace for this is temporarily hosted as a repo, if you want to check the quickcheck & benchmark implementations.

find and rfind are current libcore's str::find(&str) and str::rfind(&str). tw_contains is two-way forward search to the first match (if any) and tw_contains_rev is two-way reverse search, and tw_paper is an unmodified transcription of two-way from the paper.

I got a bit pessimistic when I tried out the different bad cases I could find. The performance bug #25483 is the benchmark called aa_100k below. From experience, the main improvement in runtime for that benchmark comes from the byteset optimization, not due to the TW algorithm itself.

Look at the short_short and short_long benchmarks for something of a more normal case for substring search.

Edit: Updated to current benchmarks. The old results are archived.

running 45 tests
test bad_naive::find                                    ... bench:       2,235 ns/iter (+/- 31) = 33 MB/s
test bad_naive::rfind                                   ... bench:       2,184 ns/iter (+/- 60) = 33 MB/s
test bad_naive::tw_contains                             ... bench:         418 ns/iter (+/- 201) = 177 MB/s
test bad_naive::tw_contains_rev                         ... bench:         528 ns/iter (+/- 119) = 140 MB/s
test bad_naive::tw_paper                                ... bench:         585 ns/iter (+/- 124) = 126 MB/s
test bad_naive_reversed::find                           ... bench:         657 ns/iter (+/- 41) = 112 MB/s
test bad_naive_reversed::rfind                          ... bench:         696 ns/iter (+/- 216) = 106 MB/s
test bad_naive_reversed::tw_contains                    ... bench:         263 ns/iter (+/- 22) = 281 MB/s
test bad_naive_reversed::tw_contains_rev                ... bench:         507 ns/iter (+/- 341) = 145 MB/s
test bad_naive_reversed::tw_paper                       ... bench:         588 ns/iter (+/- 56) = 125 MB/s
test pathological_aa_100k::find                         ... bench:     943,632 ns/iter (+/- 16,813) = 105 MB/s
test pathological_aa_100k::rfind                        ... bench:   1,004,048 ns/iter (+/- 9,096) = 99 MB/s
test pathological_aa_100k::tw_contains                  ... bench:       3,386 ns/iter (+/- 49) = 29533 MB/s
test pathological_aa_100k::tw_contains_rev              ... bench:       3,764 ns/iter (+/- 102) = 26567 MB/s
test pathological_aa_100k::tw_paper                     ... bench:     706,501 ns/iter (+/- 1,893) = 141 MB/s
test pathological_aab_100k::find                        ... bench:   3,708,442 ns/iter (+/- 38,586) = 80 MB/s
test pathological_aab_100k::rfind                       ... bench:   3,820,770 ns/iter (+/- 76,434) = 78 MB/s
test pathological_aab_100k::tw_contains                 ... bench:   1,004,961 ns/iter (+/- 66,086) = 298 MB/s
test pathological_aab_100k::tw_contains_rev             ... bench:   1,063,171 ns/iter (+/- 278,171) = 282 MB/s
test pathological_aab_100k::tw_paper                    ... bench:   2,236,639 ns/iter (+/- 100,281) = 134 MB/s
test pathological_two_way_10k::find                     ... bench:     279,777 ns/iter (+/- 1,326) = 107 MB/s
test pathological_two_way_10k::rfind                    ... bench:     297,624 ns/iter (+/- 12,199) = 100 MB/s
test pathological_two_way_10k::tw_contains              ... bench:     114,635 ns/iter (+/- 28,467) = 261 MB/s
test pathological_two_way_10k::tw_contains_rev          ... bench:       2,961 ns/iter (+/- 22) = 10131 MB/s
test pathological_two_way_10k::tw_paper                 ... bench:     223,103 ns/iter (+/- 8,931) = 134 MB/s
test pathological_two_way_20k::find                     ... bench:     937,518 ns/iter (+/- 4,423) = 106 MB/s
test pathological_two_way_20k::rfind                    ... bench:     997,079 ns/iter (+/- 4,680) = 100 MB/s
test pathological_two_way_20k::tw_contains              ... bench:     362,096 ns/iter (+/- 34,481) = 276 MB/s
test pathological_two_way_20k::tw_contains_rev          ... bench:       4,975 ns/iter (+/- 130) = 20100 MB/s
test pathological_two_way_20k::tw_paper                 ... bench:     729,971 ns/iter (+/- 17,161) = 136 MB/s
test pathological_two_way_20k_reversed::find            ... bench:     679,464 ns/iter (+/- 3,195) = 88 MB/s
test pathological_two_way_20k_reversed::rfind           ... bench:     706,299 ns/iter (+/- 5,581) = 84 MB/s
test pathological_two_way_20k_reversed::tw_contains     ... bench:       3,209 ns/iter (+/- 33) = 18697 MB/s
test pathological_two_way_20k_reversed::tw_contains_rev ... bench:     179,106 ns/iter (+/- 18,190) = 334 MB/s
test pathological_two_way_20k_reversed::tw_paper        ... bench:     447,336 ns/iter (+/- 20,443) = 134 MB/s
test short_long::find                                   ... bench:      28,582 ns/iter (+/- 4,112) = 89 MB/s
test short_long::rfind                                  ... bench:      30,111 ns/iter (+/- 432) = 84 MB/s
test short_long::tw_contains                            ... bench:       2,398 ns/iter (+/- 665) = 1063 MB/s
test short_long::tw_contains_rev                        ... bench:       2,233 ns/iter (+/- 78) = 1142 MB/s
test short_long::tw_paper                               ... bench:      15,741 ns/iter (+/- 431) = 162 MB/s
test short_short::find                                  ... bench:         201 ns/iter (+/- 3) = 278 MB/s
test short_short::rfind                                 ... bench:         402 ns/iter (+/- 28) = 139 MB/s
test short_short::tw_contains                           ... bench:         102 ns/iter (+/- 6) = 549 MB/s
test short_short::tw_contains_rev                       ... bench:         116 ns/iter (+/- 7) = 482 MB/s
test short_short::tw_paper                              ... bench:         419 ns/iter (+/- 6) = 133 MB/s

gereeter · 2015-06-15T23:07:31Z

src/libcore/str/pattern.rs

+        'search: loop {
+            // Check that we have room to search in
+            if self.position + needle.len() > haystack.len() {
+                if start == haystack.len() {


Should this check be done outside of the loop? We will only return Done on the first iteration, if ever.

There is no real loop -- it's used to return back to this point. So this if check can be visited multiple times per call to next.

hm I see now, that's a good point. it evolved into this.

bluss · 2015-06-16T10:11:36Z

Thanks for your extensive review @gereeter! I've pushed a commit to address some of this, with some cleanup. I also realized I could move the break for Reject until after the fast reject by the byteset. This improves the benchmarks for the forward search uniformly.

bluss · 2015-06-16T10:23:55Z

The benchmark list was very long, so here's just some improvements after the later Rejects

commit 1:
test pathological_aa_100k::tw_contains                  ... bench:      16,289 ns/iter (+/- 18) = 6139 MB/s
test pathological_aa_100k::tw_contains_rev              ... bench:       7,196 ns/iter (+/- 197) = 13896 MB/s
test short_long::tw_contains                            ... bench:       7,733 ns/iter (+/- 100) = 329 MB/s
test short_long::tw_contains_rev                        ... bench:       5,910 ns/iter (+/- 41) = 431 MB/s
test short_short::tw_contains                           ... bench:         164 ns/iter (+/- 0) = 341 MB/s
test short_short::tw_contains_rev                       ... bench:         169 ns/iter (+/- 1) = 331 MB/s

commit 1 + 2:
test pathological_aa_100k::tw_contains                  ... bench:       3,533 ns/iter (+/- 420) = 28304 MB/s
test pathological_aa_100k::tw_contains_rev              ... bench:       3,840 ns/iter (+/- 252) = 26041 MB/s
test short_long::tw_contains                            ... bench:       3,322 ns/iter (+/- 18) = 767 MB/s
test short_long::tw_contains_rev                        ... bench:       3,343 ns/iter (+/- 44) = 763 MB/s
test short_short::tw_contains                           ... bench:         113 ns/iter (+/- 2) = 495 MB/s
test short_short::tw_contains_rev                       ... bench:         124 ns/iter (+/- 2) = 451 MB/s

bluss · 2015-06-16T12:16:46Z

Now I feel good about the reverse algorithm, thanks @gereeter. I've split out the add-on commits to make it easier to review hopefully, so I'll squash before merge later. Edited: Now rebased to just two commits.

At this point it should be algorithmically safe and sound. Short of switching algorithm, I guess followup changes can focus on performance optimizations. Sad to say, the loops are indexed, not using iterators, and they perform better the way they are than using .zip with two slice iterators.

~~I've corrected & edited the main PR message a bit, I will incorporate those in the commit log when I rebase.~~ Done.

gereeter · 2015-06-16T15:52:46Z

src/libcore/str/pattern.rs

+
+            // We have found a match!
+            let match_pos = self.end - needle.len();
+            self.end -= needle.len(); // add self.period for all matches


To update this comment for next_back and make it more clear, maybe this should read "subtract self.period instead of needle.len() for overlapping matches" (similarly, the next case could be changed, but with add, not subtract)

I'll update this

gereeter · 2015-06-16T16:37:34Z

This looks really good! Short of doing word-level things, which is messy and requires unsafe and should probably be in a separate PR, I'm pretty much out of ideas on how to speed this up. I might add more docs and change some things stylistically, but that can be done later, and all the content here looks good. I think this should be merged, assuming the change to the Pattern API is cleared.

bluss · 2015-06-16T18:29:46Z

@gereeter I found a paper on that kind of extension to this algorithm http://drops.dagstuhl.de/opus/volltexte/2011/3355/

gereeter · 2015-06-16T19:10:37Z

That is the paper I had in mind when I made my comment - I saw your comment on #25483.

bluss · 2015-06-17T10:58:22Z

Ok, the twoway search can be faster if instead of emitting Rejects, we just search to the next match. Let the StrSearcher handle emitting the implied intermediate Rejects. I'm going to incorporate that now, because it means we don't have to change the pattern API's specification.

I found this by looking at what @BurntSushi's regex does.

bluss · 2015-06-17T12:35:53Z

Results of moving reject handling out of the algorithm, that gave it some air to optimize the loop to find a match better. The reverse case is sped up the same way, showing results just for the forward case.

before:
test bad_naive::tw_contains                             ... bench:         796 ns/iter (+/- 191) = 92 MB/s
test bad_naive_reversed::tw_contains                    ... bench:         297 ns/iter (+/- 9) = 249 MB/s
test pathological_aa_100k::tw_contains                  ... bench:       3,512 ns/iter (+/- 582) = 28473 MB/s
test pathological_aab_100k::tw_contains                 ... bench:   1,951,935 ns/iter (+/- 57,089) = 153 MB/s
test pathological_two_way_10k::tw_contains              ... bench:     241,923 ns/iter (+/- 4,562) = 123 MB/s
test pathological_two_way_20k::tw_contains              ... bench:     749,007 ns/iter (+/- 39,225) = 133 MB/s
test pathological_two_way_20k_reversed::tw_contains     ... bench:       3,216 ns/iter (+/- 55) = 18656 MB/s
test short_long::tw_contains                            ... bench:       3,338 ns/iter (+/- 16) = 764 MB/s
test short_short::tw_contains                           ... bench:         112 ns/iter (+/- 1) = 499 MB/s

after:
test bad_naive::tw_contains                             ... bench:         418 ns/iter (+/- 201) = 177 MB/s
test bad_naive_reversed::tw_contains                    ... bench:         263 ns/iter (+/- 22) = 281 MB/s
test pathological_aa_100k::tw_contains                  ... bench:       3,386 ns/iter (+/- 49) = 29533 MB/s
test pathological_aab_100k::tw_contains                 ... bench:   1,004,961 ns/iter (+/- 66,086) = 298 MB/s
test pathological_two_way_10k::tw_contains              ... bench:     114,635 ns/iter (+/- 28,467) = 261 MB/s
test pathological_two_way_20k::tw_contains              ... bench:     362,096 ns/iter (+/- 34,481) = 276 MB/s
test pathological_two_way_20k_reversed::tw_contains     ... bench:       3,209 ns/iter (+/- 33) = 18697 MB/s
test short_long::tw_contains                            ... bench:       2,398 ns/iter (+/- 665) = 1063 MB/s
test short_short::tw_contains                           ... bench:         102 ns/iter (+/- 6) = 549 MB/s

bluss · 2015-06-17T12:37:23Z

Updated to add two new commits. Will squash to one commit after review is done, and I'll update the commit logs with the fact that we don't need to change the Pattern API.

gereeter · 2015-06-17T14:25:59Z

First, that improvement is impressive enough that I think we should keep the new coalesing of Rejects even if it has some negative consequences.

Second, it seems that the primary users who benifit from smaller Rejects are is_prefix_of and is_suffix_of, which should be overridden to not use two way searching at all anyway.

bluss · 2015-06-17T15:36:08Z

I just looked at benchmarks for is_prefix_of, and even before the latest Reject change, the results were 3 to 220 times in the old naive string search's favour.

We need to implement all the trait's methods in the long run, and yes, use something else (memcmp-like) for the prefix and suffix versions.

Kimundi · 2015-06-17T16:36:53Z

@bluss: Yeah, the old impls never got optimized.
As for uses, I'm specifically referring to the .trim_...() methods: They skip forward until the point where the first Reject appears, and then stop. Example:

"abababcccccccccccccccccccccccccccccccccab".trim_left_matches("ab")
   => "cccccccccccccccccccccccccccccccccab"

If the string searcher where to be written with only Match instead, then the algorithm would first scan through the whole string until the final "ab" just to emit the start position of the Reject.

Having the duplication is unfortunate, but I don't see a better way apart from ensuring the generic codegen will optimize out the right cases on its own.

bluss · 2015-06-17T16:42:27Z

I just addressed is_prefix_of / is_suffix_of, so they are a bit faster with this change than the naive impl they replace.

Finding the first reject could be done simpler too, we could special case just that part maybe.

Kimundi · 2015-06-17T16:57:01Z

@bluss: It doesn't have to match the naive impl exactly. It can split the Rejects up at arbitrary points if it helps.

bluss · 2015-06-18T11:07:31Z

I've experimented back and forth (macros, closures for specialization) and it looks like duplicating the forward and backward searches is the best unfortunately.

However, StrSearcher is a very specific algorithm for finding matches, and it is a suboptimal choice for reject-only search. In that case naive search is always better and any time we spend precomputing needle properties is wasted. For that reason, it would be best if this was given its own method on the Pattern API that circumvents the StrSearcher. For example fn maximal_prefix_of(haystack: &str) -> Option<usize> and so on.

I'm away for the weekend, so enjoy midsummer everyone!

bors · 2015-06-18T21:36:40Z

☔ The latest upstream changes (presumably #26192) made this pull request unmergeable. Please resolve the merge conflicts.

@gereeter

To improve our substring search performance, revive the two way searcher and adapt it to the Pattern API. Fixes rust-lang#25483, a performance bug: that particular case now completes faster in optimized rust than in ruby (but they share the same order of magnitude). Much thanks to @gereeter who helped me understand the reverse case better and wrote the comment explaining `next_back` in the code. I had quickcheck to fuzz test forward and reverse searching thoroughly. The two way searcher implements both forward and reverse search, but not double ended search. The forward and reverse parts of the two way searcher are completely independent. The two way searcher algorithm has very small, constant space overhead, requiring no dynamic allocation. Our implementation is relatively fast, especially due to the `byteset` addition to the algorithm, which speeds up many no-match cases. A bad case for the two way algorithm is: ``` let haystack = (0..10_000).map(|_| "dac").collect::<String>(); let needle = (0..100).map(|_| "bac").collect::<String>()); ``` For this particular case, two way is not much faster than the naive implementation it replaces.

Use a trait to be able to implement both the fast search that skips to each match, and the slower search that emits `Reject` intervals regularly. The latter is important for uses of `next_reject`.

bluss · 2015-06-21T18:11:13Z

Rebased.

Updated with:

Use a trait to split the algorithm in two cases without code duplication. The cases are the MatchOnly strategy, and the RejectAndMatch strategy. This way we both offer the quickest possible next_match() impl and a general next() that emits Rejects too.
The RejectAndMatch case is less streamlined for now.
The RejectAndMatch case manually steps to the next character boundary so that the Pattern API can stay as it is.

This is needed to not drop performance, after the trait-based changes. Force separate versions of the next method to be generated for the short and long period cases.

bluss · 2015-06-24T16:44:05Z

Here are some updated benchmarks with caveats.

rustc nightly is updated, llvm is updated, so things might have changed
I can't seem to get cpu scaling to fall into the same gear again that produced the previous pretty consistent results
So: Don't compare these literally to the previous benchmarks.

The cases are:

find: current forward search
rfind: current forward search
tw_contains: new forward search in this PR
tw_contains_rev: new forward search in this PR

running 36 tests
test bad_naive::find                                    ... bench:       1,386 ns/iter (+/- 121) = 53 MB/s
test bad_naive::rfind                                   ... bench:       1,463 ns/iter (+/- 19) = 50 MB/s
test bad_naive::tw_contains                             ... bench:         470 ns/iter (+/- 7) = 157 MB/s
test bad_naive::tw_contains_rev                         ... bench:         314 ns/iter (+/- 18) = 235 MB/s
test bad_naive_reversed::find                           ... bench:         533 ns/iter (+/- 90) = 138 MB/s
test bad_naive_reversed::rfind                          ... bench:         570 ns/iter (+/- 39) = 129 MB/s
test bad_naive_reversed::tw_contains                    ... bench:         263 ns/iter (+/- 13) = 281 MB/s
test bad_naive_reversed::tw_contains_rev                ... bench:         434 ns/iter (+/- 200) = 170 MB/s
test pathological_aa_100k::find                         ... bench:     765,670 ns/iter (+/- 51,184) = 130 MB/s
test pathological_aa_100k::rfind                        ... bench:     823,406 ns/iter (+/- 1,706) = 121 MB/s
test pathological_aa_100k::tw_contains                  ... bench:       3,893 ns/iter (+/- 32) = 25687 MB/s
test pathological_aa_100k::tw_contains_rev              ... bench:       3,848 ns/iter (+/- 97) = 25987 MB/s
test pathological_aab_100k::find                        ... bench:   2,704,123 ns/iter (+/- 99,663) = 110 MB/s
test pathological_aab_100k::rfind                       ... bench:   2,824,903 ns/iter (+/- 163,534) = 105 MB/s
test pathological_aab_100k::tw_contains                 ... bench:   1,149,895 ns/iter (+/- 73,320) = 260 MB/s
test pathological_aab_100k::tw_contains_rev             ... bench:   1,120,750 ns/iter (+/- 136,251) = 267 MB/s
test pathological_two_way_10k::find                     ... bench:     227,447 ns/iter (+/- 157) = 131 MB/s
test pathological_two_way_10k::rfind                    ... bench:     244,858 ns/iter (+/- 14,582) = 122 MB/s
test pathological_two_way_10k::tw_contains              ... bench:     136,919 ns/iter (+/- 13,430) = 219 MB/s
test pathological_two_way_10k::tw_contains_rev          ... bench:       3,208 ns/iter (+/- 18) = 9351 MB/s
test pathological_two_way_20k::find                     ... bench:     762,829 ns/iter (+/- 1,937) = 131 MB/s
test pathological_two_way_20k::rfind                    ... bench:     820,271 ns/iter (+/- 780) = 121 MB/s
test pathological_two_way_20k::tw_contains              ... bench:     426,550 ns/iter (+/- 35,134) = 234 MB/s
test pathological_two_way_20k::tw_contains_rev          ... bench:       5,263 ns/iter (+/- 10) = 19000 MB/s
test pathological_two_way_20k_reversed::find            ... bench:     505,162 ns/iter (+/- 44,729) = 118 MB/s
test pathological_two_way_20k_reversed::rfind           ... bench:     538,927 ns/iter (+/- 30,293) = 111 MB/s
test pathological_two_way_20k_reversed::tw_contains     ... bench:       3,564 ns/iter (+/- 16) = 16834 MB/s
test pathological_two_way_20k_reversed::tw_contains_rev ... bench:     179,662 ns/iter (+/- 3,419) = 333 MB/s
test short_long::find                                   ... bench:      23,059 ns/iter (+/- 118) = 110 MB/s
test short_long::rfind                                  ... bench:      24,691 ns/iter (+/- 75) = 103 MB/s
test short_long::tw_contains                            ... bench:       2,126 ns/iter (+/- 4) = 1199 MB/s
test short_long::tw_contains_rev                        ... bench:       2,040 ns/iter (+/- 16) = 1250 MB/s
test short_short::find                                  ... bench:         160 ns/iter (+/- 1) = 350 MB/s
test short_short::rfind                                 ... bench:         311 ns/iter (+/- 1) = 180 MB/s
test short_short::tw_contains                           ... bench:          81 ns/iter (+/- 58) = 691 MB/s
test short_short::tw_contains_rev                       ... bench:          99 ns/iter (+/- 0) = 565 MB/s

bluss · 2015-06-24T16:47:22Z

I'll tentatively nominate this for the beta because I had hoped I would get this in before we branched for 1.2. It's a big performance improvement, no other perks.

gereeter · 2015-06-26T16:26:45Z

I just realized that I never actually said this, so LGTM.

bluss · 2015-06-26T22:12:58Z

@gereeter Thank you, you've provided a lot of valuable help & feedback!

aturon · 2015-06-30T15:26:04Z

@bors: r+

Thanks for doing this!

bors · 2015-06-30T15:26:06Z

📌 Commit 274bb24 has been approved by aturon

bors · 2015-06-30T18:09:52Z

⌛ Testing commit 274bb24 with merge 7fc0675...

@gereeter

Update substring search to use the Two Way algorithm To improve our substring search performance, revive the two way searcher and adapt it to the Pattern API. Fixes #25483, a performance bug: that particular case now completes faster in optimized rust than in ruby (but they share the same order of magnitude). Many thanks to @gereeter who helped me understand the reverse case better and wrote the comment explaining `next_back` in the code. I had quickcheck to fuzz test forward and reverse searching thoroughly. The two way searcher implements both forward and reverse search, but not double ended search. The forward and reverse parts of the two way searcher are completely independent. The two way searcher algorithm has very small, constant space overhead, requiring no dynamic allocation. Our implementation is relatively fast, especially due to the `byteset` addition to the algorithm, which speeds up many no-match cases. A bad case for the two way algorithm is: ``` let haystack = (0..10_000).map(|_| "dac").collect::<String>(); let needle = (0..100).map(|_| "bac").collect::<String>()); ``` For this particular case, two way is not much faster than the naive implementation it replaces.

bors · 2015-06-30T20:02:12Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt

alexcrichton · 2015-06-30T20:59:23Z

triage: not accepting for a beta backport

rust-highfive assigned aturon Jun 15, 2015

gereeter reviewed Jun 15, 2015
View reviewed changes

bluss force-pushed the two-way branch from c3d3be7 to 2e97929 Compare June 16, 2015 10:14

bluss changed the title ~~core: Update StrSearch to use the Two Way searcher~~ core: Update StrSearcher to use the Two Way searcher Jun 16, 2015

bluss changed the title ~~core: Update StrSearcher to use the Two Way searcher~~ core: Update StrSearcher to use the Two Way algorithm Jun 16, 2015

gereeter reviewed Jun 16, 2015
View reviewed changes

bluss force-pushed the two-way branch 3 times, most recently from 167bad3 to 00e97bf Compare June 16, 2015 23:09

Ulrik Sverdrup added 3 commits June 21, 2015 19:58

StrSearcher: Specialize is_prefix_of/is_suffix_of for &str

a6dd203

StrSearcher: Use trait to specialize two way algorithm by case

71006bd

Use a trait to be able to implement both the fast search that skips to each match, and the slower search that emits `Reject` intervals regularly. The latter is important for uses of `next_reject`.

bluss force-pushed the two-way branch from d495342 to 71006bd Compare June 21, 2015 18:05

bluss changed the title ~~core: Update StrSearcher to use the Two Way algorithm~~ Update substring search to use the Two Way algorithm Jun 21, 2015

bluss added the relnotes Marks issues that should be documented in the release notes of the next release. label Jun 21, 2015

StrSearcher: Explicitly separate the long and short cases

274bb24

This is needed to not drop performance, after the trait-based changes. Force separate versions of the next method to be generated for the short and long period cases.

bluss added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 24, 2015

bors merged commit 274bb24 into rust-lang:master Jun 30, 2015

bluss deleted the two-way branch June 30, 2015 20:12

alexcrichton removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 30, 2015

bluss mentioned this pull request Aug 2, 2015

Implement the complete reverse case for the two way algorithm #27474

Merged

Update substring search to use the Two Way algorithm #26327

Update substring search to use the Two Way algorithm #26327

Uh oh!

Conversation

bluss commented Jun 15, 2015

Uh oh!

rust-highfive commented Jun 15, 2015

Uh oh!

bluss commented Jun 15, 2015

Uh oh!

bluss commented Jun 15, 2015

Uh oh!

gereeter Jun 15, 2015

Choose a reason for hiding this comment

Uh oh!

bluss Jun 16, 2015

Choose a reason for hiding this comment

Uh oh!

bluss Jun 16, 2015

Choose a reason for hiding this comment

Uh oh!

bluss commented Jun 16, 2015

Uh oh!

bluss commented Jun 16, 2015

Uh oh!

bluss commented Jun 16, 2015

Uh oh!

gereeter Jun 16, 2015

Choose a reason for hiding this comment

Uh oh!

bluss Jun 16, 2015

Choose a reason for hiding this comment

Uh oh!

gereeter commented Jun 16, 2015

Uh oh!

bluss commented Jun 16, 2015

Uh oh!

gereeter commented Jun 16, 2015

Uh oh!

bluss commented Jun 17, 2015

Uh oh!

bluss commented Jun 17, 2015

Uh oh!

bluss commented Jun 17, 2015

Uh oh!

gereeter commented Jun 17, 2015

Uh oh!

bluss commented Jun 17, 2015

Uh oh!

Kimundi commented Jun 17, 2015

Uh oh!

bluss commented Jun 17, 2015

Uh oh!

Kimundi commented Jun 17, 2015

Uh oh!

bluss commented Jun 18, 2015

Uh oh!

bors commented Jun 18, 2015

Uh oh!

bluss commented Jun 21, 2015

Uh oh!

bluss commented Jun 24, 2015

Uh oh!

bluss commented Jun 24, 2015

Uh oh!

gereeter commented Jun 26, 2015

Uh oh!

bluss commented Jun 26, 2015

Uh oh!

aturon commented Jun 30, 2015

Uh oh!

bors commented Jun 30, 2015

Uh oh!

bors commented Jun 30, 2015

Uh oh!

bors commented Jun 30, 2015

Uh oh!

alexcrichton commented Jun 30, 2015

Uh oh!

Uh oh!