Poor codegen for string search (i.e., `memchr`) with iterators

A colleague provided me with [this benchmark](https://quick-bench.com/q/uYm7Wboe6G5KSEsDEXJTtH5oX7M), which compares the following C++ and Rust functions, which are intended to be equivalent:

```cc
bool HasMoreThanTwoDashes(std::string_view sv) {
    return sv.find_first_of('-', sv.find_first_of('-')+1) != std::string_view::npos;
}
```

```rust
fn has_more_than_two_dashes(sl: &[u8]) -> bool {
    sl.iter().filter(|&&c| c == b'-').count() > 2
}
```

The C++ version runs about 20x faster in the microbenchmark. We haven't tried very hard to figure out why, but our suspicion is that Rust's implementation of `memchr` is worse than the one provided by the system that ran the benchmark. The C++ version also bails out early and the Rust version does not, but that appears not to matter because the dashes are at the far end of the strings.
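To help separate the early-exit difference from the `memchr` question, here is a minimal sketch of a Rust variant that stops scanning as soon as the answer is known. The function name `has_more_than_two_dashes_early_exit` is hypothetical and is not part of the linked benchmark. It keeps the `count() > 2` threshold of the Rust snippet above; note that the C++ snippet as written already returns `true` once a second dash is found, so the two thresholds differ by one.

```rust
/// Hypothetical early-exit variant (not from the benchmark).
/// Returns true if the slice contains more than two dashes.
fn has_more_than_two_dashes_early_exit(sl: &[u8]) -> bool {
    // `nth(2)` yields the third matching byte, if there is one, and the
    // lazy iterator stops scanning at that point instead of counting
    // every dash in the slice.
    sl.iter().filter(|&&c| c == b'-').nth(2).is_some()
}
```

If the dashes really do sit at the far end of the inputs, this variant should scan almost as many bytes as the `count()` version, so any remaining gap to the C++ version would more likely come from the search routine itself than from the missing early exit.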

We're not actually sure what CPU quick-bench ran this on, but I can run it on my Xeon (Ice Lake, I think) later. I think the difference in `memchr` implementations is worth investigating regardless.
