Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds SpanSplitEnumerator #104534

Merged

Conversation

bbartels
Copy link
Contributor

@bbartels bbartels commented Jul 7, 2024

Implements #934

TODO:

  • Add SplitAny tests
  • Add SearchValues tests
  • Add documentation for public API

IEquatable<T> type constraint is necessary due to the type constraint existing on the System.MemoryExtensions.IndexOf<T>() overloads and on SearchValues<T>.

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

1 similar comment
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jul 7, 2024
@stephentoub
Copy link
Member

@bbartels, please let us know when this is ready for review.

@bbartels
Copy link
Contributor Author

@bbartels, please let us know when this is ready for review.

Will do, will finish off over the weekend!

@stephentoub
Copy link
Member

Ok, thanks

@bbartels bbartels marked this pull request as ready for review July 15, 2024 01:19
@bbartels
Copy link
Contributor Author

@stephentoub ready for review 🙂

@stephentoub stephentoub self-assigned this Jul 15, 2024
@stephentoub
Copy link
Member

Thanks. Given the time pressure, I'm making changes in a commit locally which I'll push to your branch for you to explore.

@stephentoub
Copy link
Member

Done. @bbartels, please take a look. Most important changes:

  • Changed SplitAny to treat an empty char separators as do string.Split and the existing MemoryExtensions.SplitAny, where it means any Unicode whitespace
  • Reduced overheads in MoveNext by removing some unnecessary branches and Slice.

@stephentoub stephentoub added this to the 9.0.0 milestone Jul 15, 2024
@bbartels
Copy link
Contributor Author

bbartels commented Jul 15, 2024

Changed SplitAny to treat an empty char separators as do string.Split and the existing MemoryExtensions.SplitAny, where it means any Unicode whitespace

Was this discussed in api review? I have no recollection. I could see that being rather concerning behaviour special casing this for a single type.

@stephentoub
Copy link
Member

stephentoub commented Jul 15, 2024

Was this discussed in api review? I have no recollection. I could see that being rather concerning behaviour special casing this for a single type.

I believe it's implied by these APIs behaving the same as the other overloads. I think it's more concerning that you'd switch from strings to spans and your use silently behaves differently.

If we really don't like the discrepancy, we should change these to be specific to char rather than T. But then keep in mind that if we ever added T-based overloads, you'd still have the exact same difference in behavior when specifying char vs other types, binding to one overload with one behavior vs the other with the other. (Also various APIs for spans already special-case char, e.g. span.ToString).

@bbartels
Copy link
Contributor Author

Was this discussed in api review? I have no recollection. I could see that being rather concerning behaviour special casing this for a single type.

I believe it's implied by these APIs behaving the same as the other overloads. I think it's more concerning that you'd switch from strings to spans and your use silently behaves differently.

If we really don't like the discrepancy, we should change these to be specific to char rather than T. But then keep in mind that if we ever added T-based overloads, you'd still have the exact same difference in behavior when specifying char vs other types, binding to one overload with one behavior vs the other with the other. (Also various APIs for spans already special-case char, e.g. span.ToString).

Fair point, I guess there is realistically no way around just supporting this quirk in some way.
Thanks for the review, let me have another look over to see if there is anything to spot. Otherwise let's get this (finally) merged 😄

@bbartels
Copy link
Contributor Author

bbartels commented Jul 15, 2024

@stephentoub I've had another look into simplifying things a little. Came up with this: https://github.com/bbartels/runtime/pull/1/files. Thoughts?
Upside:

  • Less work per iteration
  • Smaller struct

Downside:

  • Could be considered slightly more convoluted
  • Call to SpanSplitEnumerator.Current will throw if MoveNext() has not been called yet.
    The throwing should be fine under the IEnumerator.Current contract:
    image

@stephentoub
Copy link
Member

Thoughts?

We could. I have a small preference for what's currently in this PR just because I think it's a bit easier to understand.

@bbartels
Copy link
Contributor Author

bbartels commented Jul 15, 2024

We could. I have a small preference for what's currently in this PR just because I think it's a bit easier to understand.

Alright, up to you, I trust your judgement! Otherwise lgtm from my perspective

@stephentoub stephentoub force-pushed the SpanSplitEnumeratorNewFinalThisTimeForSure branch from e537a6e to d620771 Compare July 16, 2024 03:23
@stephentoub stephentoub requested a review from a team July 16, 2024 03:24
@stephentoub stephentoub merged commit c19c03e into dotnet:main Jul 18, 2024
139 of 146 checks passed
@stephentoub
Copy link
Member

Thanks, @bbartels.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Memory community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants