Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement nth, count, and last for EscapeUnicode #31049

Closed
wants to merge 10 commits into from

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented Jan 20, 2016

and align the implementations in EscapeUnicode and EscapeDefault

Part of #24214.

EDIT: corrected reference to issue.

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 20, 2016

AFAICT this should allow the last tick missing in #30520 to be set.

Some care has been taken in order to avoid code duplication (between next and nth impls) and to avoid a quadratic number of cases (by "fast seeking" based on the number of remaining items).

WARNING: this PR marks EscapeUnicode and EscapeDefault as ExactSizeIterator. The constraints for the trait implementations hold even before this PR, but I am not sure if this is something we want to guarantee/expose (I would love feedback on this, especially on what would be the appropriate way to handle stabilisation, if needed).

Additional tests are missing; I will take care of this ASAP.

Done,
RightBrace,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a micro-optimization (so that state_len is a single load instead of load+add).
I can make it more explicit and/or reference the comment in state_len or I can remove it if it is deemed undesirable.

The `offset` value was computed both in `next` and in `size_hint`;
computing it in a single place ensures consistency and makes it easier
to apply improvements. The value is now computed as soon as the
iterator is constructed. This means that the time to compute it is
spent immediately and cannot be avoided, but it also guarantees that
it is only spent once.
Instead of iteratively scanning the bits, use `leading_zeros`.
In rust-lang#28662, `size_hint` was made exact for `EscapeUnicode` and
`EscapeDefault`, but neither was marked as `ExactSizeIterator`.
Trivial implementation, as both are `ExactSizeIterator`s.

Part of rust-lang#24214.
Extract a function that updates the iterator state and returns the
result of an arbitrary step of iteration.

This implements the same logic as `next`, but it can be shared with
`nth`.
as a step from the appropriate state.

Part of rust-lang#24214.
This makes it easier to have a unique path for handling all of them.
by extracting a shared `step` function.
@alexcrichton
Copy link
Member

This seems pretty complicated for a ... pretty minor iterator, out of curiosity was this motivated beyond ensuring that the implementations are specialized for all iterators?

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 21, 2016

It is basically a continuation of #30624, mainly to finish #24214.

There are several changes because the PR includes some improvements to the current state of EscapeUnicode and EscapeDefault (faster computation of offset, code reuse/unification).

Single commits should be easier to understand, but I can also split it into multiple PRs if it is more convenient.

EDIT: wrong ref

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 28, 2016

Abandoned in favour of (multiple) simpler PRs.

@ranma42 ranma42 closed this Jan 28, 2016
bors added a commit that referenced this pull request Apr 20, 2016
Improve computation of offset in `EscapeUnicode`

Unify the computation of `offset` and use `leading_zeros` instead of manually scanning the bits.
This PR removes some duplicated code and makes it a little simpler .
The computation of `offset` is also faster, but it is unlikely to have an impact on actual code.

(split from #31049)
bors added a commit that referenced this pull request May 19, 2016
Implement `last` for `EscapeUnicode`

The implementation is quite trivial as the last character is always `'{'`.
As a side-effect it also improves the implementation of `last` for `EscapeUnicode`.

Part of #24214, split from #31049.

Maybe this (and the other changes that I will split from #31049) should wait for a test like `ed_iterator_specializations` to be added. Would it be sufficient to do the same for each possible escape length?
Manishearth added a commit to Manishearth/rust that referenced this pull request May 28, 2016
…richton

Implement `count` for `EscapeUnicode`

and cleanup the code for `count` for `EscapeDefault` (instead of repeating the `match` for `size_hint` and `count`).

This PR marks EscapeUnicode and EscapeDefault as ExactSizeIterator. The constraints for the trait implementations held even before this PR, but I am not sure if this is something we want to guarantee/expose (I would love feedback on this, especially on what would be the appropriate way to handle stabilisation, if needed).

Part of rust-lang#24214, split from rust-lang#31049.

The test for `count` was added in rust-lang#33103.
@ranma42 ranma42 deleted the escape-iters-take3 branch June 6, 2016 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants