Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Stabilize more of the char module #23126

Merged
merged 1 commit into from
Mar 11, 2015

Conversation

alexcrichton
Copy link
Member

This commit performs another pass over the std::char module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the std-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

  • char::from_digit is now stable
  • CharExt::is_digit is now stable
  • CharExt::to_digit is now stable
  • CharExt::to_{lower,upper}case are now stable after being modified to return
    an iterator over characters. While the implementation today has not changed
    this should allow us to implement the full set of case conversions in unicode
    where some characters can map to multiple when doing an upper or lower case
    mapping.
  • StrExt::to_{lower,upper}case was added as unstable for a convenience of not
    having to worry about characters expanding to more characters when you just
    want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
CharExt::to_{upper,lower}case methods. Code can be updated to use functions
like flat_map or collect to handle the difference.

[breaking-change]

Closes #20333

@rust-highfive
Copy link
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

r? @aturon
r? @SimonSapin

@aturon aturon mentioned this pull request Mar 6, 2015
91 tasks
@alexcrichton alexcrichton force-pushed the char-third-pass branch 2 times, most recently from b0862e5 to 6d6cbbd Compare March 6, 2015 19:46
@SimonSapin
Copy link
Contributor

CharExt::to_{lower,upper}case are now stable after being modified to return an iterator over characters. While the implementation today has not changed this should allow us to implement the full set of case conversions in unicode where some characters can map to multiple when doing an upper or lower case mapping.

Does "stable" mean we can still change their behavior after 1.0?

@alexcrichton
Copy link
Member Author

Does "stable" mean we can still change their behavior after 1.0?

I believe that we reserve the right to update the unicode standard we're using, which can add new case mappings for existing characters. Along those lines, I see actually parsing the extra data and returning many characters as a similar enhancement.

All in all yes, I believe that this is a change we can make after 1.0

///
/// If the buffer is not large enough, nothing will be written into it
/// and a `None` will be returned.
/// In both of these examples, 'ß' takes one byte to encode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one u16

@alexcrichton
Copy link
Member Author

Pushed some updates, thanks for the comments @tbu- and @SimonSapin!

@SimonSapin
Copy link
Contributor

LGTM.

@aturon
Copy link
Member

aturon commented Mar 10, 2015

@bors: r+ d33b308

@aturon
Copy link
Member

aturon commented Mar 10, 2015

Note: this closes #20333

@SimonSapin
Copy link
Contributor

Note: this closes #20333

Returning Iterator<Item=char> is a better (more idiomatic) solution than the Option<&'static str> that I suggested in that bug. (Maybe String::to_{upper,lower}case can gain some performance by not re-encoding to UTF-8 (though that remains to be proven) but it can still do so internally without affecting the API of the char methods.)

@aturon
Copy link
Member

aturon commented Mar 10, 2015

@SimonSapin Oh, I agree, but I still take this to be addressing the underlying issue of that bug.

@SimonSapin
Copy link
Contributor

My message was a long way of saying “+1” :)

@bors
Copy link
Contributor

bors commented Mar 10, 2015

⌛ Testing commit d33b308 with merge e840e74...

@bors
Copy link
Contributor

bors commented Mar 10, 2015

💔 Test failed - auto-linux-32-opt

@alexcrichton
Copy link
Member Author

@bors: r+

@bors
Copy link
Contributor

bors commented Mar 10, 2015

@bors r=alexcrichton e74a79a

@alexcrichton
Copy link
Member Author

@bors: r=aturon

@bors
Copy link
Contributor

bors commented Mar 10, 2015

@bors r=aturon e74a79a

@bors
Copy link
Contributor

bors commented Mar 10, 2015

⌛ Testing commit e74a79a with merge fde1502...

@bors
Copy link
Contributor

bors commented Mar 10, 2015

💔 Test failed - auto-win-32-opt

@alexcrichton
Copy link
Member Author

@bors: r=aturon 3bcd209

This commit performs another pass over the `std::char` module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the `std`-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

* `char::from_digit` is now stable
* `CharExt::is_digit` is now stable
* `CharExt::to_digit` is now stable
* `CharExt::to_{lower,upper}case` are now stable after being modified to return
  an iterator over characters. While the implementation today has not changed
  this should allow us to implement the full set of case conversions in unicode
  where some characters can map to multiple when doing an upper or lower case
  mapping.
* `StrExt::to_{lower,upper}case` was added as unstable for a convenience of not
  having to worry about characters expanding to more characters when you just
  want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
`CharExt::to_{upper,lower}case` methods. Code can be updated to use functions
like `flat_map` or `collect` to handle the difference.

[breaking-change]
@alexcrichton
Copy link
Member Author

@bors: r=aturon 0f6a0b5

bors added a commit that referenced this pull request Mar 10, 2015
This commit performs another pass over the `std::char` module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the `std`-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

* `char::from_digit` is now stable
* `CharExt::is_digit` is now stable
* `CharExt::to_digit` is now stable
* `CharExt::to_{lower,upper}case` are now stable after being modified to return
  an iterator over characters. While the implementation today has not changed
  this should allow us to implement the full set of case conversions in unicode
  where some characters can map to multiple when doing an upper or lower case
  mapping.
* `StrExt::to_{lower,upper}case` was added as unstable for a convenience of not
  having to worry about characters expanding to more characters when you just
  want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
`CharExt::to_{upper,lower}case` methods. Code can be updated to use functions
like `flat_map` or `collect` to handle the difference.

[breaking-change]

Closes #20333
@bors
Copy link
Contributor

bors commented Mar 10, 2015

⌛ Testing commit 0f6a0b5 with merge cfea8ec...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Char::to_{lower,upper}case should return Option<&'static str> instead of char
6 participants