-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sensibly handle invalid strings across FFI #2534
Conversation
@@ -609,20 +609,25 @@ impl AnyCalendarKind { | |||
/// | |||
/// Returns None if the calendar is unknown | |||
pub fn from_bcp47_string(x: &str) -> Option<Self> { | |||
Self::from_bcp47_bytes(x.as_bytes()) | |||
} | |||
/// Construct from a BCP-47 byte string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and everywhere: Clarify the expected encoding, i.e. not UTF-16.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our plan is to expect std::string_view
is always UTF8 (though sometimes allowed to be ill-formed).
I don't want to doc the encoding that way because in JS there's no choice but using UTF16.
components/plurals/src/operands.rs
Outdated
if input.is_empty() { | ||
return Err(OperandsError::Empty); | ||
} | ||
|
||
let abs_str = input.strip_prefix('-').unwrap_or(input); | ||
let abs_str = input.strip_prefix(&[b'-'; 1]).unwrap_or(input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: b"-"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to hit the fixed-size array codepath in a guaranteed way, though inlining probably means it doesn't matter
LGTM, but noting @sffc's previous terminology preference of well-formed/ill-formed over valid/invalid and various docs and identifiers in this PR using the latter at present. Thank you. |
Good point, yeah. I couldn't come up with what to name things, I'll let shane come up with a naming suggestion and uniformly appyl that everywhere. |
"potentially ill-formed UTF-8" is the most precise phrase to describe what we're working with based on my understanding from discussions with @markusicu. |
I've done some renaming and redoccing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First comments; still working
This implements the 1.0 plan laid out in #2520 (comment) for types that exist so far. The only thing missing is Normalizer, which doesn't have FFI yet (I'm working on that).
cc @hsivonen