-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AuxiliaryKey to DataLocale #3872
Conversation
I think this is ready modulo the decision on the internal data type for the auxiliary key. Did I miss anything? |
provider/core/src/request.rs
Outdated
pub struct AuxiliaryKey { | ||
// DISCUSS: SmallStr? TinyStrAuto? | ||
// DISCUSS: Make this a dynamically sized type so references can be taken? | ||
value: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be Cow<str>
to allow const-constructing with a data_locale!
macro. Also resizing doesn't seem to be use case, so there's no need for separate length and capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that Cow<str>
uses a String
in the owned variant.
How about something like
enum BoxOrStatic {
Box(Box<str>),
Static(&'static str),
Tiny(TinyAsciiStr<23>),
}
The size of that is 24 bytes, which is the same as a String
.
The main downside is that code which gets a &str
from that thing is going to need to do more branching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah it does, I somehow thought it'd be Box<str>
. That enum looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Manishearth Opinions on the enum before I spend time implementing it? I'm a little concerned about the branching and resulting code size but I haven't measured it yet.
provider/core/src/request.rs
Outdated
@@ -134,6 +139,10 @@ impl Writeable for DataLocale { | |||
sink.write_str("-u-")?; | |||
self.keywords.write_to(sink)?; | |||
} | |||
if let Some(aux) = self.aux.as_ref() { | |||
sink.write_char('$')?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$
has special meaning in string literals in languages we want to target, e.g. Dart. Is there a problem with /
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can probably make /
work but it doesn't automatically work in the sorting function strict_cmp
(the hot path for data lookup). It's easier to implement $
because '$' < '-'
. I could also use #
which has the same property.
If we decide that /
is what we really want, I'll make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the restriction is separator < -
, there are !#$%&'*+,
. My ranking of these is
+
or&
are great because the auxiliary key adds something to the data locale#
is a nice neutral separator!
,*
could be used as separators, but they're usually numeric operator%
and$
have the escaping issue'
and,
are too typographic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other options above -
would be @
, which we already use in data keys, and ;
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented +
as decided in #3632 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. Same question as Rob about the separator
provider/core/src/request.rs
Outdated
/// a | ||
/// ); | ||
/// } | ||
/// ``` | ||
pub fn strict_cmp(&self, other: &[u8]) -> Ordering { | ||
// TODO: Make this work with aux keys | ||
let subtags = other.split(|b| *b == b'-'); | ||
let mut pipe_iter = other.split(|b| *b == b'$'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's put the aux key separator in a constant somewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by adding pub const fn AuxiliaryKey::separator()
Conflicts: provider/datagen/src/lib.rs
@robertbastian PTAL at the new multi-aux-key APIs. |
@@ -468,10 +468,46 @@ impl BakedExporter { | |||
let mut map = BTreeMap::new(); | |||
let mut statics = Vec::new(); | |||
|
|||
#[derive(Copy, Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like overkill compared to
&first_locale.to_ascii_uppercase().replace('-', "_").replace(AuxiliaryKey::separator() as char, "__")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acknowledged but my version saves an allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the complexity tradeoff here is not worth it for datagen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking of Either::<[char], [char, char]>
which is concise but we don't have an Either
dep
Fixes #3632