-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant] Introduce new type over &str for ShortString #7718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
043f082
to
a06720b
Compare
fea5ecd
to
df42f00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me -- thank you @friendlymatthew . I left a few comments but we can do that as follow on PRs if you prefer
Note this is very likely to conflict with
df42f00
to
a6ad4e5
Compare
FYI @scovich perhaps you could offer your opinion on this PR as well |
e713fd4
to
2a62499
Compare
parquet-variant/src/variant.rs
Outdated
/// | ||
/// This constructor verifies that `value` is shorter than or equal to `MAX_SHORT_STRING_SIZE` | ||
pub fn try_new(value: &'a str) -> Result<Self, ArrowError> { | ||
if value.len() > MAX_SHORT_STRING_SIZE { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a comment here that we are indeed supposed to check bytes and not characters, that's a common confusion with "string length"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a great idea -- maybe we can even name the constant MAX_SHORT_STRING_BYTES
to make it more self describing
dae45e6
to
09af8ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
impl<'a> From<ShortString<'a>> for &'a str { | ||
fn from(value: ShortString<'a>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit... unorthodox? Would impl Deref
be more traditional, to go with that impl AsRef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense
Thanks again @friendlymatthew and @scovich and @adriangb |
Which issue does this PR close?
This commit introduces
ShortString
, a newtype that wraps around&str
that enforces a maximum length constraint. This also allows us to perform validation once and removes a superfluous validation check inappend_value
.The now-superflous validation check was needed since users could construct
Variant::ShortString
s directly, without doing input validation. This means you can have a short string variant which actually contains a string that is no longer than 63 bytes.But since we enforce this check upon construction, we can directly match against
Variant::String
andVariant::ShortString
arms with their respective appending functions (append_string
andappend_short_string
).