Description
Currently, in 0.1 of `bstr`, the primary way to use and manipulate byte strings is with the `BString` (owned, growable) and `BStr` (borrowed slice) types. However, a major alternative design to using explicit types is to define extension traits that add more methods to the `Vec<u8>` and `&[u8]` types.
Reflecting back, I don't think I quite gave the extension trait path a thorough review before going with the explicit `BString`/`BStr` types. In particular, I perceive a few key advantages to using explicit types:
1. Having distinct types provides some "semantic" meaning that the bytes should be treated as a string rather than just an arbitrary collection of bytes.
2. Having a convenient `Debug` representation that prioritizes the "stringness" of `BString`/`BStr` over the "slice of `u8`" representation shown for `Vec<u8>`/`&[u8]`. For example, `"abc"` instead of `[97, 98, 99]`.
3. As a riff on (2), there may be other traits that one wants to implement "specially" for byte strings as opposed to "slice of `u8`." Serde comes to mind here.
If (1) were the only benefit, I think I could be persuaded to drop that line of reasoning, although it does appeal to me aesthetically. However, in my view, (2) is a fairly significant benefit, and it's one of the most important ergonomic improvements that I look forward to whenever I bring `bstr` into one of my crates. Otherwise, I fairly consistently define my own helper functions for printing out byte strings when I don't have `bstr`, and it's honestly a pain, especially when `Vec<u8>`/`&[u8]` are part of some other type.
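To make the `Debug` difference concrete, here is a minimal sketch of the kind of hand-rolled helper described above. The `DebugBytes` name is hypothetical and is not part of `bstr`; it only uses `std::ascii::escape_default` from the standard library.

```rust
use std::ascii;
use std::fmt;

/// Hypothetical helper (not part of bstr): wraps a byte slice so that
/// `{:?}` prints it as an escaped string instead of a list of integers.
struct DebugBytes<'a>(&'a [u8]);

impl<'a> fmt::Debug for DebugBytes<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "\"")?;
        for &b in self.0 {
            // escape_default yields printable ASCII as-is and escapes the rest
            // (for example, 0xFF becomes the four characters \xff).
            for c in ascii::escape_default(b) {
                write!(f, "{}", c as char)?;
            }
        }
        write!(f, "\"")
    }
}

fn main() {
    let bytes: Vec<u8> = b"abc".to_vec();
    // The default Debug impl for Vec<u8> shows the numeric values.
    println!("{:?}", bytes);
    // The wrapper shows the string-like form.
    println!("{:?}", DebugBytes(&bytes));
}
```

The pain point is that a wrapper like this has to be applied by hand at every print site, and it does not help when the `Vec<u8>` is a field of a type that derives `Debug`.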
With that said, in the course of actually using `bstr` in crates, I've come to the belief that using extension traits would make using string oriented APIs much more seamless and more ergonomic overall, with the notable exception of the aforementioned problems with the debug representation. In particular, using `BString`/`BStr` often requires annoying conversion routines to and from `Vec<u8>`/`&[u8]`. For example, most of the raw I/O APIs in std want a `&[u8]`, so you wind up needing to write `my_byte_string.as_bytes()` quite a bit, which is annoying.
Moreover, using `BString`/`BStr` really motivates one to use them in as many places as possible, because of the aforementioned annoying conversions. But this isn't always desirable, because you might want to expose APIs in terms of `&[u8]` for various reasons, including, but not limited to, not adding a public dependency on `bstr`. If we were using extension traits instead, then you could just import the traits and start using the new APIs immediately.
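As a rough illustration of the extension trait approach, the sketch below adds two string-oriented methods directly to `[u8]`. The trait name `ByteSliceExt` and its methods `find_sub` and `to_lossy_string` are made up for this example and should not be read as the API `bstr` would actually expose.

```rust
use std::io::{self, Write};

/// Hypothetical extension trait (names are illustrative, not bstr's API).
trait ByteSliceExt {
    /// Gives the default methods access to the underlying bytes.
    fn as_byte_slice(&self) -> &[u8];

    /// Returns the index of the first occurrence of `needle`, if any.
    fn find_sub(&self, needle: &[u8]) -> Option<usize> {
        if needle.is_empty() {
            // `windows(0)` panics, so handle the empty needle up front.
            return Some(0);
        }
        self.as_byte_slice()
            .windows(needle.len())
            .position(|w| w == needle)
    }

    /// Converts to a `String`, replacing invalid UTF-8 with U+FFFD.
    fn to_lossy_string(&self) -> String {
        String::from_utf8_lossy(self.as_byte_slice()).into_owned()
    }
}

impl ByteSliceExt for [u8] {
    fn as_byte_slice(&self) -> &[u8] {
        self
    }
}

fn main() -> io::Result<()> {
    let line: &[u8] = b"key=value\n";
    // The new method is available on a plain byte slice.
    assert_eq!(line.find_sub(b"="), Some(3));

    // Vec<u8> auto-derefs to [u8], so the same methods work on it too.
    let owned: Vec<u8> = b"hello".to_vec();
    assert_eq!(owned.to_lossy_string(), "hello");

    // No conversion is needed to hand the bytes to a std I/O API.
    io::stdout().write_all(line)
}
```

Because the methods live on `[u8]` itself, a crate can keep `&[u8]` in its public signatures and only bring the trait into scope internally.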
One possible alternative to this would be to implement `Deref` and `DerefMut` for `BString`/`BStr`, which would eliminate the various `as_bytes()` calls, but you would still need to explicitly construct byte strings. Moreover, this kind of feels like an abuse of `Deref`.
Another benefit of extension traits is that the API surface area of `bstr` could be quite a bit smaller, since many of the methods on `BString`/`BStr` are just re-exports of methods by the same name on `Vec<u8>`/`&[u8]`.
Overall, my sense is that this crate would be better if it used extension traits. To mitigate (but not completely solve) my `Debug` problem, we could keep the `BString`/`BStr` types, but remove all of their inherent methods, make them implement `Deref` and add an appropriate `Debug` impl. You still have to explicitly convert from `Vec<u8>`/`&[u8]`, which is a little annoying, but I expect their use would be more limited and the `Deref` impl would make them overall more convenient to use.
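A minimal sketch of what such a slimmed-down owned type might look like, assuming a wrapper with no inherent methods, `Deref`/`DerefMut` to `Vec<u8>`, and a string-like `Debug` impl. The `ByteString` name is a hypothetical stand-in and this is not the actual `BString` definition; `escape_ascii` is the standard library method on `[u8]` (Rust 1.60 and later).

```rust
use std::fmt;
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for a slimmed-down `BString`: no inherent methods,
/// just conversions, Deref, and a string-like Debug impl.
struct ByteString(Vec<u8>);

impl From<Vec<u8>> for ByteString {
    fn from(bytes: Vec<u8>) -> ByteString {
        ByteString(bytes)
    }
}

impl Deref for ByteString {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for ByteString {
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

impl fmt::Debug for ByteString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Prints printable ASCII as-is and escapes everything else.
        write!(f, "\"{}\"", self.0.escape_ascii())
    }
}

/// Stands in for any API that only accepts a plain byte slice.
fn takes_slice(bytes: &[u8]) -> usize {
    bytes.len()
}

fn main() {
    // The explicit conversion is still required here.
    let mut s = ByteString::from(b"abc".to_vec());
    // Vec<u8> and [u8] methods are reachable through Deref/DerefMut.
    s.push(b'!');
    assert!(s.starts_with(b"ab"));
    // Deref coercion turns &ByteString into &[u8] without as_bytes().
    assert_eq!(takes_slice(&s), 4);
    println!("{:?}", s);
}
```

With this shape, any extension trait methods defined on `[u8]` would also be callable on the wrapper through auto-deref, so the type exists mainly to carry the `Debug` (and possibly Serde) behavior.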
Obviously, this is a fairly large breaking change to the current API, but given the only consumer (that I know of) is me, I think it's okay to do this. The library is called an experiment after all, and if we're going to make this change, then now would be the time to do it.
Pinging @joshtriplett and @eddyb, who I believe might have thoughts on this. (Please ping others that you think might have an opinion on this.)