consider using extension traits on Vec<u8>/&[u8] #5

@BurntSushi

Currently, in 0.1 of bstr, the primary way to use and manipulate byte strings is with the BString (owned, growable) and BStr (borrowed slice) types. However, a major alternative design to using explicit types is to define extension traits that add more methods to the Vec<u8> and &[u8] types.

Reflecting back, I don't think I gave the extension trait path a thorough review before going with the explicit BString/BStr types. In particular, I perceive a few key advantages to using explicit types:

  1. Having distinct types provides some "semantic" meaning that the bytes should be treated as a string rather than just an arbitrary collection of bytes.
  2. Having a convenient Debug representation that prioritizes the "stringness" of BString/BStr over the "slice of u8" representation shown for Vec<u8>/&[u8]. For example, "abc" instead of [97, 98, 99].
  3. As a riff on (2), there may be other traits that one wants to implement "specially" for byte strings as opposed to "slice of u8." Serde comes to mind here.
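
To make (2) concrete, here is a minimal sketch of the difference between the two Debug representations. The ByteStr newtype below is a hypothetical stand-in written only for illustration; it is not bstr's actual BStr, and the escaping it uses (std::ascii::escape_default) is an assumption about what a reasonable impl might do.

```rust
use std::fmt;

/// Hypothetical newtype for illustration only; not bstr's actual `BStr`.
struct ByteStr<'a>(&'a [u8]);

impl fmt::Debug for ByteStr<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "\"")?;
        for &b in self.0 {
            // `escape_default` passes printable ASCII through unchanged and
            // escapes everything else (for example `\n` or `\xff`).
            for c in std::ascii::escape_default(b) {
                write!(f, "{}", char::from(c))?;
            }
        }
        write!(f, "\"")
    }
}

/// Returns the Debug output of a plain slice and of the wrapper, in that order.
fn debug_both(bytes: &[u8]) -> (String, String) {
    (format!("{:?}", bytes), format!("{:?}", ByteStr(bytes)))
}
```

For the input b"abc", the first element is the list-of-integers form `[97, 98, 99]` and the second is the quoted form `"abc"`.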

If (1) were the only benefit, I think I could be persuaded to drop that line of reasoning, although it does appeal to me aesthetically. However, in my view, (2) is a fairly significant benefit, and it's one of the most important ergonomic improvements that I look forward to whenever I bring bstr into one of my crates. Otherwise, I fairly consistently define my own helper functions for printing out byte strings when I don't have bstr, and it's honestly a pain, especially when Vec<u8>/&[u8] are part of some other type.

With that said, in the course of actually using bstr in crates, I've come to the belief that using extension traits would make using string oriented APIs much more seamless and more ergonomic overall, with the notable exception of the aforementioned problems with the debug representation. In particular, using BString/BStr often requires annoying conversions to and from Vec<u8>/&[u8]. For example, most of the raw I/O APIs in std want a &[u8], so you wind up needing to write my_byte_string.as_bytes() quite a bit.

Moreover, using BString/BStr really motivates one to use them in as many places as possible, because of aforementioned annoying conversions. But this isn't always desirable, because you might want to expose APIs in terms of &[u8] for various reasons, including, but not limited to, not adding a public dependency on bstr. If we were using extension traits instead, then you could just import the traits and start using the new APIs immediately.
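
As a rough sketch of what the extension trait approach could look like: the trait name ByteSliceExt, its find_bytes method, and the split_header helper are all hypothetical names chosen for this example, not a proposed API. The point is only that the trait is implemented once for [u8], and callers holding a Vec<u8> or &[u8] pick up the new method by importing it.

```rust
/// Hypothetical extension trait; the name and method are illustrative only.
trait ByteSliceExt {
    /// Returns the offset of the first occurrence of `needle`, if any.
    fn find_bytes(&self, needle: &[u8]) -> Option<usize>;
}

impl ByteSliceExt for [u8] {
    fn find_bytes(&self, needle: &[u8]) -> Option<usize> {
        if needle.is_empty() {
            // `windows(0)` would panic, so treat the empty needle specially.
            return Some(0);
        }
        // Naive substring search; fine for a sketch, not tuned for speed.
        self.windows(needle.len()).position(|w| w == needle)
    }
}

/// A function whose signature mentions only `&[u8]`, so no byte string type
/// leaks into the public API, yet its body can use the string-oriented method.
fn split_header(line: &[u8]) -> Option<(&[u8], &[u8])> {
    let i = line.find_bytes(b": ")?;
    Some((&line[..i], &line[i + 2..]))
}
```

Because method calls auto-dereference, the same call also works on a Vec<u8> (for example, `my_vec.find_bytes(b"llo")`) without any explicit conversion.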

One possible alternative to this would be to implement Deref and DerefMut for BString/BStr, which would eliminate the various as_bytes() calls, but you would still need to explicitly construct byte strings. Moreover, this kind of feels like an abuse of Deref.

Another benefit of extension traits is that the API surface area of bstr could be quite a bit smaller, since many of the methods on BString/BStr are just re-exports of methods by the same name on Vec<u8>/&[u8].

Overall, my sense is that this crate would be better if it used extension traits. To mitigate (but not completely solve) my Debug problem, we could keep the BString/BStr types, but remove all of their inherent methods, make them implement Deref, and add an appropriate Debug impl. You would still have to explicitly convert from Vec<u8>/&[u8], which is a little annoying, but I expect their use would be more limited and the Deref impl would make them more convenient to use overall.
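
A minimal sketch of that mitigation, under stated assumptions: ByteString here is a hypothetical wrapper with no inherent methods, and the Deref target (Vec<u8>) and the escaping used in Debug are guesses made for illustration, not the crate's actual design. Record and write_to are likewise made-up names showing the two payoffs: a readable derived Debug on a containing type, and passing the wrapper where std wants a &[u8].

```rust
use std::fmt;
use std::io::{self, Write};
use std::ops::{Deref, DerefMut};

/// Hypothetical wrapper with no inherent methods; not the actual bstr type.
struct ByteString(Vec<u8>);

impl From<Vec<u8>> for ByteString {
    fn from(bytes: Vec<u8>) -> Self {
        ByteString(bytes)
    }
}

impl Deref for ByteString {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for ByteString {
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

impl fmt::Debug for ByteString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Escape each byte as ASCII so the output reads like a string literal.
        let escaped: String = self
            .0
            .iter()
            .flat_map(|&b| std::ascii::escape_default(b))
            .map(char::from)
            .collect();
        write!(f, "\"{}\"", escaped)
    }
}

/// A containing type gets a readable Debug output through `derive`.
#[derive(Debug)]
struct Record {
    name: ByteString,
}

/// `write_all` wants a `&[u8]`; deref coercion supplies it from `&ByteString`.
fn write_to(out: &mut impl Write, s: &ByteString) -> io::Result<()> {
    out.write_all(s)
}
```

The explicit construction step remains (ByteString::from(vec)), which is the annoyance noted above, but after that every Vec<u8> and slice method is reachable through Deref.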

Obviously, this is a fairly large breaking change to the current API, but given the only consumer (that I know of) is me, I think it's okay to do this. The library is called an experiment after all, and if we're going to make this change, then now would be the time to do it.

Pinging @joshtriplett and @eddyb, who I believe might have thoughts on this. (Please ping others that you think might have an opinion on this.)
