Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand ShortByteStrings API? #83

Closed
hce opened this issue Jun 23, 2016 · 9 comments · Fixed by #471
Closed

Expand ShortByteStrings API? #83

hce opened this issue Jun 23, 2016 · 9 comments · Fixed by #471

Comments

@hce
Copy link

hce commented Jun 23, 2016

I'm using ShortByteStrings to convert from (STUArray Int Word8). Having hPut available would save the memory allocation and memcpy that happens when converting a ShortByteString into a ByteString.

Would you accept a patch that adds this function to ShortByteStrings?

@hvr
Copy link
Member

hvr commented Jun 23, 2016

I/O functions that may block (and therefore are marked safe) typically require pinned memory afaik, so I don't think we can avoid the allocation & memcpy anyway...

@hce
Copy link
Author

hce commented Jun 24, 2016

Even at a time when the STUArray is frozen and has left the ST monad?

@dcoutts
Copy link
Contributor

dcoutts commented Sep 1, 2016

@hce I think I might but it's not going to be easy, since System.IO.hPutBuf uses Ptr so you'd have to go deeper into the internals. Also, are you sure this is your performance problem? Writing a very short string into a handle is not going to be quick anyway since we have to take the Handle lock and do various other checks. If you're writing a lot of ShortByteString then you're much better off using the bytestring Builder which builds up big chunks and can flush them in one go (often bypassing the Handle buffer entirely).

@hce
Copy link
Author

hce commented Sep 1, 2016

What I need the mutable array for is the deflate algorithm that I'm implementing. I'm open to alternative suggestions for how to best implement this. :)

It is not the only bottleneck, but the conversion does appear to take a non-negligible amount of time.

Do you think "converting" the individual Word8 values to a Builder would be more efficient than the memcpy of ShortByteString into ByteString?

@dcoutts
Copy link
Contributor

dcoutts commented Oct 11, 2016

Do you think "converting" the individual Word8 values to a Builder would be more efficient than the memcpy of ShortByteString into ByteString

No, a memcpy is as fast as it gets. Using a builder is also copying, just slower.

@dcoutts
Copy link
Contributor

dcoutts commented Oct 11, 2016

So, I have to say I'm back in a state of confusion. I'm not really clear what you're doing, or what you think would make things faster. We should have some clear statement of a problem and a plausible looking solution before anyone implements anything here.

@winterland1989
Copy link

I propose doing the opposite: adding processing functions to ShortByteString, usually i have a workflow like: IO ByteString ---> Some Parser ---> DataStructure Contain ShortByteString, but if i want to further process these DataStructures, i got stuck since there's not so many combinators out there.

One problem is the combinators explosion, a possible way i can think of is to use type ShortByteString = Vector Word8, and reuse all the good things in Data.Vector.Unboxed, or maybe it's better to make a new package?

@Bodigrim Bodigrim changed the title Add IO functions for ShortByteStrings? Expand ShortByteStrings API? Oct 13, 2020
@Bodigrim
Copy link
Contributor

I'm afraid that expanding ShortByteString API will inevitable raise a question about Char8 interface and lazy ShortByteString (at this point it won't be short anymore, but rather a ByteString backed by ByteArray, cf. #193). And we are quite bad at providing a consistent API even for four existing ByteString flavours (#289).

@hasufell
Copy link
Member

hasufell commented Sep 3, 2021

I created: https://github.com/hasufell/shortbytestring

It's also on hackage: https://hackage.haskell.org/package/shortbytestring

Reviews welcome. It is currently used in the abstract-filepath library, which is WIP and will heavily rely on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants