Constant-memory IArray packing #52

Open: 2 commits to be merged into base branch master

bradediger

Currently, the IArray serializer delegates to elems and the list serializer. Unfortunately, this results in a second full copy of the array, as the list serializer needs to walk the list (holding the head) to find its length. This is necessary for arbitrary lists, but not for arrays where we know the dimensions beforehand.

This change does away with the extra length field: it writes the bounds and then emits the elements one by one, in array (Ix) order. The parser was changed to match. While I was at it, I also added a small round-trip test for the array serialization functionality.

As I was writing this, I just realized that this is a breaking change to the IArray serialization format, as it removes the superfluous Word64 length. If backwards compatibility is desired, we could calculate and add that field back in so the format is identical.
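
A minimal sketch of the two layouts discussed above, written against `Data.ByteString.Builder` only so that it stands alone. The helper names (`encodeArray`, `encodedSize`, `example`) and the exact byte layout (big-endian 64-bit fields) are assumptions for illustration, not cereal's actual implementation. The point it shows is that the optional length field can be computed from the bounds with `rangeSize`, so nothing has to walk the element list before the first element is written.

```haskell
import Data.Array.Unboxed (UArray, bounds, elems, listArray)
import Data.ByteString.Builder (Builder, int64BE, toLazyByteString, word64BE)
import qualified Data.ByteString.Lazy as L
import Data.Ix (rangeSize)

-- | Hypothetical helper (not part of cereal): write the bounds, an optional
-- element count, and then every element in index order.
encodeArray :: Bool -> UArray Int Int -> Builder
encodeArray withLength arr =
  int64BE (fromIntegral lo)
    <> int64BE (fromIntegral hi)
    <> lengthField
    <> foldMap (int64BE . fromIntegral) (elems arr)
  where
    (lo, hi) = bounds arr
    -- The count comes from the bounds, not from traversing the elements.
    lengthField
      | withLength = word64BE (fromIntegral (rangeSize (lo, hi)))
      | otherwise  = mempty

-- | Size in bytes of the encoded form, for comparing the two layouts.
encodedSize :: Bool -> UArray Int Int -> Int
encodedSize withLength =
  fromIntegral . L.length . toLazyByteString . encodeArray withLength

-- | A small ten-element array used to compare the two layouts.
example :: UArray Int Int
example = listArray (0, 9) [1 .. 10]
```

With the length field included, the output differs from the bounds-only layout by exactly one 8-byte field, which is the compatibility trade-off raised above.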

Here is the test script I used to verify the before/after memory behavior:

module Main where

import Data.Array.Unboxed
import Data.ByteString.Builder
import Data.Serialize
import System.IO

wholeLottaNothing :: UArray Int Int
wholeLottaNothing = listArray (0,10000000) (repeat 0)

main :: IO ()
main = withFile "/dev/null" AppendMode $ \sink ->
  hPutBuilder sink . execPut $ put wholeLottaNothing

The IArray serializer previously used, effectively, `putListOf . elems`
to serialize immutable arrays. Since the bounds uniquely define the
length of the list, eliminating one length call allows the list to be
partially GCd during serialization, so large lists take no additional
space.
@bradediger
Author

The Travis failure reminds me that I have really internalized the base-4.8 changes to Prelude. I'll get that fixed up!

@elliottt
Contributor

Thanks for the PR! I'm not opposed to breaking changes, especially when they come with such nice performance benefits!

@bradediger
Author

@elliottt Looks like we're green now.

As for the breaking change, it's totally up to you, but if backwards compatibility is important, we can get it by adding that (unnecessary) length field back in.

Thanks for the quick response!
-be

@elliottt
Contributor

I like the suggestion to add the length field back in, even if it's not used. If you make that change, I'll happily merge this :)

@bradediger
Author

You know what, I'm digging a bit further into this and I'm not sure I've slain the beast on the Get side (Put is fine as far as I can tell). I'll add both to my list and update this PR.

@bradediger
Author

@elliottt A brief status update on this work... it's been interesting but the rabbit hole goes deep here. This is half keeping you updated and half trying to rubber-duck debug the situation I'm in by explaining it.

The fundamental problem I'm seeing with the Get instances for arrays is that they use lists. As far as I can tell, lists can't be built lazily in the current interface because they require building the list backwards monadically and then reversing it. That's a non-starter since I'm trying to reduce this code to work in constant space. (To motivate my problem here, I'm trying to stream 2+ GB arrays from memory to disk and back, and a 2 GB unboxed Double array is a lot larger as a boxed Haskell list.)
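
To make the accumulate-then-reverse shape concrete, here is a small sketch using a toy parser defined on the spot. `Parser`, `byte`, and `listOfN` are hypothetical names for illustration and are not cereal's source; the sketch only mirrors the pattern described above, where the result list is built back to front and no element is available to the caller until the final `reverse` has run.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- | A toy strict parser: consume some input or fail.
newtype Parser a = Parser
  { runParser :: B.ByteString -> Maybe (a, B.ByteString) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \bs -> case p bs of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \bs -> Just (a, bs)
  Parser pf <*> Parser pa = Parser $ \bs -> case pf bs of
    Nothing        -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing         -> Nothing
      Just (a, rest') -> Just (f a, rest')

instance Monad Parser where
  Parser p >>= f = Parser $ \bs -> case p bs of
    Nothing        -> Nothing
    Just (a, rest) -> runParser (f a) rest

-- | Read a single byte.
byte :: Parser Word8
byte = Parser B.uncons

-- | Read n items. Each item is consed onto an accumulator, so the list is
-- held in full (in reverse) until the last item has been parsed.
listOfN :: Int -> Parser a -> Parser [a]
listOfN n p = go [] n
  where
    go acc 0 = pure (reverse acc)
    go acc k = p >>= \x -> go (x : acc) (k - 1)
```

Because a later item may still fail to parse, the loop cannot hand out a prefix of the list early, which is why this shape needs space proportional to the number of elements.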

So I've reduced the problem to eliminating the list construction from the Array deserializer. My current angle of attack is to stream the entries into an array incrementally under ST, so there is no intermediate list built up. The problem is that there is currently no way to compose ST and Get.
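
A minimal sketch of the "fill an array under ST" idea, under a simplifying assumption: it reads directly from a strict `ByteString` by index rather than going through `Get`, which sidesteps exactly the part that does not compose today. `decodeBytes` is a hypothetical name, and one byte per element is chosen only to keep the example short.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)
import qualified Data.ByteString as B
import Data.Word (Word8)

-- | Hypothetical helper: copy each input byte straight into an unboxed
-- mutable array, then freeze it. No intermediate list of elements is built.
decodeBytes :: B.ByteString -> UArray Int Word8
decodeBytes bs = runSTUArray (do
  let n = B.length bs
  arr <- newArray (0, n - 1) 0
  forM_ [0 .. n - 1] $ \i ->
    writeArray arr i (B.index bs i)
  return arr)

-- | Convenience accessor used to inspect the result.
decodedElems :: B.ByteString -> [Word8]
decodedElems = elems . decodeBytes
```

The open question in the discussion is how to get the same write-as-you-parse behaviour when the elements come from a `Get` computation instead of from an indexable buffer.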

I had success with this approach using the STT transformer from STMonadTrans, so the computation was of type STT s (Get (STArray s i e)). It did work, and solved all of my space problems, but the approach left a bad taste in my mouth. First of all, the STT laws are brittle and the failure mode is silently cloning ST state (exactly the sort of problem I got into Haskell to avoid!). But also, there's the pragmatic problem that STMonadTrans only offers instances for boxed arrays, and I don't want to add another level to my yak shave.

So that leaves the option of trying to turn Get into a monad transformer GetT, such that type Get = GetT Identity and arrays are deserialized via GetT (ST s) (STArray s i e). There's something like this in cereal-plus as Cereal.Deserialize, but it seems to me that Get could be defined in terms of this. I've started playing with this idea, but I'm not far enough in to know whether it is a good idea or even fully possible.

Do you have any thoughts, positive or negative, about this approach? Am I overthinking this? Thanks!

@elliottt
Contributor

elliottt commented Mar 1, 2016

I'm a little reluctant to make Get into a transformer, but if you've got a branch that implements this and doesn't induce performance regressions, I'd consider it. It's been a requested feature in the past, so that parsing could be done in IO, for example.

Do you have thoughts on what the changes to the Serialize class would look like? Would you need to turn Put into a transformer as well, to allow putting IOArrays, and so forth?

@bradediger
Author

Yeah, I was concerned about performance regressions too... it's too early for me to tell whether this is going to be a good idea.

I have some changes to Get.hs that implement parts of this change, but I haven't gotten all of the types to fit together yet. I'm working through all of the downstream implications of what this would mean, and unfortunately I am working at the edge of (and stretching) my intuitive understanding of monad transformers now.

So far, the big change is that Result depends on the base monad (data Result m r), and its constructor Partial changes so that its continuation runs inside the base monad: Partial (B.ByteString -> m (Result m r)). This leads to all parsing functions returning results inside the base monad as well (e.g., runGet :: Monad m => GetT m a -> B.ByteString -> m (Either String a)).

My hope is that when this is done, GHC will be smart enough to optimize away the monadic binds when running under Identity, or that failing that, RULES could fuse some of the intermediates away, but I am several steps away from being able to even test that hypothesis yet.

I didn't think much about the desire for a parallel PutT transformer, which it does seem would be necessary if you're working with IOArrays. It's been less of a pressing need for me, as the earlier commit b80e863 enables STArrays to be packed in constant space via lists -- an approach that seems impossible to replicate on the Get side due to the list-reversing behavior. If we found a solution for how to stream lazy lists out of Get without holding their head, I would be very interested in that and it would solve this issue without monad transformers.

Thanks again for your time! I'll let you know if/when I have further results to show on the GetT approach.

@Ongy
Contributor

Ongy commented Oct 22, 2016

Kind of hijacking this PR for the discussion in it.
Has any progress been made on the transformer? I have also run into a situation where I would like to use cereal with some added state.
While searching for a solution, I thought about a GetT version and found this issue.

@bradediger as I understand it, you have done some work on that, but didn't get it "ready"? Can you share your current work?

@bradediger
Author

@Ongy Unfortunately I have nothing else to show on the GetT approach. I had worked around this problem in my own code for the time being (my original problem was Getting and Putting IArray values in no additional space), so the motivation for my adding a transformer version (being able to Get values in a transformer on ST) is not there anymore and I haven't been working on this.

My approach was a fairly heavy-handed one; it scrapped the Get instance in favor of a generic GetT transformer and type Get = GetT Identity. So as mentioned above, even if this approach had been fully fleshed out, it would have required significant benchmarking against the existing implementation to ensure we didn't regress performance.

This was the skeleton of the interface I started with -- note that Partial's continuation and Done's return value are embedded in the base monad.

data Result m a =
    Fail Text ByteString
  | Partial (ByteString -> m (Result m a))
  | Done (m a) ByteString

newtype GetT m a = GetT {
  -- | Feed a chunk of data and get a partial result.
  runPartial :: ByteString -> m (Result m a)
  }

Unfortunately, I basically only got as far as chasing these type definitions through Get.hs before giving up on this approach. I still think it's promising, but it ended up surpassing the time I had available.
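
For readers who want to experiment, here is one way the skeleton above might be fleshed out into something that runs. Everything beyond the two type definitions is an assumption for illustration: the instances, `liftBase`, `getWord8T`, `runGetT`, `fillBytes`, and `decodeViaGetT` are hypothetical names, `String` stands in for `Text` to avoid an extra dependency, and a `Partial` left over at the end of input is simply reported as an error. It is a sketch of the idea only and says nothing about performance or about how cereal would actually be restructured.

```haskell
import Control.Monad (ap, forM_, liftM)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, freeze, newArray, writeArray)
import Data.Array.Unboxed (UArray, elems)
import qualified Data.ByteString as B
import Data.Word (Word8)

data Result m a
  = Fail String B.ByteString
  | Partial (B.ByteString -> m (Result m a))
  | Done (m a) B.ByteString

newtype GetT m a = GetT { runPartial :: B.ByteString -> m (Result m a) }

instance Monad m => Functor (GetT m) where
  fmap = liftM

instance Monad m => Applicative (GetT m) where
  pure a = GetT $ \bs -> return (Done (return a) bs)
  (<*>) = ap

instance Monad m => Monad (GetT m) where
  GetT g >>= f = GetT $ \bs -> g bs >>= step
    where
      step (Fail err rest) = return (Fail err rest)
      step (Partial k)     = return (Partial (\more -> k more >>= step))
      -- Run the base-monad action, then continue on the leftover input.
      step (Done ma rest)  = ma >>= \a -> runPartial (f a) rest

-- | Run an action of the base monad without consuming input.
liftBase :: Monad m => m a -> GetT m a
liftBase ma = GetT $ \bs -> return (Done ma bs)

-- | Read one byte, asking for more input if the chunk is empty.
getWord8T :: Monad m => GetT m Word8
getWord8T = GetT go
  where
    go bs = case B.uncons bs of
      Just (w, rest) -> return (Done (return w) rest)
      Nothing        -> return (Partial go)

-- | Run on a single chunk; a pending Partial counts as a failure here.
runGetT :: Monad m => GetT m a -> B.ByteString -> m (Either String a)
runGetT g bs = runPartial g bs >>= finish
  where
    finish (Fail err _) = return (Left err)
    finish (Partial _)  = return (Left "not enough input")
    finish (Done ma _)  = fmap Right ma

-- | Parse n bytes, writing each one into a mutable array as it is read.
fillBytes :: Int -> GetT (ST s) (STUArray s Int Word8)
fillBytes n = do
  arr <- liftBase (newArray (0, n - 1) 0)
  forM_ [0 .. n - 1] $ \i -> do
    w <- getWord8T
    liftBase (writeArray arr i w)
  return arr

-- | Run the parser under ST and freeze the array on success.
decodeViaGetT :: Int -> B.ByteString -> Either String [Word8]
decodeViaGetT n bs = fmap elems frozen
  where
    frozen :: Either String (UArray Int Word8)
    frozen = runST (runGetT (fillBytes n) bs >>= traverse freeze)
```

Instantiating the base monad at `Identity` would recover a pure parser in the sense of `type Get = GetT Identity`; whether GHC removes the resulting overhead is the benchmarking question raised in the surrounding discussion.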

I think the main hard question that needs to be answered is whether the Get and GetT approaches are compatible and can share code without slowing down performance for users who don't need the transformer machinery.

Happy to answer additional questions about my approach, but this is basically what I remember at this point.

@elliottt Happy to move this to a mailing list or somewhere else if this thread isn't the appropriate place for this discussion. Thanks!

@elliottt
Contributor

I don't mind the discussion here, it's a nice record for anyone else that's interested in the transformer approach :)
