
Zero-Copy Serialization/Deserialization #5

Open
somethingelseentirely opened this issue May 19, 2024 · 41 comments
Labels
enhancement New feature or request


@somethingelseentirely
Contributor

somethingelseentirely commented May 19, 2024

The pointer-free nature of succinct data structures makes them very amenable to (de)serialization by simply casting their memory to/from a bunch of bytes.

Not only would this remove most (de)serialization costs, it could also enable very fast and simple on-disk storage when combined with mmap.

One might want to implement this via rkyv, but simply providing a safe transmute to and from bytes::Bytes (with a potential rkyv implementation on top of that) might be the simpler, more agnostic solution.
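As a minimal sketch of the casting idea (using the bytemuck crate for the checked transmute; any equivalent safe-cast helper would do):

use bytemuck::{cast_slice, try_cast_slice, PodCastError};

// The backing words of a pointer-free structure view directly as bytes.
fn to_bytes(words: &[u64]) -> &[u8] {
    cast_slice(words)
}

// The reverse cast only checks length and alignment; nothing is copied.
fn from_bytes(bytes: &[u8]) -> Result<&[u64], PodCastError> {
    try_cast_slice(bytes)
}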

@Cydhra
Owner

Cydhra commented May 19, 2024

rkyv looks good, and adding it as an optional dependency (because I am quite keen on keeping the crate dependency-free) might be an option. I'll look further into it, because it seems like a nice addition.
Doing it myself by just casting raw data sounds painful though, because while rkyv offloads endianness somewhere into the frontend (i.e. the user has to decide what to do, if I read it correctly), I'd have to handle that if I implement it myself (even if I do the same, I still have to keep it in mind). Maybe I am overthinking this, I don't know.
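For illustration, the optional dependency could be wired up roughly like this (the feature name and rkyv version are assumptions):

// In Cargo.toml: rkyv = { version = "0.7", optional = true }, plus a
// feature rkyv = ["dep:rkyv"], so the default build stays dependency-free.
#[cfg_attr(
    feature = "rkyv",
    derive(rkyv::Archive, rkyv::Serialize, rkyv::Deserialize)
)]
pub struct Example {
    data: Vec<u64>,
    len: usize,
}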

Cydhra added the enhancement (New feature or request) and good first issue (Good for newcomers) labels on May 19, 2024

@somethingelseentirely
Contributor Author

somethingelseentirely commented May 20, 2024

Well, if you're using the (de)serialization as a way to create a data-exchange/file format (think .jpg, not application-instance-specific .dat), then that format will want to settle on some endianness. In my case it's a file format for knowledge graphs, with the added bonus that you can query it without having to build any indexes first: just mmap and go. So it's always going to be big-endian.

The Stable Cross-Platform Database File section in the SQLite documentation is probably the best description of that use case. Avoiding breaking changes caused by the way rkyv stores things is also an argument for rolling our own framework-agnostic data layout.

Edit: Btw, it's also completely fine if such a use case doesn't align with the project goals 😄
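A minimal sketch of the "mmap and go" idea (using the memmap2 crate; building the actual view on top of the mapping is left abstract):

use std::fs::File;
use memmap2::Mmap;

fn open_readonly(path: &str) -> std::io::Result<Mmap> {
    let file = File::open(path)?;
    // Safety: the file must not be truncated or mutated while mapped.
    let mmap = unsafe { Mmap::map(&file)? };
    Ok(mmap)
}

// Mmap derefs to &[u8], so a zero-copy view can be built directly on top
// of the mapping with no parsing or index construction.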


Cydhra removed the good first issue (Good for newcomers) label on May 20, 2024

@Cydhra
Owner

Cydhra commented May 21, 2024

I pushed some changes to a new branch dev_zero_copy.

All functionality of RsVec is now moved to traits that abstract over the data layout. This way it should be possible to generate an ArchivedRsVec (gated behind a crate feature) that also implements the traits.

As it stands, this constitutes a breaking change, because you now have to import the trait to access methods on RsVec. This could be fixed by renaming the trait methods and then providing convenience functions on RsVec, as sketched below.
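The renaming fix might look roughly like this (a minimal sketch; the trait and method names are invented for illustration, and the real block-based rank logic is stubbed with a naive word count):

pub trait RankSupport {
    fn rank1_at(&self, pos: usize) -> usize;
}

pub struct RsVec {
    data: Vec<u64>, // other fields elided
}

impl RankSupport for RsVec {
    fn rank1_at(&self, pos: usize) -> usize {
        // Naive whole-word stand-in for the real block-based logic.
        self.data
            .iter()
            .take(pos / 64)
            .map(|w| w.count_ones() as usize)
            .sum()
    }
}

// Inherent convenience wrapper: callers get rank1 without importing the trait.
impl RsVec {
    pub fn rank1(&self, pos: usize) -> usize {
        self.rank1_at(pos)
    }
}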

@somethingelseentirely
Contributor Author

Awesome, I'll check it out asap!

Cydhra mentioned this issue on Jul 13, 2024
@somethingelseentirely
Contributor Author

I just started some work on a minimalist approach to this, using a combination of techniques from minibytes to allow for extensible Bytes definitions (so that byte sources like mmaped files are possible), and zerocopy to use Vec<T> as a zero-copy byte source when T itself implements zerocopy's FromBytes/AsBytes traits.

This would still require some form of archive format (i.e. a header with length and layer count), but most of the data (data/blocks/...) can just be sliced from the binary data source.
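A sketch of that slicing step (assuming zerocopy 0.7's derives and FromBytes::slice_from; the descriptor's field here is illustrative, not the crate's actual definition):

use zerocopy::{AsBytes, FromBytes, FromZeroes};

#[derive(FromZeroes, FromBytes, AsBytes)]
#[repr(C)]
struct BlockDescriptor {
    zeros: u64, // illustrative field
}

// Reinterpret a region of the archive as a descriptor slice. This only
// checks alignment and that the length divides evenly; nothing is copied.
fn blocks(bytes: &[u8]) -> Option<&[BlockDescriptor]> {
    BlockDescriptor::slice_from(bytes)
}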

The main reason I'm writing this comment though is that I've noticed that

pub struct WaveletMatrix {
    data: Box<[RsVec]>,
    bits_per_element: u16,
}

appears to store the number of layers redundantly.
Isn't bits_per_element == data.len(), in which case it would be sufficient to store only the data smart pointer?

Is this an atavism from a version in which data was a single RsVec, like most other implementations I've seen? In that case my guess would be that the split is for performance, since it allows you to remove some counting ops in the new layers?

Really awesome implementation btw, I'm super impressed. It smells like you were able to justify working on it as part of your PhD 😆


@somethingelseentirely
Contributor Author

I've come to the conclusion that APIs should in general return usize*

*but use specific sizes in the structs 😆

Could we pin the SuperBlockDescriptor and SelectSuperBlockDescriptor fields down from usize to concrete-width types?

SuperBlockDescriptor's zeros should also fit into a u16 with the current

const SUPER_BLOCK_SIZE: usize = 1 << 13;

shouldn't it?

The SelectSuperBlockDescriptor is a bit trickier: (2^(32 + 13)) bits = 4.398 terabytes, which is right on the edge of being large enough to encode all real-world consecutive runs of 0s or 1s, but also small enough that one could construct a theoretical counterexample on a supercomputer or in an mmaped file. So it might be better to err on the side of caution and use a u64?

The SelectSuperBlockDescriptor is also the only thing giving me a bit of a headache in terms of serialisation right now, because it's the only structure whose size depends on the data distribution rather than just the length of the bit-vector. 😅

Lastly, the fields len, rank0, and rank1 of RsVec should all be u64.
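To make the width proposal concrete, a small sketch (the field names are assumptions, not the crate's actual definitions):

#[repr(C)]
struct SelectSuperBlockDescriptor {
    // u64 rather than u32: safe beyond the 2^(32 + 13)-bit edge case above
    index_0: u64,
    index_1: u64,
}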

Btw this should read

/// The indices do not point into the bit-vector, but into the ~~select~~super-block vector.

right?

@Cydhra
Owner

Cydhra commented Sep 1, 2024

SuperBlockDescriptor's zeros should also fit into a u16 with the current SUPER_BLOCK_SIZE, shouldn't it?

No, the counter isn't reset between super-blocks; if it were, you'd need to compute a prefix sum over all super-blocks before the query.

So it might be better to err on the side of caution and use a u64?

Lastly, the fields len, rank0, and rank1 of RsVec should all be u64.

Yes. I'll add it to #9.

Btw this should read...

Yes

The SelectSuperBlockDescriptor is also the only thing giving me a bit of a headache in terms of serialisation, because it's the only structure whose size depends on the data distribution rather than just the length of the bit-vector.

Can't you just add a value to the file header that tells the loading code how big the slice is?

@somethingelseentirely
Contributor Author

No, the counter isn't reset between super-blocks; if it were, you'd need to compute a prefix sum over all super-blocks before the query.

Ah, gotcha. Sorry, u64 it is then 😄

Yes

I'll create a PR

Can't you just add a value to the file header that tells the loading code how big the slice is?

True, it's just something that needs to be stored per layer, so Header | Blocks becomes something like Header | PerLevelHeader | Blocks. Not really a big deal though 😅.
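As a hypothetical sketch of that shape (all names and fields are made up for illustration):

#[repr(C)]
struct FileHeader {
    len: u64,    // bits per level
    levels: u16, // == bits_per_element
}

#[repr(C)]
struct LevelHeader {
    // data-dependent, so it cannot be derived from `len` alone
    select_block_count: u64,
}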

Btw, which memory layout do you prefer? Storing the Block/SuperBlock/...s so that each level is stored consecutively, [([B], [SB], [SSB])], or by block type, so that each type is stored consecutively regardless of level of origin, ([B], [SB], [SSB])? My hunch is that there is more level-wise locality for small wavelet matrices. But if you had something like slab allocation planned for the future, it might make sense to keep things of the same size in the same allocation?


@Cydhra
Owner

Cydhra commented Sep 2, 2024

I just misread your question 😅.

I don't plan on adding slab allocation or anything like it: this isn't a dynamic data structure, so it shouldn't be worth messing with allocation here.

The only thing you should keep in mind is that I do plan on merging blocks and super_blocks into a single structure and vector (#9), since you only ever need to access the blocks inside one super-block, and merging them increases locality.

I also don't think changing the layout here yields any locality improvements, because for that, multiple block lists would have to share cache lines, which shouldn't happen for non-trivial vectors. So for simplicity, [([B], [SB], [SSB])] sounds better.
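For illustration, a rough sketch of what that merged layout from #9 could look like (the constant and field names are made up):

const BLOCKS_PER_SUPER_BLOCK: usize = 16;

// One contiguous struct per super-block, so a rank query touches a single
// region instead of chasing through two separate vectors.
#[repr(C)]
struct MergedSuperBlock {
    zeros: u64,                                 // cumulative zeros up to this super-block
    block_zeros: [u16; BLOCKS_PER_SUPER_BLOCK], // per-block counts within it
}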

@somethingelseentirely
Contributor Author

Gotcha 😄!

I do see how that's potentially nice for rank, but wouldn't it significantly reduce locality in select operations, where you have to binary-search and then linearly scan through the super_blocks?

@Cydhra
Owner

Cydhra commented Sep 2, 2024

Good point, that needs to be evaluated during implementation

@somethingelseentirely
Contributor Author

somethingelseentirely commented Sep 10, 2024

I did some thinking. Maybe it's not a good idea to try to provide the zero-copy serialization/deserialization as a pre-packaged, one-size-fits-all solution. Someone designing an on-disk file format will have thoughts and insights about the layout themselves. For example, if you store multiple wavelet matrices with the same alphabet but different orderings, you can share that information between them, whereas an implementation provided by us would have to replicate it into every serialised instance.

So I think it might be better to make the internals part of the public interface and expose constructors that can take something like the generic read-only PackedSlice<...>s.

I mean, the cool thing about succinct vectors is that they are fairly simple data structures, and they are somewhat canonical by construction (modulo hyperparameters like super-block ranges).

@Cydhra
Owner

Cydhra commented Sep 15, 2024

So you suggest just a from-raw-parts layer, maybe some conversions for common libs, and then letting downstream crates handle the rest?

@somethingelseentirely
Contributor Author

Yeah, exactly. Focusing on making the raw parts stable and zero-copy seems to be the better time investment, compared to trying to create a full collection type that's just going to be part of some larger thing in a downstream crate anyway.
I mean, low-level stuff like succinct data structures doesn't really exist outside of some specialised application that composes it into something useful, I guess.

@Cydhra
Owner

Cydhra commented Sep 17, 2024

Yeah, that actually sounds reasonable, and it gets rid of a lot of inelegant decision-making. And the raw parts are pseudo-stable anyway, since I don't want to break serde compatibility between versions.

This also means that it's probably possible to implement the necessary parts of this issue without a major version bump; they'll just break alongside everything else when I do one.

@somethingelseentirely
Contributor Author

Maybe.
It requires changing the collections that own the bytes holding the data/blocks from a Vec to something with shared ownership of the (read-only) bytes. So:

pub struct RsVec {
    data: Vec<u64>,
    len: usize,
    blocks: Vec<BlockDescriptor>,
    super_blocks: Vec<SuperBlockDescriptor>,
    select_blocks: Vec<SelectSuperBlockDescriptor>,
    rank0: usize,
    rank1: usize,
}

Becomes

pub struct RsVec {
    len: usize,
    pub(crate) rank0: usize,
    pub(crate) rank1: usize,
    data: PackedSlice<u64>,
    blocks: PackedSlice<BlockDescriptor>,
    super_blocks: PackedSlice<SuperBlockDescriptor>,
    select_blocks: PackedSlice<SelectSuperBlockDescriptor>,
}

Or any other ByteOwningThing instead of PackedSlice that you prefer (I'm willing to write the missing docs and bump the anybytes crate to 1.0.0 if you only want non-pre-release dependencies).

I think it should serialise similarly with serde, but it would probably require a custom (de)serializer implementation that creates a byte-owning Vec and casts it to a PackedSlice.
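Such a deserializer could look roughly like this (a sketch; the From<Vec<T>> conversion on PackedSlice is an assumption on my part, and real code would also need the zerocopy bounds on T):

use serde::{Deserialize, Deserializer};

fn deserialize_packed<'de, D, T>(deserializer: D) -> Result<PackedSlice<T>, D::Error>
where
    D: Deserializer<'de>,
    T: Deserialize<'de>,
{
    // Deserialize into an owned Vec first, then hand the allocation over
    // to the byte-owning PackedSlice without copying.
    let owned: Vec<T> = Vec::deserialize(deserializer)?;
    Ok(PackedSlice::from(owned))
}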


@somethingelseentirely
Contributor Author

somethingelseentirely commented Sep 18, 2024

Yeah I think that's fair.

I did some preliminary benchmarks and saw no impact on performance (the PackedSlice is essentially just a regular fat pointer, so no different from a Vec, and the safe transmute is also zero-cost*), but the #[repr(C)] on blocks alone might make a difference in optimisability, so I can't guarantee that it won't have an impact in every case.

One interesting performance side effect of using the PackedSlices is that cloning RsVecs becomes super cheap, but I'm not sure whether that has any merit 😆

I also just realized that there is a third option: implementing serde for PackedSlice myself. I think serde doesn't distinguish between sequence types in the serialized output, so such an implementation might automatically be backwards compatible with the current Vec-based approach.
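Assuming PackedSlice<T> derefs to &[T], the Serialize half could then be as simple as emitting a plain sequence, which is what serde already produces for Vec<T> (a self-contained sketch):

use serde::{Serialize, Serializer};

fn serialize_packed<S, T>(slice: &[T], serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
    T: Serialize,
{
    // A plain sequence encoding, indistinguishable from Vec<T>'s output
    // in self-describing formats.
    serializer.collect_seq(slice.iter())
}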

I've created a draft pull request at #14 to make discussions about the code easier.
Feel free to take it for a benchmark spin; I currently don't have access to a large x86 machine, only an aarch64 one.

ARM SIMD support is also on my wishlist but that's a different issue 😉

• Edit: Just realized that it does perform an alignment check on the cast, so it's not entirely free, but it's still very cheap. Annoyingly, I check that invariant of the PackedSlice at creation time, so subsequent dereferences could use an unsafe path relying on it, but zerocopy doesn't expose such an interface...

@Cydhra
Owner

Cydhra commented Sep 18, 2024

Annoyingly, I check that invariant of the PackedSlice at creation time, so subsequent dereferences could use an unsafe path relying on it

Wouldn't that induce !Unpin?

@somethingelseentirely
Contributor Author

somethingelseentirely commented Sep 18, 2024

Yes, definitely. But the way anybytes::Bytes works is that it takes ownership of whatever implements unsafe trait ByteOwner in an Arc. It then calls ByteOwner::as_bytes() -> &[u8] on that thing, with the required invariant that the returned address must be stable (Pin in a sense, but requiring that as a bound would only serve as documentation, since [u8] itself is trivially Unpin; that invariant is why the trait is unsafe).

Since PackedSlice is immutable, because Bytes is immutable, we know that, for example, an encapsulated Vec<T> where T: AsBytes + FromBytes will never grow, and thus never change its address.
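In sketch form (simplified for illustration; this is the shape described above, not anybytes' exact API):

use std::sync::Arc;

/// Safety: implementors must return a slice whose address stays stable for
/// as long as the owner lives, which is why the trait is unsafe.
unsafe trait ByteOwner: Send + Sync + 'static {
    fn as_bytes(&self) -> &[u8];
}

// Bytes keeps the owner alive through the Arc while holding the stable
// pointer/length pair obtained from `as_bytes` once at construction.
struct Bytes {
    data: *const u8,
    len: usize,
    owner: Arc<dyn ByteOwner>,
}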

@somethingelseentirely
Contributor Author

somethingelseentirely commented Sep 18, 2024

I just stumbled upon sux by chance, and their epserde approach is pretty much what we've also decided on, so there's some prior art there.

I've looked at their code, and it's less flexible and more cumbersome than what we have, imho, so it's also nice that we can make some improvements in this space.

From friday on I'll be in Sicily for two weeks to harvest some olives, so I might find the time to push this a bit 😁


@Cydhra
Owner

Cydhra commented Dec 15, 2024

So, since I've been doing some work here anyway, what's the status on this?

I tend to agree with your last approach: we only expose some constructors and add some tags on the structures (guarded by a crate feature, obviously), and leave the details of serialization to downstream crates.

@somethingelseentirely
Contributor Author

somethingelseentirely commented Dec 15, 2024

It was relatively easy to replace the Vecs with read-only zerocopy-bytes-based ones: https://github.com/triblespace/vers/tree/zerocopy

I just never got around to writing the de/serialization example itself (the epsilon part of the epsilon-serialization), because integrating vers into my storage layer was a bit harder than anticipated, and then other things came up at work.

But the tests pass.

The only things missing are constructors that allow creation from the byte types; I'll try to add those.
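Something along these lines (a hypothetical signature, mirroring the PackedSlice-based struct above):

impl RsVec {
    /// Hypothetical zero-copy constructor: the caller supplies parts that
    /// were, for example, sliced out of an mmaped file.
    pub fn from_raw_parts(
        data: PackedSlice<u64>,
        blocks: PackedSlice<BlockDescriptor>,
        super_blocks: PackedSlice<SuperBlockDescriptor>,
        select_blocks: PackedSlice<SelectSuperBlockDescriptor>,
        len: usize,
        rank0: usize,
        rank1: usize,
    ) -> Self {
        RsVec { data, blocks, super_blocks, select_blocks, len, rank0, rank1 }
    }
}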
