Implement dense bitset strategy for a compact csr representation #3

daniel-j-h · 2023-01-13T19:26:33Z

With a graph

(0 -> 1)
(0 -> 2)
(1 -> 0)
(1 -> 2)
(2 -> 1)

the TinyGraphIO interchange file format stores the compressed sparse row representation

offsets: [0, 2, 4, 5]
targets: [1, 2, 0, 2, 1]

where edges(v) = targets[offsets[v]] - targets[offsets[v + 1]]

Varints of Deltas

For a compact representation we use the following techniques

The offsets array is monotonically increasing. We use delta encoding to transform the offsets array into small positive integers.
We store the offsets deltas as varints so that small deltas can be represented in as little space as a single byte.
We store the targets array as varints so that small targets can be represented in as little space as a single byte.

With delta encoded varints for offsets and varints for targets, the compressed sparse row graph can be compactly represented as

offsets = [0x0, 0x2, 0x2, 0x1]  # four bytes total
targets = [0x1, 0x2, 0x0, 0x2, 0x1]  # five bytes total

Dense Bitsets

An alternative is representing offsets as a dense bitset indicating when a new sub-range in targets starts

offsets = [1, 0, 1, 0, 1]  # 5 bits total, stored as 1 byte
targets = [1, 2, 0, 2, 1]  # 5 bytes total

This alternative representation is more space efficient for graphs with a relatively low number of edges compared to number of nodes.

Open Questions

Should we support both approaches and leave it to the user to decide? Or can we be smart and select the approach based on what the user graph looks like?

The text was updated successfully, but these errors were encountered:

daniel-j-h · 2024-06-13T21:26:30Z

With graph chunks #7 we could have a tag for dense/sparse encoding and take that decision on a graph chunk by chunk level.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement dense bitset strategy for a compact csr representation #3

Implement dense bitset strategy for a compact csr representation #3

daniel-j-h commented Jan 13, 2023

daniel-j-h commented Jun 13, 2024

Implement dense bitset strategy for a compact csr representation #3

Implement dense bitset strategy for a compact csr representation #3

Comments

daniel-j-h commented Jan 13, 2023

Varints of Deltas

Dense Bitsets

Open Questions

daniel-j-h commented Jun 13, 2024