Come up with a strategy for graph, node, edge properties #4

daniel-j-h · 2023-01-13T20:06:15Z

At the moment we compactly store a compressed sparse row graph; we do not store any global graph, node, or edge properties with it. The thinking was that we want to get an MVP out asap, and we don't know yet what an interface for these properties should look like or whether we should store properties in the graph format at all.

Use cases for properties include e.g. graph embeddings, node embeddings, edge embeddings, where we need to store fixed size tensors per graph, node, or edge, respectively.
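
To make the "fixed size tensors per node" case concrete, here is a minimal sketch of one possible layout: all node embeddings in a single flat row-major float32 buffer of shape [num_nodes, dim]. The layout and the helper name `node_embedding` are assumptions for illustration, not something we have decided on.

```python
from array import array


def node_embedding(values: array, num_nodes: int, dim: int, node: int) -> array:
    """Return the embedding of `node` from a flat row-major buffer.

    Hypothetical helper: assumes `values` holds num_nodes * dim floats,
    with the embedding of node i stored at [i * dim, (i + 1) * dim).
    """
    if len(values) != num_nodes * dim:
        raise ValueError("buffer length does not match num_nodes * dim")
    if not 0 <= node < num_nodes:
        raise IndexError("node id out of range")
    start = node * dim
    return values[start:start + dim]


# Three nodes with 2-dimensional float32 embeddings.
embeddings = array("f", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
second = node_embedding(embeddings, num_nodes=3, dim=2, node=1)
```

Edge properties would work the same way with the edge index in place of the node id, and a graph-level property is the degenerate case of a single row.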

Two tasks here:

decide if we should store properties with the graph, and which ones (e.g. only int/float tensors?)
come up with an interface for it and how we store these properties

daniel-j-h · 2023-01-15T18:48:31Z

If you scroll down on the page linked below, there's a good starting point for tensors in protobuf

https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html

copying the example for float32 tensors

 // A sparse or dense rank-R tensor that stores data as floats (float32).
 message Float32Tensor   {
     // Each value in the vector. If keys is empty, this is treated as a
     // dense vector.
     repeated float values = 1 [packed = true];

     // If keys is not empty, the vector is treated as sparse, with
     // each key specifying the location of the value in the sparse vector.
     repeated uint64 keys = 2 [packed = true];

     // An optional shape that allows the vector to represent a matrix.
     // For example, if shape = [ 10, 20 ], floor(keys[i] / 20) gives the row,
     // and keys[i] % 20 gives the column.
     // This also supports n-dimensional tensors.
     // Note: If the tensor is sparse, you must specify this value.
     repeated uint64 shape = 3 [packed = true];
 }
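
As a rough sketch of how a reader could interpret the `values` / `keys` / `shape` fields described in the comments above: keys are flat row-major offsets, so for shape = [10, 20] the row is floor(key / 20) and the column is key % 20. The function names `unravel_key` and `to_dense` are made up for illustration, and filling missing entries with 0.0 is an assumption; this is plain Python and does not use any protobuf-generated classes.

```python
from typing import List, Sequence, Tuple


def unravel_key(key: int, shape: Sequence[int]) -> Tuple[int, ...]:
    """Convert a flat row-major key into one index per dimension."""
    indices = []
    for dim in reversed(shape):
        indices.append(key % dim)
        key //= dim
    if key != 0:
        raise ValueError("key is out of range for the given shape")
    return tuple(reversed(indices))


def to_dense(values: Sequence[float], keys: Sequence[int],
             shape: Sequence[int]) -> List[float]:
    """Expand (values, keys, shape) into a flat row-major list."""
    size = 1
    for dim in shape:
        size *= dim
    if not keys:
        # Empty keys: the tensor is already dense.
        if shape and len(values) != size:
            raise ValueError("dense values do not match shape")
        return list(values)
    if not shape:
        raise ValueError("sparse tensors must specify a shape")
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    dense = [0.0] * size  # assumption: absent entries are zero
    for key, value in zip(keys, values):
        if key >= size:
            raise ValueError("key is out of range for the given shape")
        dense[key] = value
    return dense


row, col = unravel_key(45, [10, 20])  # row 2, column 5
```

The same unravelling works for n-dimensional shapes, since it just peels off one dimension at a time from the innermost one outwards.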

daniel-j-h · 2023-01-15T18:50:29Z

Note that we might need boolean properties e.g. as in

here are all forward/reverse edges
here are all bidirectional edges
here are all train/validate nodes/edges

and for that we should think about storing a dense bitset (bytes) with n bits for n nodes/edges.

daniel-j-h mentioned this issue Jan 15, 2023

Support storing multiple graphs per file #7

Open
