Come up with a strategy for graph, node, edge properties #4

Open
2 tasks
daniel-j-h opened this issue Jan 13, 2023 · 2 comments

@daniel-j-h
Member

At the moment we compactly store a compressed sparse row graph; we do not store any global graph, node, or edge properties with it. The thinking was that we want to get an MVP out as soon as possible, and we don't know yet what an interface for these properties should look like, or whether we should store properties in the graph format at all.

Use cases for properties include graph embeddings, node embeddings, and edge embeddings, where we need to store fixed-size tensors per graph, node, or edge, respectively.

Two tasks here

  • decide if we should store properties with the graph, and which ones (e.g. only int/float tensors?)
  • come up with an interface for it and how we store these properties
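
To make the second task more concrete, here is a minimal sketch of what such an interface could look like. It assumes properties are kept as flat arrays next to the compressed sparse row arrays, with one fixed-size row per graph, node, or edge. The names `CsrGraph`, `set_prop`, and `get_prop` are hypothetical and only illustrate the idea; nothing here reflects an existing implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CsrGraph:
    # offsets has num_nodes + 1 entries; the out-edges of node v are
    # targets[offsets[v]:offsets[v + 1]], so an edge id is an index into targets.
    offsets: list
    targets: list
    # Hypothetical property storage: (kind, name) -> (width, flat values).
    props: dict = field(default_factory=dict)

    def count(self, kind):
        # Number of property rows for each kind of property.
        if kind == "graph":
            return 1
        if kind == "node":
            return len(self.offsets) - 1
        if kind == "edge":
            return len(self.targets)
        raise ValueError(f"unknown property kind: {kind}")

    def set_prop(self, kind, name, width, values):
        # Store one fixed-size tensor of `width` values per graph, node, or edge.
        if len(values) != self.count(kind) * width:
            raise ValueError("values must hold count(kind) * width entries")
        self.props[(kind, name)] = (width, list(values))

    def get_prop(self, kind, name, i=0):
        # Return the tensor attached to the i-th node or edge (i = 0 for graph).
        width, values = self.props[(kind, name)]
        if not 0 <= i < self.count(kind):
            raise IndexError(i)
        return values[i * width:(i + 1) * width]
```

Because node and edge ids are already dense indices in a compressed sparse row graph, a lookup is a single slice and no per-item keys need to be stored.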
@daniel-j-h
Member Author

If you scroll down on the page linked below, there is a good starting point for representing tensors in protobuf:

https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html

Copying the example for float32 tensors:

 // A sparse or dense rank-R tensor that stores data as 32-bit floats (float32).
 message Float32Tensor {
     // Each value in the vector. If keys is empty, this is treated as a
     // dense vector.
     repeated float values = 1 [packed = true];

     // If keys is not empty, the vector is treated as sparse, with
     // each key specifying the location of the value in the sparse vector.
     repeated uint64 keys = 2 [packed = true];

     // An optional shape that allows the vector to represent a matrix.
     // For example, if shape = [ 10, 20 ], floor(keys[i] / 20) gives the row,
     // and keys[i] % 20 gives the column.
     // This also supports n-dimensional tensors.
     // Note: If the tensor is sparse, you must specify this value.
     repeated uint64 shape = 3 [packed = true];
 }
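
As a rough illustration of how the `keys` and `shape` fields in the message above fit together, the following sketch decodes a tensor into (index, value) pairs. It assumes keys are flat row-major offsets, which matches the `floor(keys[i] / 20)` and `keys[i] % 20` example in the comments. The helper names `unravel` and `entries` are hypothetical and not part of any protobuf or SageMaker API.

```python
def unravel(key, shape):
    # Convert a flat row-major key into one index per dimension.
    index = []
    for dim in reversed(shape):
        index.append(key % dim)
        key //= dim
    if key != 0:
        raise ValueError("key is out of range for shape")
    return tuple(reversed(index))


def entries(values, keys, shape):
    # Return a list of (index tuple, value) pairs for a dense or sparse tensor.
    if keys:
        # Sparse case: the message comments require a shape here.
        if not shape:
            raise ValueError("sparse tensors must specify a shape")
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
    else:
        # Dense case: every position is present, in row-major order.
        keys = range(len(values))
        if not shape:
            # Assumption: without a shape, treat the data as a plain vector.
            shape = [len(values)]
    return [(unravel(k, shape), v) for k, v in zip(keys, values)]


# With shape = [10, 20], key 45 lands in row 2, column 5.
print(entries([1.5], [45], [10, 20]))
```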

@daniel-j-h
Member Author

Note that we might need boolean properties e.g. as in

  • here are all forward/reverse edges
  • here are all bidirectional edges
  • here are all train/validate nodes/edges

For that, we should think about storing a dense bitset (bytes) with n bits for n nodes/edges.
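
A minimal sketch of such a dense bitset, assuming bit i lives in byte i // 8 with the least significant bit first. The bit order is an assumption that a real format would have to pin down, and the class name `DenseBitset` is hypothetical.

```python
class DenseBitset:
    # One bit per node or edge, packed into ceil(n / 8) bytes.

    def __init__(self, n, raw=None):
        size = (n + 7) // 8
        if raw is None:
            raw = bytes(size)
        if len(raw) != size:
            raise ValueError("raw must hold exactly ceil(n / 8) bytes")
        self.n = n
        self.data = bytearray(raw)

    def _check(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)

    def set(self, i, value=True):
        # Set or clear the bit for item i.
        self._check(i)
        mask = 1 << (i & 7)
        if value:
            self.data[i >> 3] |= mask
        else:
            self.data[i >> 3] &= ~mask & 0xFF

    def get(self, i):
        # Return True if the bit for item i is set.
        self._check(i)
        return bool(self.data[i >> 3] & (1 << (i & 7)))

    def to_bytes(self):
        # Serialized form, e.g. for a protobuf `bytes` field.
        return bytes(self.data)
```

A mask such as "these edges are bidirectional" or "these nodes are in the train split" then costs one bit per item instead of a full tensor entry.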
