🚀 The feature, motivation and pitch
GPU-based neighborhood sampling can accelerate mini-batch creation for graphs that fit into GPU memory.
Currently, the (solely) CPU-based sampling interface inside PyG looks as follows:
The PyG routine expects:

- `(colptr, row)`: CSC/CSR representation of the graph
- `input_node`: The seed nodes for which to sample neighbors
- `num_neighbors`: A list of neighbors to sample in each layer
- `replace`: Sample without or with replacement
- `directed`: Whether sampled edges are directed or not. If not, we extract the full subgraph of sampled nodes.

It returns (re-labeled) `row` and `col` vectors of the sampled subgraph/adjacency matrix, as well as `output_node_id` and `output_edge_id` of the sampled nodes/edges to perform feature fetching in a later stage.

On the other side, the sampling interface inside `cugraph` looks as follows:

The major difference seems to be that `cugraph` performs sampling for 1-hop, while PyG supports multi-hop sampling (which can be fixed easily by just calling the `cugraph` routine multiple times) [to be confirmed by @pyg-team/nvidia-team].

Roadmap
For integrating GPU-based sampling inside PyG, we thus need to:

- Integrate `torch-sparse` neighborhood sampling interface in `pyg-lib`
- Add `cugraph` as a dependency inside `pyg-lib`
- Call the `cugraph`-based sampling routine inside the GPU-based dispatcher
- Integrate changes on PyG side
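
To make the idea of building multi-hop sampling out of a 1-hop routine concrete, here is a minimal CPU-only sketch in plain Python. It is not the PyG or `cugraph` implementation: the function names `sample_one_hop` and `sample_multi_hop` are hypothetical, the 1-hop function merely stands in for whatever 1-hop routine the backend offers, and only the `directed=True` behavior (keep just the sampled edges) is covered. The inputs and outputs mirror the description above: a CSC pair `(colptr, row)`, seed nodes, a per-layer fan-out list, and re-labeled `row`/`col` plus the global node and edge ids.

```python
import random


def sample_one_hop(colptr, row, nodes, k, replace=False, rng=random):
    """Hypothetical 1-hop sampler: pick up to k incoming edges per node.

    Returns a list of (src, dst, edge_id) triples using global ids.
    """
    sampled = []
    for dst in nodes:
        # In CSC, the incoming edges of `dst` are the slice colptr[dst]:colptr[dst+1].
        edge_ids = list(range(colptr[dst], colptr[dst + 1]))
        if not edge_ids:
            continue
        if replace:
            picked = [rng.choice(edge_ids) for _ in range(k)]
        else:
            picked = rng.sample(edge_ids, min(k, len(edge_ids)))
        sampled.extend((row[e], dst, e) for e in picked)
    return sampled


def sample_multi_hop(colptr, row, input_node, num_neighbors, replace=False, seed=None):
    """Multi-hop sampling by calling the 1-hop routine once per layer."""
    rng = random.Random(seed)
    # Seeds come first in the output node list; local id = position in the list.
    output_node_id = list(dict.fromkeys(input_node))
    local = {n: i for i, n in enumerate(output_node_id)}
    out_row, out_col, output_edge_id = [], [], []

    frontier = output_node_id
    for k in num_neighbors:
        next_frontier = []
        for src, dst, e in sample_one_hop(colptr, row, frontier, k, replace, rng):
            if src not in local:
                # First time this node is seen: assign the next local id.
                local[src] = len(output_node_id)
                output_node_id.append(src)
                next_frontier.append(src)
            out_row.append(local[src])
            out_col.append(local[dst])
            output_edge_id.append(e)
        # Only newly discovered nodes are expanded in the next layer.
        frontier = next_frontier
    return out_row, out_col, output_node_id, output_edge_id


# Toy graph with edges 1->0, 2->0, 3->1, 3->2, stored in CSC form.
colptr = [0, 2, 3, 4, 4]
row = [1, 2, 3, 3]
out_row, out_col, node_id, edge_id = sample_multi_hop(
    colptr, row, input_node=[0], num_neighbors=[2, 1], seed=0
)
```

`node_id` and `edge_id` map the re-labeled result back to the original graph, which is what a later feature-fetching stage would index with. Whether repeated 1-hop calls match PyG's multi-hop semantics exactly on the GPU side is the open question flagged above.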