Efficient GPU Training
======================

To train deep detectors efficiently, we usually use
`CUDA <https://developer.nvidia.com/cuda-toolkit>`_ to accelerate
detector training on a GPU. PyGOD provides the ``gpu`` parameter for
``DeepDetector``. During initialization, we can set ``gpu`` to the index
of an available GPU. By default, ``gpu=-1``, which means the detector is
trained on the CPU. Here is an example of initializing ``DOMINANT`` with
the first GPU (index ``0``):

.. code:: python

    DOMINANT(gpu=0)
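
If it is not known in advance whether a GPU is available, the index can
also be chosen at runtime. The snippet below is only a minimal sketch:
the ``pygod.detector`` import path is assumed for PyGOD 1.x, and the
fallback value ``-1`` matches the CPU default described above.

.. code:: python

    import torch
    from pygod.detector import DOMINANT  # import path assumed for PyGOD 1.x

    # Use the first GPU when CUDA is available, otherwise fall back to CPU.
    gpu = 0 if torch.cuda.is_available() else -1
    detector = DOMINANT(gpu=gpu)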

However, training deep detectors on large-scale graphs can be
memory-intensive, especially for detectors relying on adjacency matrix
reconstruction. In this case, full-batch training may result in an
out-of-memory (OOM) error. As such, we divide the large graph into
minibatches and train the detector on each batch. PyGOD provides the
``batch_size`` parameter for ``DeepDetector``, so users can adjust the
size of each batch to fit their GPU memory. We recommend setting
``batch_size`` to the largest value that does not cause OOM. For
instance, to train ``DOMINANT`` with batches of 64 nodes:

.. code:: python

    DOMINANT(gpu=0, batch_size=64)

Unlike other data modalities, the output for each node in a graph
depends on its neighbors. In the PyGOD implementation, we adopt PyG's
data loader ``torch_geometric.loader.NeighborLoader`` to load both the
center nodes and their neighbors for each minibatch. However, computing
on all neighbor nodes introduces significant overhead and reduces the
efficiency of detector training, so neighbor sampling is crucial to
reduce this cost. PyGOD provides the ``num_neigh`` parameter for
``DeepDetector`` to specify how many neighbors are sampled at each layer
of the detector. The default value of ``num_neigh`` is ``-1``, which
samples all neighbors of the center nodes. To sample 5 neighbors at each
layer, we can initialize ``DOMINANT`` like:

.. code:: python

    DOMINANT(gpu=0, batch_size=64, num_neigh=5)

We can also sample a different number of neighbors at each layer by
setting ``num_neigh`` to a list, whose length has to match the number of
layers ``num_layers``:

.. code:: python

    DOMINANT(gpu=0, batch_size=64, num_layers=2, num_neigh=[5, 3])

To learn more, read PyG's tutorial on
`Scaling GNNs via Neighbor Sampling <https://pytorch-geometric.readthedocs.io/en/latest/tutorial/neighbor_loader.html>`_.
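
Putting these options together, the sketch below trains ``DOMINANT`` end
to end on a single graph. It is only an illustration: the
``pygod.detector`` import path, the ``fit``/``predict`` interface, and
the ``data`` object (a ``torch_geometric.data.Data`` graph with node
features and edges) are assumptions here; consult the PyGOD
documentation for the exact detector API.

.. code:: python

    import torch
    from pygod.detector import DOMINANT  # import path assumed for PyGOD 1.x

    # ``data`` is assumed to be a torch_geometric.data.Data object with
    # node features ``data.x`` and edges ``data.edge_index``.
    gpu = 0 if torch.cuda.is_available() else -1

    # Minibatches of 64 nodes; sample 5 neighbors at the first layer and
    # 3 at the second to keep GPU memory usage bounded.
    detector = DOMINANT(gpu=gpu, batch_size=64, num_layers=2, num_neigh=[5, 3])
    detector.fit(data)

    # Predicted binary outlier labels for each node (assuming the
    # standard fit/predict interface of PyGOD detectors).
    pred = detector.predict(data)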