
Add some documentation on Multi-GPU support #2941

Merged: 2 commits into BVLC:master on Sep 14, 2015

Conversation

thatguymike (Contributor)

Add initial documentation on Multi-GPU support.

raingo commented Aug 18, 2015

How do I check whether "P2P DMA access" is available? Using nvidia-smi?

thatguymike (Contributor, Author)

'nvidia-smi topo -m' will show you the connectivity matrix. You should be able to do P2P on most single-socket systems. It's multi-socket systems where you will run into problems, as P2P DMAs can't go across QPI on Intel systems.
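
For anyone who wants to check this programmatically rather than via nvidia-smi, here is a minimal sketch (not from this PR; the file name check_p2p.cu is just for illustration) that asks the CUDA runtime the same question with cudaDeviceCanAccessPeer:

    // check_p2p.cu -- hypothetical helper, not part of Caffe.
    // Prints whether the CUDA runtime reports peer (P2P) access
    // between every pair of visible GPUs.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      int count = 0;
      cudaGetDeviceCount(&count);
      for (int i = 0; i < count; ++i) {
        for (int j = 0; j < count; ++j) {
          if (i == j) continue;
          int can_access = 0;
          // Sets can_access to 1 if device i can directly address
          // memory on device j, 0 otherwise.
          cudaDeviceCanAccessPeer(&can_access, i, j);
          std::printf("GPU %d -> GPU %d: P2P %s\n", i, j,
                      can_access ? "available" : "not available");
        }
      }
      return 0;
    }

Compiled with something like nvcc check_p2p.cu -o check_p2p, its output should roughly mirror which device pairs nvidia-smi topo -m shows as reachable through PCIe rather than a socket-level link.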

raingo commented Aug 18, 2015

The output legend has these symbols:

  X   = Self
  SOC = Path traverses a socket-level link (e.g. QPI)
  PHB = Path traverses a PCIe host bridge
  PXB = Path traverses multiple PCIe internal switches
  PIX = Path traverses a PCIe internal switch

which ones are good for P2P?

thatguymike (Contributor, Author)

Those that start with P should be okay in the case above (352 drivers). Basically, you want things connected via PCIe bridges or switches. Do note that each bridge can add latency and reduce bandwidth. There is no general tool available to show you all the connections in a usable way; lspci -tv gets pretty close to showing the PCIe bridge topology.

For example, from a DIGITS DevBox:

        +-02.0-[07-0a]----00.0-[08-0a]--+-08.0-[0a]--+-00.0  NVIDIA Corporation Device 17c2
        |                               |            \-00.1  NVIDIA Corporation Device 0fb0
        |                               \-10.0-[09]--+-00.0  NVIDIA Corporation Device 17c2
        |                                            \-00.1  NVIDIA Corporation Device 0fb0
        +-03.0-[03-06]----00.0-[04-06]--+-08.0-[06]--+-00.0  NVIDIA Corporation Device 17c2
        |                               |            \-00.1  NVIDIA Corporation Device 0fb0
        |                               \-10.0-[05]--+-00.0  NVIDIA Corporation Device 17c2
        |                                            \-00.1  NVIDIA Corporation Device 0fb0

ID 17c2 is the compute device in my case (the 0fb0 entries are the HDMI audio). It shows there is a PLX bridge connecting two TitanXs to an x16 link to the CPU.
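
To correlate a tree like that with CUDA device ordinals, a small sketch along these lines (not from the PR; list_gpu_pci.cu is a made-up name) prints each device's PCI bus ID via the CUDA runtime, which can then be matched against the [0a], [09], [06], [05] bus numbers above:

    // list_gpu_pci.cu -- hypothetical helper for matching CUDA device
    // ordinals to the PCIe addresses shown by `lspci -tv`.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      int count = 0;
      cudaGetDeviceCount(&count);
      for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        char bus_id[32] = {0};
        // Formats the PCI bus ID of the device, e.g. "0000:0a:00.0".
        cudaDeviceGetPCIBusId(bus_id, (int)sizeof(bus_id), dev);
        std::printf("CUDA device %d: %s at %s\n", dev, prop.name, bus_id);
      }
      return 0;
    }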


The current implementation has a "soft" assumption that the devices being used are homogeneous. In practice, any devices of the same general class should work together, but performance and total size are limited by the smallest device being used. For example, if you combine a TitanX and a GTX 980, performance will be limited by the 980. Mixing vastly different levels of boards, e.g. Kepler and Fermi, is not supported.
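
To make that homogeneity note concrete, here is a rough pre-flight sketch (an illustration, not part of the documentation or of Caffe) that warns when the selected GPUs differ in compute capability or total memory; the device list {0, 1} is only an example:

    // check_homogeneous.cu -- hypothetical pre-flight check, not Caffe code.
    // Warns when the GPUs chosen for training differ in compute capability
    // or total memory, since the smallest device limits performance and
    // usable memory.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      const int gpus[] = {0, 1};  // example device IDs to check
      const int n = (int)(sizeof(gpus) / sizeof(gpus[0]));
      cudaDeviceProp first;
      cudaGetDeviceProperties(&first, gpus[0]);
      for (int i = 1; i < n; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, gpus[i]);
        if (prop.major != first.major || prop.minor != first.minor) {
          std::printf("Warning: GPU %d is sm_%d%d but GPU %d is sm_%d%d\n",
                      gpus[i], prop.major, prop.minor,
                      gpus[0], first.major, first.minor);
        }
        if (prop.totalGlobalMem != first.totalGlobalMem) {
          std::printf("Warning: GPU %d has %zu MB vs %zu MB on GPU %d\n",
                      gpus[i], prop.totalGlobalMem >> 20,
                      first.totalGlobalMem >> 20, gpus[0]);
        }
      }
      return 0;
    }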

"nvidia-smi topo -m" will show you the connectivity matrix. You can do P2P through PCIe bridges, but not across socket level links at this time, e.g. across CPU sockets on a multi-socket motherboard.

Contributor (review comment)

Could you please give an example "nvidia-smi topo -m" output and explain how users can understand the basics of their topology? Many Caffe users may be unaware of this, and we might see issues about topology here and there. I believe one or two more lines would be enough.

raingo commented Aug 19, 2015

Thanks. Very helpful!

ronghanghu merged commit 5b3ad4d into BVLC:master on Sep 14, 2015.