
Add some documentation on Multi-GPU support #2941

Merged: 2 commits into BVLC:master on Sep 14, 2015

Conversation

thatguymike (Contributor)

Add initial documentation on Multi-GPU support.

raingo commented Aug 18, 2015

How do I check whether "P2P DMA access" is available? Using nvidia-smi?

thatguymike (Contributor, Author)

'nvidia-smi topo -m' will show you the connectivity matrix. You should be able to do P2P on most single-socket systems. It's multi-socket systems where you will run into problems, as P2P DMAs can't go across QPI on Intel systems.
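
For anyone who wants to check this programmatically rather than via nvidia-smi, here is a minimal sketch (not from this PR; the file name check_p2p.cu is just for illustration) that asks the CUDA runtime the same question with cudaDeviceCanAccessPeer:

    // check_p2p.cu -- hypothetical helper, not part of Caffe.
    // Prints whether the CUDA runtime reports peer (P2P) access
    // between every pair of visible GPUs.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      int count = 0;
      cudaGetDeviceCount(&count);
      for (int i = 0; i < count; ++i) {
        for (int j = 0; j < count; ++j) {
          if (i == j) continue;
          int can_access = 0;
          // Sets can_access to 1 if device i can directly address
          // memory on device j, 0 otherwise.
          cudaDeviceCanAccessPeer(&can_access, i, j);
          std::printf("GPU %d -> GPU %d: P2P %s\n", i, j,
                      can_access ? "available" : "not available");
        }
      }
      return 0;
    }

Compiled with something like nvcc check_p2p.cu -o check_p2p, its output should roughly mirror which device pairs nvidia-smi topo -m shows as reachable through PCIe rather than a socket-level link.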

raingo commented Aug 18, 2015

The output legend has these symbols:

  X   = Self
  SOC = Path traverses a socket-level link (e.g. QPI)
  PHB = Path traverses a PCIe host bridge
  PXB = Path traverses multiple PCIe internal switches
  PIX = Path traverses a PCIe internal switch

which ones are good for P2P?

thatguymike (Contributor, Author)

Those that start with P should be okay in the case above (352 drivers). Basically, you want things connected via PCIe bridges or switches. Do note that each bridge can add latency and reduce bandwidth. There is no general tool available to show you all the connections in a usable way; lspci -tv gets pretty close to showing the PCIe bridge topology.

For example, from a DIGITS DevBox:

        +-02.0-[07-0a]----00.0-[08-0a]--+-08.0-[0a]--+-00.0  NVIDIA Corporation Device 17c2
        |                               |            \-00.1  NVIDIA Corporation Device 0fb0
        |                               \-10.0-[09]--+-00.0  NVIDIA Corporation Device 17c2
        |                                            \-00.1  NVIDIA Corporation Device 0fb0
        +-03.0-[03-06]----00.0-[04-06]--+-08.0-[06]--+-00.0  NVIDIA Corporation Device 17c2
        |                               |            \-00.1  NVIDIA Corporation Device 0fb0
        |                               \-10.0-[05]--+-00.0  NVIDIA Corporation Device 17c2
        |                                            \-00.1  NVIDIA Corporation Device 0fb0

ID 17c2 is the compute device in my case (the 0fb0 entries are the HDMI audio). It shows there is a PLX bridge connecting two TitanXs to an x16 link to the CPU.
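
To correlate a tree like that with CUDA device ordinals, a small sketch along these lines (not from the PR; list_gpu_pci.cu is a made-up name) prints each device's PCI bus ID via the CUDA runtime, which can then be matched against the [0a], [09], [06], [05] bus numbers above:

    // list_gpu_pci.cu -- hypothetical helper for matching CUDA device
    // ordinals to the PCIe addresses shown by `lspci -tv`.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      int count = 0;
      cudaGetDeviceCount(&count);
      for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        char bus_id[32] = {0};
        // Formats the PCI bus ID of the device, e.g. "0000:0a:00.0".
        cudaDeviceGetPCIBusId(bus_id, (int)sizeof(bus_id), dev);
        std::printf("CUDA device %d: %s at %s\n", dev, prop.name, bus_id);
      }
      return 0;
    }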


The current implementation has a "soft" assumption that the devices being used are homogeneous. In practice, any devices of the same general class should work together, but performance and total size are limited by the smallest device being used. For example, if you combine a TitanX and a GTX 980, performance will be limited by the 980. Mixing vastly different levels of boards, e.g. Kepler and Fermi, is not supported.
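
To make that homogeneity note concrete, here is a rough pre-flight sketch (an illustration, not part of the documentation or of Caffe) that warns when the selected GPUs differ in compute capability or total memory; the device list {0, 1} is only an example:

    // check_homogeneous.cu -- hypothetical pre-flight check, not Caffe code.
    // Warns when the GPUs chosen for training differ in compute capability
    // or total memory, since the smallest device limits performance and
    // usable memory.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
      const int gpus[] = {0, 1};  // example device IDs to check
      const int n = (int)(sizeof(gpus) / sizeof(gpus[0]));
      cudaDeviceProp first;
      cudaGetDeviceProperties(&first, gpus[0]);
      for (int i = 1; i < n; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, gpus[i]);
        if (prop.major != first.major || prop.minor != first.minor) {
          std::printf("Warning: GPU %d is sm_%d%d but GPU %d is sm_%d%d\n",
                      gpus[i], prop.major, prop.minor,
                      gpus[0], first.major, first.minor);
        }
        if (prop.totalGlobalMem != first.totalGlobalMem) {
          std::printf("Warning: GPU %d has %zu MB vs %zu MB on GPU %d\n",
                      gpus[i], prop.totalGlobalMem >> 20,
                      first.totalGlobalMem >> 20, gpus[0]);
        }
      }
      return 0;
    }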

"nvidia-smi topo -m" will show you the connectivity matrix. You can do P2P through PCIe bridges, but not across socket level links at this time, e.g. across CPU sockets on a multi-socket motherboard.

Contributor (review comment)

Could you please give an example "nvidia-smi topo -m" output and explain how users can understand the basics of their topology? Many Caffe users may be unaware of this, and we might see issues about topology here and there. I believe one or two more lines would be enough.

raingo commented Aug 19, 2015

Thanks. Very helpful!

ronghanghu merged commit 5b3ad4d into BVLC:master on Sep 14, 2015.