Add some documentation on Multi-GPU support #2941
How can I check whether "P2P DMA access" is available? Using nvidia-smi?
'nvidia-smi topo -m' will show you the connectivity matrix. You should be able to do P2P on most single-socket systems. It's on multi-socket systems that you will run into problems, as P2P DMAs can't go across QPI on Intel systems.
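For anyone who wants to script this check, here is a minimal sketch that shells out to the same 'nvidia-smi topo -m' command and returns its text. The helper name `read_topology` and its `command` parameter are made up for illustration; they are not part of Caffe or of any NVIDIA tool.

```python
import shutil
import subprocess


def read_topology(command="nvidia-smi"):
    """Return the text printed by `<command> topo -m`, or None if unavailable.

    `read_topology` and `command` are illustrative names only.
    """
    if shutil.which(command) is None:
        # The tool is not installed or not on PATH.
        return None
    result = subprocess.run(
        [command, "topo", "-m"],
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        check=False,          # inspect the return code ourselves
    )
    if result.returncode != 0:
        return None
    return result.stdout
```

Calling `read_topology()` on a machine with the NVIDIA driver installed should give back the connectivity matrix as a string; on a machine without it, the function returns `None` rather than raising.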
There are four symbols:
which ones are good for P2P?
Those that start with P should be okay in the case above (352 drivers). Basically you want things connected via PCIe bridges or switches. Note that each bridge can increase latency and reduce bandwidth. There is no general tool that shows all the connections in a usable way, but lspci -tv comes pretty close for the PCIe bridge topology. For example, from a DIGITS DevBox:
Devices with ID 17c2 are the compute devices in my case (the other is the HDMI audio). The output shows a PLX bridge connecting 2 TitanX's to an x16 link to the CPU.
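The "starts with P" rule of thumb above can be sketched as a small parser. This is only an illustration: the function name `p2p_pairs` is hypothetical, the sample matrix is made up rather than taken from a real machine, and the PIX and SOC codes in it are assumed to follow the 352-era naming. It also assumes the first non-empty line is the header row.

```python
def p2p_pairs(topo_text):
    """Map each GPU pair to True when its link code starts with 'P'.

    Applies the rule of thumb from this thread; it does not query the driver.
    """
    lines = [ln.split() for ln in topo_text.splitlines() if ln.strip()]
    # Header row names the columns; keep only the GPU columns.
    header = [tok for tok in lines[0] if tok.startswith("GPU")]
    pairs = {}
    for row in lines[1:]:
        if not row[0].startswith("GPU"):
            continue  # skip legend lines and non-GPU rows
        src = row[0]
        for dst, code in zip(header, row[1:]):
            if src < dst:  # record each unordered pair once, skip self
                pairs[(src, dst)] = code.startswith("P")
    return pairs


# Made-up matrix for illustration only (not real output).
SAMPLE = """\
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      PIX     SOC     0-5
GPU1    PIX      X      SOC     0-5
GPU2    SOC     SOC      X      6-11
"""

for pair, ok in sorted(p2p_pairs(SAMPLE).items()):
    print(pair, "P2P likely OK" if ok else "P2P unlikely")
```

In this invented layout only the GPU0/GPU1 pair is flagged as P2P-capable, since the other pairs cross the socket-level link.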
The current implementation has a "soft" assumption that the devices being used are homogeneous. In practice, any devices of the same general class should work together, but performance and total size are limited by the smallest device being used, e.g. if you combine a TitanX and a GTX980, performance will be limited by the 980. Mixing vastly different levels of boards, e.g. Kepler and Fermi, is not supported.
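To make the "limited by the smallest device" point concrete for the size part, here is a rough sketch. The helper `limiting_device` is a hypothetical name, and the memory figures are the published specs for those boards (12 GB for the Maxwell TitanX, 4 GB for the GTX980), not values read from Caffe. It says nothing about the performance side.

```python
def limiting_device(memory_gb):
    """Return (name, size_gb) of the smallest device in a {name: GB} mapping."""
    name = min(memory_gb, key=memory_gb.get)
    return name, memory_gb[name]


# Published memory sizes, used here only as an example.
devices = {"TitanX": 12, "GTX980": 4}

name, size = limiting_device(devices)
print(f"Per-device budget is set by {name}: {size} GB")
```

With this mix, anything that must fit on every device has to fit in the 980's 4 GB, regardless of the extra memory on the TitanX.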
"nvidia-smi topo -m" will show you the connectivity matrix. You can do P2P through PCIe bridges, but not across socket level links at this time, e.g. across CPU sockets on a multi-socket motherboard.
Could you please give an example "nvidia-smi topo -m" output and explain how users can understand the basics of their topology? Many Caffe users may be unaware of this, and we might see issues here and there about topology. I believe one or two more lines would be enough.
Thanks. Very helpful!
Add some documentation on Multi-GPU support
Add initial documentation on Multi-GPU support.