P2P memory access fails when running the multi-GPU CUDA sample (simpleP2P)

I'm trying to troubleshoot an error I get when running the simpleP2P sample program included with the CUDA samples. The error is as follows:

$ ./simpleP2P 
[./simpleP2P] - Starting... 
Checking for multiple GPUs... 
CUDA-capable device count: 2 
> GPU0 = "  Tesla K20c" IS capable of Peer-to-Peer (P2P) 
> GPU1 = "  Tesla K20c" IS capable of Peer-to-Peer (P2P) 

Checking GPU(s) for support of peer to peer memory access... 
> Peer-to-Peer (P2P) access from Tesla K20c (GPU0) -> Tesla K20c (GPU1) : No 
> Peer-to-Peer (P2P) access from Tesla K20c (GPU1) -> Tesla K20c (GPU0) : No 
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P. 
Peer to Peer access is not available between GPU0 <-> GPU1, waiving test.

The devices I'm using are the following:

$ lspci | grep NVIDIA 
03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1) 
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)

The connection information obtained from nvidia-smi:

$ nvidia-smi topo -m 
    GPU0 GPU1 CPU Affinity 
GPU0  X SOC 0-5,12-17 
GPU1 SOC X 6-11,18-23 

Legend: 

    X = Self 
    SOC = Path traverses a socket-level link (e.g. QPI) 
    PHB = Path traverses a PCIe host bridge 
    PXB = Path traverses multiple PCIe internal switches 
    PIX = Path traverses a PCIe internal switch

And finally, more detailed output from the lspci tool:

03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1) 
     Subsystem: NVIDIA Corporation Device 0982 
     Flags: bus master, fast devsel, latency 0, IRQ 11 
     Memory at f9000000 (32-bit, non-prefetchable) 
     Memory at d0000000 (64-bit, prefetchable) 
     Memory at ce000000 (64-bit, prefetchable) 
     Capabilities: <access denied> 
     Kernel driver in use: nvidia 
     Kernel modules: nvidia_346, nouveau, nvidiafb 
... 
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1) 
     Subsystem: NVIDIA Corporation Device 0982 
     Flags: bus master, fast devsel, latency 0, IRQ 11 
     Memory at cc000000 (32-bit, non-prefetchable) 
     Memory at b0000000 (64-bit, prefetchable) 
     Memory at ae000000 (64-bit, prefetchable) 
     Capabilities: <access denied> 
     Kernel driver in use: nvidia 
     Kernel modules: nvidia_346, nouveau, nvidiafb

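Does anyone have information that could help me troubleshoot this, or at least better understand where the problem lies? As always, thanks for reading/helping. - Omar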

2015-11-06 Omar Valerio

When the GPUs are interconnected via a socket-level link (QPI on Intel-based systems):

GPU0  X SOC 0-5,12-17 
GPU1 SOC X 6-11,18-23 
     ^^^

then P2P transactions are not possible between those 2 GPUs.

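GPUs participating in P2P have a number of requirements on them. One of them is that they generally must be on the same PCIE root complex. GPUs that are connected via a socket-level link (e.g. QPI) are on two different "sockets", i.e. 2 different CPUs, and therefore they belong to two different PCIE root complexes.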
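
To make the rule above concrete, here is a minimal sketch that reads a topology matrix in the format printed by `nvidia-smi topo -m` and flags the GPU pairs whose path crosses a socket-level link. The helper names (`parse_topo`, `p2p_ruled_out`) are hypothetical and exist only for this illustration, and the set of link labels is taken solely from the legend quoted in the question; other driver versions may print different labels. It applies the heuristic described in this answer and is not a substitute for the runtime check (`cudaDeviceCanAccessPeer`), which remains the authoritative source.

```python
# Assumption: only "SOC" (socket-level link, e.g. QPI) is treated as
# crossing CPU sockets, per the legend quoted in the question.
CROSS_SOCKET = {"SOC"}


def parse_topo(text):
    """Return {(src, dst): link} from an `nvidia-smi topo -m` style matrix."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # Column headers that name GPUs, e.g. ["GPU0", "GPU1"].
    gpus = [tok for tok in rows[0] if tok.startswith("GPU")]
    links = {}
    for row in rows[1:]:
        if row[0] not in gpus:
            continue  # skip legend lines or anything that is not a matrix row
        for dst, link in zip(gpus, row[1:1 + len(gpus)]):
            if link != "X":  # "X" marks the device itself
                links[(row[0], dst)] = link
    return links


def p2p_ruled_out(link):
    """Heuristic from the answer: a socket-level hop implies different root complexes."""
    return link in CROSS_SOCKET


# The matrix posted in the question.
SAMPLE = """\
    GPU0 GPU1 CPU Affinity
GPU0  X SOC 0-5,12-17
GPU1 SOC X 6-11,18-23
"""

if __name__ == "__main__":
    for (src, dst), link in sorted(parse_topo(SAMPLE).items()):
        if p2p_ruled_out(link):
            verdict = "P2P ruled out"
        else:
            verdict = "P2P not ruled out by topology"
        print(f"{src} -> {dst}: {link} ({verdict})")
```

For the matrix in the question, both directions are labelled SOC, which matches the "No" that simpleP2P reports for each direction. A label other than SOC only means the topology does not rule P2P out; the other requirements still apply.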

2015-11-06 13:30:01
