Sum and gather elements element-wise in MPI

After multiplying a matrix by a vector using a Cartesian topology, I end up with the following processes, listed with their ranks and vectors.

P0 (process with rank 0) = [2, 9]
P1 (process with rank 1) = [2, 3]
P2 (process with rank 2) = [1, 9]
P3 (process with rank 3) = [4, 6]

Now I need to sum the vectors element-wise, separately for the even-rank processes and for the odd-rank processes, like this:

temp1 = [3, 18]
temp2 = [6, 9]

and then gather the results into a separate vector, like this:

result = [3 , 18 , 6 , 9]
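For reference, the expected result can be checked without MPI. The following is a minimal serial sketch, assuming the four vectors are stored row by row in a single array; `parity_sums` is a hypothetical helper introduced here for illustration and is not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Serial reference (hypothetical helper): sum the per-rank vectors
 * element-wise, separately for even and odd ranks, and store the even
 * sums followed by the odd sums.
 * vectors is a row-major nprocs x n array; result must hold 2 * n doubles. */
void parity_sums(const double *vectors, int nprocs, int n, double *result)
{
    assert(vectors != NULL && result != NULL && nprocs > 0 && n > 0);

    for (int i = 0; i < 2 * n; ++i)
        result[i] = 0.0;

    for (int rank = 0; rank < nprocs; ++rank) {
        /* Even ranks accumulate into the first half, odd ranks into the second. */
        double *dest = result + (rank % 2) * n;
        for (int i = 0; i < n; ++i)
            dest[i] += vectors[rank * n + i];
    }
}
```

Called with the four vectors above (`nprocs = 4`, `n = 2`), this fills `result` with the even-rank sums followed by the odd-rank sums, which is the layout the MPI version should reproduce on the root.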

My attempt uses MPI_Reduce followed by MPI_Gather, like this:

// Previous code 
double *temp1, *temp2;
if (myrank % 2 == 0) {
    BOOLEAN flag = Allocate_vector(&temp1, local_m); // function to allocate space for vectors
    MPI_Reduce(local_y, temp1, local_n, MPI_DOUBLE, MPI_SUM, 0, comm);
    MPI_Gather(temp1, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, comm);
    free(temp1);
} else {
    Allocate_vector(&temp2, local_m);
    MPI_Reduce(local_y, temp2, local_n, MPI_DOUBLE, MPI_SUM, 0, comm);
    MPI_Gather(temp2, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, comm);
    free(temp2);
}

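But the answer is not correct. It seems the code sums the elements of all the even and odd processes together, and then it gives a segmentation fault: Wrong_result = [21 15 0 0], along with this error: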

*** Error in `./test': double free or corruption (fasttop): 0x00000000013c7510 ***
*** Error in `./test': double free or corruption (fasttop): 0x0000000001605b60 ***

Source

2017-04-03 Rational Rose

Please see [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) – Arash

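It won't work the way you are trying to do it. To perform a reduction over the elements of a subset of the processes, you have to create a subcommunicator for them. In your case, the odd and the even processes share the same comm, so the operations are not performed over the two separate groups of processes but over the combined group.

You should use MPI_Comm_split to perform the split, perform the reduction in each of the two new subcommunicators, and finally have rank 0 of each subcommunicator (let's call those the leaders) take part in the gather over another subcommunicator that contains only those two: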

// Make sure rank is set accordingly
MPI_Comm_rank(comm, &rank);

// Split even and odd ranks in separate subcommunicators
MPI_Comm subcomm;
MPI_Comm_split(comm, rank % 2, 0, &subcomm);

// Perform the reduction in each separate group
double *temp;
Allocate_vector(&temp, local_n);
MPI_Reduce(local_y, temp, local_n, MPI_DOUBLE, MPI_SUM, 0, subcomm);

// Find out our rank in subcomm
int subrank;
MPI_Comm_rank(subcomm, &subrank);

// At this point, we no longer need subcomm. Free it and reuse the variable.
MPI_Comm_free(&subcomm);

// Separate both group leaders (rank 0) into their own subcommunicator
MPI_Comm_split(comm, subrank == 0 ? 0 : MPI_UNDEFINED, 0, &subcomm);
if (subcomm != MPI_COMM_NULL) {
    MPI_Gather(temp, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, subcomm);
    MPI_Comm_free(&subcomm);
}

// Free resources
free(temp);

The result will be in gResult on rank 0 of the latter subcomm, which happens to be rank 0 of comm because of the way the splits are performed.
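To see why the leaders end up in that order, it may help to simulate the rank assignment of MPI_Comm_split without MPI. The sketch below assumes the usual rule: processes with the same color form one group, ordered by key, with ties broken by their rank in the parent communicator. `split_rank` and `UNDEFINED_COLOR` are hypothetical names used only for this illustration; the real value of MPI_UNDEFINED is implementation-defined.

```c
#include <assert.h>

#define UNDEFINED_COLOR (-1) /* stand-in for MPI_UNDEFINED in this simulation only */

/* Rank that process `me` would receive from a split with the given colors
 * and keys. Returns -1 if `me` passed UNDEFINED_COLOR, which corresponds
 * to receiving MPI_COMM_NULL in real MPI. */
int split_rank(const int *colors, const int *keys, int nprocs, int me)
{
    assert(me >= 0 && me < nprocs);

    if (colors[me] == UNDEFINED_COLOR)
        return -1;

    int new_rank = 0;
    for (int r = 0; r < nprocs; ++r) {
        if (r == me || colors[r] != colors[me])
            continue; /* different group, or ourselves */
        /* Count members ordered before us: smaller key, or same key
         * and smaller rank in the parent communicator. */
        if (keys[r] < keys[me] || (keys[r] == keys[me] && r < me))
            ++new_rank;
    }
    return new_rank;
}
```

With four processes, colors `rank % 2` and all keys 0, ranks 0 and 1 of comm become rank 0 of the even and odd groups respectively. In the second split they become ranks 0 and 1 of the leaders' communicator, so the gather rooted at 0 places the even sums first and the odd sums second.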

Not as simple as expected, I guess, but that is the price of having convenient collective operations in MPI.

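On a side note, in the code shown you are allocating temp1 and temp2 with length local_m, while in all the collective calls the length is specified as local_n. If it happens that local_n > local_m, heap corruption will occur.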
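One way to avoid that kind of mismatch is to size the buffer from the same count that is later passed to the collective call. The definition of Allocate_vector is not shown in the question, so the sketch below uses a hypothetical stand-in, `alloc_doubles`, built on plain calloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for Allocate_vector: allocate a zero-initialised
 * buffer of `count` doubles, where `count` is the same element count that
 * is later passed to MPI_Reduce / MPI_Gather.
 * Returns NULL if count is 0 or if the allocation fails. */
double *alloc_doubles(int count)
{
    assert(count >= 0);

    if (count == 0)
        return NULL;
    return calloc((size_t)count, sizeof(double));
}
```

A caller would then write something like `double *temp = alloc_doubles(local_n);`, check the result for NULL, and pass the same `local_n` to the reduction, so the buffer length and the count cannot drift apart.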

Source

2017-04-03 21:57:21

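Thanks. I used your suggestion. The code ran successfully, but when I ran it again it still gave the answer, along with this error: Fatal error in PMPI_Comm_free: Invalid communicator, error stack: PMPI_Comm_free(143): MPI_Comm_free(comm=0x7ffe22d63200) failed PMPI_Comm_free(93): Null communicator Fatal error in PMPI_Comm_free: Invalid communicator, error stack: PMPI_Comm_free(143): MPI_Comm_free(comm=0x7ffca17bd180) failed PMPI_Comm_free(93).: Null communicator –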

It seems it fails to create the communicator for the first row, because it returns MPI_COMM_NULL –

Ah, of course. My bad. Processes should check whether they are part of 'subcomm' before freeing it at the end. Fixed. –
