
Say I have n processes: how do I get all the ranks in MPI to send a value to rank 0, and then have rank 0 do a blocking receive on all of them?

They each do a calculation and then send the result to rank 0. Here is what I want to happen:

Rank 0 waits until it has the results from all the ranks, and then adds them up.

How do I do this? Also, I want to avoid the following:

For example, with 4 processes P0, P1, P2, P3:

P1 -> P0 
P2 -> P0 
P3 -> P0 

where in the meantime P1 has already finished its next computation, so P1 -> P0 happens again.

I want P0 to do the additions for only the 3 processes of one cycle, and then do the 3 processes of the next cycle.

Can someone suggest an MPI function to do this? I know about MPI_Gather, but I'm not sure whether it blocks.

I thought of this:

#include <mpi.h> 
#include <stdio.h> 
#include <stdlib.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, ii, p_count = 0; 
    int count = 10; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    int * num = malloc((pross-1)*sizeof(int)); 

    if(rank != 0) 
    { 
        MPI_Send(&count, 1, MPI_INT, 0, 1, MPI_COMM_WORLD); 
    } 
    else 
    { 
        MPI_Gather(&count, 1, MPI_INT, num, 1, MPI_INT, 0, MPI_COMM_WORLD); 
        for(ii = 0; ii < pross-1; ii++) { printf("\n NUM %d \n", num[ii]); p_count += num[ii]; } 
    } 
    MPI_Finalize(); 
} 

I get the error:

*** Process received signal *** 
    Signal: Segmentation fault (11) 
    Signal code: Address not mapped (1) 
    Failing at address: (nil) 
    [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11630)[0x7fb3e3bc3630] 
    [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x90925)[0x7fb3e387b925] 
    [ 2] /usr/lib/libopen-pal.so.13(+0x30177)[0x7fb3e3302177] 
    [ 3] /usr/lib/libmpi.so.12(ompi_datatype_sndrcv+0x54c)[0x7fb3e3e1e3ec] 
    [ 4] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_gather_intra_basic_linear+0x143)[0x7fb3d53d9063] 
    [ 5] /usr/lib/libmpi.so.12(PMPI_Gather+0x1ba)[0x7fb3e3e29a3a] 
    [ 6] sosuks(+0xe83)[0x55ee72119e83] 
    [ 7] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fb3e380b3f1] 
    [ 8] sosuks(+0xb5a)[0x55ee72119b5a] 
    *** End of error message *** 

Also, I tried:

#include <mpi.h> 
#include <stdio.h> 
#include <stdlib.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, ii, p_count = 0; 
    int count = 10; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    int * num = malloc((pross-1)*sizeof(int)); 

    if(rank != 0) 
    { 
        MPI_Send(&count, 1, MPI_INT, 0, 1, MPI_COMM_WORLD); 
    } 
    else 
    { 
        MPI_Gather(&count, 1, MPI_INT, num, 1, MPI_INT, 0, MPI_COMM_WORLD); 
        for(ii = 0; ii < pross-1; ii++) { printf("\n NUM %d \n", num[ii]); p_count += num[ii]; } 
    } 
    MPI_Finalize(); 
} 

where I get the error:

*** Process received signal *** 
    Signal: Segmentation fault (11) 
    Signal code: Address not mapped (1) 
    Failing at address: 0x560600000002 
    [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11630)[0x7fefc8c11630] 
    [ 1] mdscisuks(+0xeac)[0x5606c1263eac] 
    [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fefc88593f1] 
    [ 3] mdscisuks(+0xb4a)[0x5606c1263b4a] 
    *** End of error message *** 

For the second attempt, the thing to note is that the sends and receives were successful, but for some reason the root only received 2 messages. The segmentation fault comes from num holding only two elements, and I don't understand why num only gets filled twice.
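As an aside, this behaviour is consistent with how MPI_Gather works: it is a blocking collective, so every rank in the communicator has to call it (a plain MPI_Send on the workers does not pair with it), and the root's receive buffer needs one slot per rank including the root itself, i.e. pross ints rather than pross-1. A minimal sketch of a gather-based version under those assumptions:

#include <mpi.h> 
#include <stdio.h> 
#include <stdlib.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, count = 10, p_count = 0; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    // every rank takes part in the collective; only the root needs a receive buffer 
    int * num = (rank == 0) ? malloc(pross * sizeof(int)) : NULL; 
    MPI_Gather(&count, 1, MPI_INT, num, 1, MPI_INT, 0, MPI_COMM_WORLD); 

    if (rank == 0) 
    { 
        for (int ii = 0; ii < pross; ii++) p_count += num[ii]; 
        printf("p_count = %d\n", p_count); 
        free(num); 
    } 
    MPI_Finalize(); 
} 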

I am calling the code with:

mpiexec -n 6 ./sosuks 

Can someone show me a better/correct way to implement my idea?

UPDATE:

Apart from the answer below, I found the mistake in my implementation above, which I wanted to share:

#include <mpi.h> 
#include <stdio.h> 
#include <stdlib.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, p_count = 0; 
    int count = 10; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 
    MPI_Status status; 
    int * num = malloc((pross-1)*sizeof(int)); 

    if(rank != 0) 
    { 
        MPI_Send(&count, 1, MPI_INT, 0, 1, MPI_COMM_WORLD); 
    } 
    else 
    { 
        int var = 0, lick; 
        for(lick = 1; lick < pross; lick++) 
        { 
            int fetihs; 
            MPI_Recv(&fetihs, 1, MPI_INT, lick, 1, MPI_COMM_WORLD, &status); 

            var += fetihs; 
        } 
        // do things with var 
    } 
    MPI_Finalize(); 
} 
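A note on this fix: since MPI_Recv blocks, the loop makes rank 0 take the messages strictly in rank order (1, 2, ..., pross-1), and rank 0 cannot start summing the next cycle until every worker has reported in for the current one, which gives the per-cycle behaviour described above. If arrival order does not matter, the receives could instead be posted with MPI_ANY_SOURCE; a rough sketch of that variant:

#include <mpi.h> 
#include <stdio.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, count = 10; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    if (rank != 0) 
    { 
        MPI_Send(&count, 1, MPI_INT, 0, 1, MPI_COMM_WORLD); 
    } 
    else 
    { 
        int var = 0; 
        for (int i = 1; i < pross; i++) 
        { 
            int fetihs; 
            MPI_Status status; 
            // take the next message from whichever worker is ready 
            MPI_Recv(&fetihs, 1, MPI_INT, MPI_ANY_SOURCE, 1, MPI_COMM_WORLD, &status); 
            var += fetihs; 
        } 
        printf("var = %d\n", var); 
    } 
    MPI_Finalize(); 
} 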

If you're adding up all the results, then you probably want `MPI_Reduce`, not `MPI_Gather`. – Sneftel


But will it block the processes as outlined in the question above? I'm trying to do the addition only after all processes have reached a particular point. In a way, I'm trying to "synchronize" the results from all processes at that step. – user26763


Your description isn't very clear, but it sounds like you want a barrier each round. And yes, there will be a barrier (there's no sensible way to do the reduction without one). – Sneftel

Answer


In your case, as Sneftel pointed out, you need MPI_Reduce. Also, you don't need explicit synchronization before a cycle completes.

#include <mpi.h> 
#include <stdio.h> 
#include <stdlib.h> 

int main(int argc, char* argv[]) 
{ 
    int pross, rank, p_count, count = 10; 

    MPI_Init(&argc,&argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &pross); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    int* num = malloc((pross-1)*sizeof(int)); 

    // master does not send data to itself. 
    // only workers send data to master. 

    for (int i=0; i<3; ++i) 
    { 
     // to prove that no further sync is needed. 
     // you will get the same answer in each cycle. 
     p_count = 0; 

     if (rank == 0) 
     { 
      // this has no effect since the master uses p_count for both 
      // send and receive buffers due to MPI_IN_PLACE. 
      count = 500; 

      MPI_Reduce(MPI_IN_PLACE, &p_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); 
     } 
     else 
     { 
      // on the slaves the receive buffer is not used, so NULL is fine. 
      MPI_Reduce(&count, NULL, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); 
     } 

     if (rank == 0) 
     { 
      printf("p_count = %i\n", p_count); 
     } 

     // slaves send their data to master before the cycle completes. 
     // no need for explicit sync such as MPI_Barrier. 
     // MPI_Barrier(MPI_COMM_WORLD); // no need. 
    } 

    MPI_Finalize(); 
} 

In the code above, count on each slave is reduced into p_count on the master. Note the MPI_IN_PLACE argument and the two different MPI_Reduce calls. You can get the same functionality without MPI_IN_PLACE by simply setting count = 0 on the master and calling MPI_Reduce the same way on all ranks:

for (int i=0; i<3; ++i) 
{ 
    p_count = 0;  
    if (rank == 0) count = 0; 

    MPI_Reduce(&count, &p_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);   
} 
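For reference, assuming the full program above is saved as, say, reduce_sum.c, it can be built and launched the same way as the original run; with 6 processes the 5 workers each contribute count = 10, so each of the three cycles should print p_count = 50:

mpicc reduce_sum.c -o reduce_sum 
mpiexec -n 6 ./reduce_sum 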