OpenMP的生命游戏

我做了一个生命游戏的顺序版本，但现在我需要使用OpenMP创建一个并行版本的代码，但我遇到了一些问题。如果有人能帮助我，那将会非常好。 THKS。这里是我的顺序代码：OpenMP的生命游戏

// Swapping the two grids 
#define SWAP_BOARDS(b1, b2) do { \ 
char* temp = b1; \ 
b1 = b2; \ 
b2 = temp; \ 
} while(0) 

// Simplifying access to grid elements 
    #define BOARD(G, X, Y) ((G)[NC*(X)+(Y)]) 

char* sequential_game_of_life (char* outgrid, char* ingrid, 
     const int nrows, const int ncols, const int gens_max) { 

    const int NC = ncols; 
    int curgen, i, j; 

for (curgen = 0; curgen < gens_max; curgen++) 
    { 

    for (i = 0; i < nrows; i++) 
{ 
    for (j = 0; j < ncols; j++) 
    { 
     const int inorth = mod (i-1, nrows); 
     const int isouth = mod (i+1, nrows); 
     const int jwest = mod (j-1, ncols); 
     const int jeast = mod (j+1, ncols); 

     const char neighbor_count = 
    BOARD (ingrid, inorth, jwest) + 
    BOARD (ingrid, inorth, j) + 
    BOARD (ingrid, inorth, jeast) + 
    BOARD (ingrid, i, jwest) + 
    BOARD (ingrid, i, jeast) + 
    BOARD (ingrid, isouth, jwest) + 
    BOARD (ingrid, isouth, j) + 
    BOARD (ingrid, isouth, jeast); 

     BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j)); 
    } 
} 
    SWAP_BOARDS(outgrid, ingrid); 
} 
    return outgrid; 
}

我知道，我必须为平行的那些3，但我不能看到如何做到这一点。

来源

2012-12-12 tsukanomon

那么问题是什么？是不是像“我怎样才能让3'for'循环与OpenMP并行运行？”或类似的东西？ – netcoder

我觉得外环不能并行，因为输入到每一代前代，所以它有一个顺序式（至少你不能用细微的变化做到这一点！）

在的情况下，嵌套循环遍历矩阵或类似的东西，我喜欢从0到ncol*nrow（在你的情况）运行一个循环，并从循环索引中找到i和j。

这样的：

// because you are running a parallel codes multiple times in a loop, 
// it would be better to make the thread swarm first and schedule the 
// tasks in each loop iteration, to avoid multiple creation and destruction 
// of working threads 
#pragma omp parallel 
for (curgen = 0; curgen < gens_max; curgen++) 
{ 
    #pragma omp for 
    for (t = 0; t < nrows*ncols; t++) 
    { 
     int i = t/ncols; 
     int j = t % ncols; 
     const int inorth = mod (i-1, nrows); 
     const int isouth = mod (i+1, nrows); 
     const int jwest = mod (j-1, ncols); 
     const int jeast = mod (j+1, ncols); 

     const char neighbor_count = 
      BOARD (ingrid, inorth, jwest) + 
      BOARD (ingrid, inorth, j) + 
      BOARD (ingrid, inorth, jeast) + 
      BOARD (ingrid, i, jwest) + 
      BOARD (ingrid, i, jeast) + 
      BOARD (ingrid, isouth, jwest) + 
      BOARD (ingrid, isouth, j) + 
      BOARD (ingrid, isouth, jeast); 

     BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j)); 
    } 
    SWAP_BOARDS(outgrid, ingrid); 
}

我跑在我的笔记本电脑上的1000×1000矩阵双核2.53 GHz的CPU超过1000代这个代码，并获得69％的速度了。

来源

2012-12-12 19:05:13 saeedn

非常好，但我在这里有一些疑问，实际上是两个：第一：我不得不定义一个类似#pragma omp parallel num_threads（NTHREADS）private（t）的并行区域吗？第二：当我用你的建议运行算法时，使用1个线程比使用4更快。任何猜测为什么会发生这种情况？ – tsukanomon

'#pragma omp parallel for'是'#pragma omp parallel'的快捷方式，然后是'#pragma omp for'。您也可以在其中指定其他选项，如：'#pragma omp parallel for num_threads（N）'。为加快速度，请检查更新的答案 – saeedn

您必须给予'curgen'私人待遇。当有'collapse（n）'OpenMP指令时，手动循环崩溃是多余的。 –

OpenMP的生命游戏

回答

相关问题