2013-05-08

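Strange output when running a simple OpenMP program on Multi2Sim v4.0.1

I am trying to run a simple OpenMP program on Multi2Sim v4.0.1. The program is as follows: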

#include <iostream> 
#include <fstream> 
#include <vector> 
#include <omp.h> 
#include <algorithm> 
#include <math.h> 
#include <map> 
#include <string> 
#include <ctime> 
using namespace std; 

#define NUM 10 



void openMP() 
{ 
    omp_set_num_threads(1); 
    int sum =0; 
    #pragma omp parallel for shared(sum) 
    { 
     for (int i=0;i<100;i++) 
     { 
      sum++; 
     } 
    } 
    cout<<"sum = "<<sum<<endl; 

} 
int main() 
{ 
    cout<<"Open MP \n"; 
    openMP(); 
return 0; 
} 

Now, when I compile it with

g++ test.cpp -fopenmp -o test

and run it in an Ubuntu terminal with

./test

the output is correct (I think), as follows:

Open MP 
sum = 100 

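But when I try to run it on Multi2Sim using the following two configuration files, which were given to me by my teacher, the result is different.

multicore-config: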

[ General ] 
Cores = 4 
Threads = 1 

multicore-mem-config:

[CacheGeometry geo-l1] 
Sets = 256 
Assoc = 2 
BlockSize = 64 
Latency = 2 
Policy = LRU 
Ports = 2 

[CacheGeometry geo-l2] 
Sets = 512 
Assoc = 4 
BlockSize = 64 
Latency = 20 
Policy = LRU 
Ports = 4 

[Module mod-l1-0] 
Type = Cache 
Geometry = geo-l1 
LowNetwork = net-l1-l2 
LowModules = mod-l2 

[Module mod-l1-1] 
Type = Cache 
Geometry = geo-l1 
LowNetwork = net-l1-l2 
LowModules = mod-l2 

[Module mod-l2] 
Type = Cache 
Geometry = geo-l2 
HighNetwork = net-l1-l2 
LowNetwork = net-l2-mm 
LowModules = mod-mm 

[Module mod-mm] 
Type = MainMemory 
BlockSize = 256 
Latency = 200 
HighNetwork = net-l2-mm 

[Network net-l2-mm] 
DefaultInputBufferSize = 1024 
DefaultOutputBufferSize = 1024 
DefaultBandwidth = 256 

[Network net-l1-l2] 
DefaultInputBufferSize = 1024 
DefaultOutputBufferSize = 1024 
DefaultBandwidth = 256 

[Entry core-0] 
Arch = x86 
Core = 0 
Thread = 0 
DataModule = mod-l1-0 
InstModule = mod-l1-0 

[Entry core-1] 
Arch = x86 
Core = 1 
Thread = 0 
DataModule = mod-l1-0 
InstModule = mod-l1-0 

[Entry core-2] 
Arch = x86 
Core = 2 
Thread = 0 
DataModule = mod-l1-0 
InstModule = mod-l1-0 

[Entry core-3] 
Arch = x86 
Core = 3 
Thread = 0 
DataModule = mod-l1-0 
InstModule = mod-l1-0 

I then run this command in the Ubuntu terminal:

m2s --x86-config multicore-config.txt --mem-config multicore-mem-config.txt --x86-sim detailed test 

and I get this output:

; Multi2Sim 4.0.1 - A Simulation Framework for CPU-GPU Heterogeneous Computing 
; Please use command 'm2s --help' for a list of command-line options. 
; Last compilation: May 8 2013 10:01:31 

Open MP 
sum = 83 

; 
; Simulation Statistics Summary 
; 

[ General ] 
Time = 53.17 
SimEnd = ContextsFinished 
Cycles = 3691870 

[ x86 ] 
SimType = Detailed 
Time = 53.15 
Contexts = 4 
Memory = 37056512 
EmulatedInstructions = 3292450 
EmulatedInstructionsPerSecond = 61943 
Cycles = 3691558 
CyclesPerSecond = 69452 
FastForwardInstructions = 0 
CommittedInstructions = 2081157 
CommittedInstructionsPerCycle = 0.5638 
CommittedMicroInstructions = 3113721 
CommittedMicroInstructionsPerCycle = 0.8435 
BranchPredictionAccuracy = 0.9375 

Why is the output 83 on Multi2Sim, while a normal run gives 100?

Also, why does the program take so long to run on Multi2Sim?

Any help would be appreciated.

Answers

1

I don't really know m2s, but the culprit is probably this:

#pragma omp parallel for shared(sum) 
    { 
     for (int i=0;i<100;i++) 
     { 
      sum++; // Concurrent access to a shared variable!!! 
     } 
    } 

In your first test, the fact that you explicitly set the number of threads to 1 with

omp_set_num_threads(1); 

saves you from the race condition. I suggest trying:

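// sum is listed only in the reduction clause: a variable may not appear
// in both shared() and reduction() on the same directive.
#pragma omp parallel for reduction(+:sum) 
for (int i=0;i<100;i++) { 
      sum++; 
}  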

and see whether you get the desired behavior.
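For comparison, here is a minimal sketch of two ways to make the counter safe, assuming a compiler with OpenMP support such as g++ with -fopenmp. The function names count_with_reduction and count_with_atomic are hypothetical and do not come from the question. The first uses the reduction clause suggested above; the second keeps sum shared but makes each increment atomic, which is usually slower because every update is synchronized.

```cpp
// Hypothetical helper: counts n iterations using an OpenMP reduction.
// Each thread works on its own private copy of sum (initialized to 0),
// and the copies are added into the original sum when the loop ends.
int count_with_reduction(int n) {
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum++;
    }
    return sum;
}

// Hypothetical helper: keeps sum shared, but protects every increment
// with an atomic update so that no increment is lost.
int count_with_atomic(int n) {
    int sum = 0;
    #pragma omp parallel for shared(sum)
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        sum++;
    }
    return sum;
}
```

Both functions should return n regardless of the number of threads. If the code is built without OpenMP support, the pragmas are ignored and the loops simply run serially.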


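No, it did not help me solve the problem. As far as I know, m2s creates an environment for the program to run on the simulator, and it should simulate this code for a processor with 4 cores and one thread per core. Can you tell me more about "reduction"? – AerRayes 2013-05-08 19:40:51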


After testing again, the reduction did in fact work. Thanks. – AerRayes 2013-05-08 22:33:47
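On the question of what "reduction" does: roughly speaking, each thread accumulates into its own private variable, and the private results are combined into the shared variable once the loop is finished. The sketch below spells that out by hand. It is only an illustration of the idea, not how any particular OpenMP runtime implements it, and the function name count_with_manual_partial_sums is hypothetical.

```cpp
// Hypothetical helper: a hand-written approximation of reduction(+:sum).
int count_with_manual_partial_sums(int n) {
    int sum = 0;
    #pragma omp parallel shared(sum)
    {
        // Declared inside the parallel region, so each thread has its own copy.
        int local = 0;

        // The iterations are divided among the threads; each thread
        // only touches its own local counter, so there is no race here.
        #pragma omp for
        for (int i = 0; i < n; i++) {
            local++;
        }

        // Each thread adds its partial result to the shared total exactly once.
        #pragma omp atomic
        sum += local;
    }
    return sum;
}
```

The shared variable is updated once per thread instead of once per iteration, which is why a reduction avoids both the lost updates of the original loop and the per-iteration synchronization cost of an atomic increment.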