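How do I compare the performance of parallel code (using OpenMP) with serial code?

I am using the following approach to analyze the performance of the OpenMP (parallel) code against the serial code: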
int arr[1000] = {1, 6, 1, 3, 1, 9, 7, 3, 2, 0, 5, 0, 8, 9, 8, 4, 4, 4, 0, 9, 6, 5, 9, 5, 9, 2, 5, 8, 6, 1, 0, 7, 7, 3, 2, 8, 3, 2, 3, 7, 2, 0, 7, 2, 9, 5, 8, 6, 2, 8, 5, 8, 5, 6, 3, 5, 8, 1, 3, 7, 2, 6, 6, 2, 1, 9, 0, 6, 1, 6, 3, 5, 6, 3, 0, 8, 0, 8, 4, 2, 7, 1, 0, 2, 7, 6, 9, 7, 7, 5, 4, 9, 3, 1, 1, 4, 2, 4, 1, 5, 2, 6, 0, 8, 9, 2, 6, 0, 1, 0, 2, 0, 3, 3, 4, 0, 1, 4, 8, 8, 1, 4, 9, 4, 7, 3, 8, 9, 9, 1, 4, 1, 8, 7, 9, 9, 9, 8, 9, 0, 0, 4, 2, 4, 9, 7, 6, 0, 3, 4, 8, 6, 1, 9, 0, 8, 2, 0, 8, 1, 2, 4, 2, 2, 1, 4, 1, 1, 4, 3, 3, 4, 9, 8, 0, 8, 7, 7, 8, 0, 3, 8, 8, 4, 7, 8, 5, 2, 0, 3, 3, 4, 9, 8, 6, 1, 4, 0, 4, 8, 5, 9, 4, 4, 7, 5, 2, 4, 2, 2, 6, 5, 2, 4, 2, 1, 4, 7, 3, 5, 2, 7, 9, 1, 7, 8, 4, 3, 0, 8, 1, 5, 8, 7, 1, 7, 2, 5, 2, 6, 9, 8, 2, 1, 5, 4, 2, 9, 1, 6, 6, 5, 5, 8, 6, 4, 6, 1, 7, 8, 1, 0, 3, 9, 7, 6, 7, 2, 1, 1, 8, 2, 9, 2, 3, 6, 8, 7, 8, 9, 5, 4, 4, 2, 2, 3, 6, 8, 4, 5, 6, 5, 7, 1, 7, 7, 9, 6, 9, 2, 7, 9, 4, 8, 2, 7, 5, 0, 7, 3, 2, 2, 9, 8, 7, 2, 3, 5, 2, 9, 1, 1, 5, 8, 4, 4, 5, 4, 0, 6, 6, 9, 8, 1, 7, 0, 0, 4, 2, 7, 9, 6, 2, 9, 7, 9, 1, 0, 4, 3, 0, 7, 6, 7, 8, 1, 1, 5, 5, 3, 4, 3, 2, 2, 4, 1, 2, 7, 6, 6, 4, 5, 3, 8, 4, 2, 9, 7, 2, 6, 3, 4, 3, 9, 1, 1, 0, 4, 9, 5, 7, 3, 9, 1, 5, 5, 5, 9, 2, 3, 5, 9, 8, 0, 9, 5, 2, 9, 4, 7, 5, 7, 1, 0, 7, 5, 4, 7, 9, 3, 5, 9, 8, 6, 2, 3, 1, 7, 2, 6, 0, 9, 7, 1, 2, 6, 8, 4, 5, 2, 3, 2, 2, 7, 3, 9, 2, 9, 6, 3, 2, 3, 2, 2, 9, 7, 5, 3, 4, 9, 9, 7, 8, 6, 0, 0, 4, 0, 7, 2, 4, 0, 4, 6, 9, 9, 5, 1, 0, 4, 5, 4, 7, 9, 6, 9, 6, 1, 2, 3, 0, 3, 2, 1, 1, 4, 1, 5, 4, 0, 7, 8, 3, 4, 5, 2, 5, 2, 6, 6, 6, 1, 0, 6, 2, 9, 5, 1, 0, 9, 6, 3, 4, 8, 4, 5, 2, 7, 2, 8, 8, 2, 6, 1, 6, 3, 5, 3, 6, 1, 1, 4, 4, 2, 0, 7, 1, 7, 0, 3, 8, 6, 6, 2, 6, 2, 7, 0, 0, 2, 8, 0, 4, 6, 3, 2, 0, 8, 5, 8, 2, 7, 2, 6, 1, 5, 5, 4, 4, 5, 9, 3, 3, 8, 7, 9, 0, 7, 1, 2, 9, 1, 2, 3, 8, 7, 5, 0, 8, 0, 8, 0, 9, 2, 6, 0, 7, 2, 6, 4, 9, 6, 7, 3, 4, 6, 4, 6, 3, 6, 9, 2, 7, 3, 5, 7, 1, 2, 7, 9, 5, 7, 1, 4, 0, 7, 7, 9, 1, 3, 3, 1, 1, 2, 4, 5, 9, 0, 4, 4, 6, 3, 7, 6, 8, 4, 3, 1, 7, 1, 2, 
2, 8, 3, 6, 0, 1, 5, 0, 2, 1, 5, 5, 2, 0, 9, 0, 1, 0, 4, 5, 8, 7, 2, 4, 7, 7, 0, 9, 6, 1, 1, 8, 1, 5, 6, 4, 8, 2, 4, 0, 3, 1, 6, 5, 1, 7, 7, 4, 9, 1, 0, 0, 0, 4, 6, 8, 3, 6, 7, 9, 9, 0, 9, 3, 5, 6, 7, 3, 8, 3, 6, 3, 4, 4, 0, 8, 1, 8, 2, 3, 1, 4, 3, 2, 9, 1, 0, 4, 8, 9, 4, 9, 9, 3, 2, 7, 1, 9, 0, 1, 4, 8, 4, 9, 2, 7, 9, 6, 5, 1, 1, 6, 8, 4, 0, 9, 7, 2, 3, 5, 1, 9, 7, 3, 5, 9, 0, 6, 1, 2, 8, 5, 1, 4, 6, 5, 1, 5, 3, 8, 9, 4, 7, 7, 0, 9, 6, 8, 2, 9, 3, 5, 9, 2, 8, 4, 2, 0, 2, 5, 3, 2, 2, 6, 7, 9, 3, 0, 6, 7, 1, 5, 1, 0, 2, 2, 9, 0, 2, 1, 2, 7, 7, 3, 0, 7, 9, 4, 8, 1, 9, 3, 4, 1, 1, 3, 2, 6, 3, 9, 3, 6, 6, 7, 6, 1, 1, 6, 1, 3, 9, 3, 2, 6, 8, 2, 6, 7, 6, 4, 1, 5, 9, 5, 9, 2, 0, 3, 8, 5, 2, 4, 2, 9, 3, 8, 0, 6, 6, 3, 1, 6, 9, 3, 2, 7, 6, 0, 7, 2, 6, 8, 0, 5, 5, 9, 9, 5, 4, 8, 0, 7, 4, 2, 8, 9, 3, 0, 5, 9, 3, 6, 5, 4, 9, 0, 2, 7, 2, 9, 0, 9, 9, 2, 6, 4, 3, 6, 9, 7, 6, 1, 6, 0, 6, 4, 9, 9, 6, 6, 0, 2, 2, 6, 6, 3, 8, 8, 1, 0, 9, 3, 9, 8, 5, 6, 4, 8, 4, 3, 5, 0, 7, 2, 2, 3, 8, 3, 2, 5, 9, 2, 7, 1, 0, 5, 6, 0, 4};
clock_t begin, end;
double time_spent;
begin = clock();
/* here, do your time-consuming job */
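/* i is assigned inside the loop by every thread, so it must be private too */
#pragma omp parallel for private(i, temp)
for (j = 0; j < 1000; j++) {
    temp = arr[j];
    /* multiply result[j] by arr[j]! (temp counts down to 1) */
    for (i = 0; i < temp; temp--)
        result[j] = result[j] * temp;
}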
end = clock();
time_spent = (double)(end - begin)/CLOCKS_PER_SEC;
printf("\n\n%f",time_spent);
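But every time I run the code, I get a different output. I want to see how the performance differs between the OpenMP version and the serial version. What method should I use to do that?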
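Unless you are using MSVC or MinGW (but not MinGW-w64), do not use clock(). With those compilers clock() returns elapsed wall-clock time, but on most other platforms it returns CPU time summed over all threads, which hides any parallel speedup. I suggest using omp_get_wtime() instead, because it works with every compiler.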
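As a minimal sketch of the suggestion above, the helpers below time the same loop with and without OpenMP using wall-clock time. The names wall_seconds, factorials and best_time are hypothetical and not from the question. The result type (long), starting each product at 1, and keeping the fastest of several repetitions to damp run-to-run noise are also assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time in seconds: omp_get_wtime() when compiled with OpenMP,
   otherwise C11 timespec_get() as a fallback (an assumption of this sketch). */
static double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* result[j] = arr[j]!  The outer loop runs in parallel only if use_omp != 0
   and the file was compiled with OpenMP enabled. */
static void factorials(const int *arr, long *result, int n, int use_omp)
{
    int j;
    #pragma omp parallel for if(use_omp)
    for (j = 0; j < n; j++) {
        long f = 1;   /* declared inside the loop, so private to each thread */
        int temp;
        for (temp = arr[j]; temp > 0; temp--)
            f *= temp;
        result[j] = f;
    }
}

/* Run the kernel reps times and return the fastest wall-clock time seen. */
static double best_time(const int *arr, long *result, int n,
                        int use_omp, int reps)
{
    double best = -1.0;
    int r;
    assert(reps > 0);
    for (r = 0; r < reps; r++) {
        double t0 = wall_seconds();
        factorials(arr, result, n, use_omp);
        double dt = wall_seconds() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}
```

With GCC this would be compiled with something like gcc -O2 -fopenmp. Calling best_time(arr, result, 1000, 0, 20) and then best_time(arr, result, 1000, 1, 20) gives a serial and a parallel figure that can be compared directly. With only 1000 short iterations, the cost of starting the threads may well outweigh the work, so the parallel version is not guaranteed to come out faster.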