2013-12-15 35 views
0

我有用于乘法多项式的递归FFT algortihm,我需要用openmp对它进行并列。一些研究各地和尝试之后,我得到这个OpenMP FFT乘法加速

Complex * multiply(Complex *p1, Complex *p2) 
{ 
#pragma omp parallel 
{ 
//evaluate p1 
#pragma omp single nowait 
pFFT(n,p1,1); 

#pragma omp single nowait 
pFFT(n,p2,1); 
} 

//...multiply part etc 

} 

void pFFT(int deg, Complex *pol,int sign) 
{ 
if(deg == 1) 
    return; 

//divide polynom into two parts with even and odd coeficients 
Complex *even = new Complex [deg/2]; 
Complex *odd = new Complex [deg/2]; 


for(int i = 0;i<deg/2;i++) 
{ 
    even[i] = pol[2*i]; 
    odd[i] = pol[2*i+1]; 
} 


#pragma omp task 
pFFT(deg/2,even,sign); 
#pragma omp task 
pFFT(deg/2,odd,sign); 
#pragma omp taskwait 
//wn = n-th root of unity 
int x = lg2(deg); 
Complex wn; 
wn.re = pcos[x]; 
wn.im = sign*psin[x]; 
Complex w; 
w.re = 1; 
w.im = 0; 
Complex *ret = pol; 

Complex product; 
if(deg==2) 
{ 
     product = mul(odd,&w); 
     ret[0].re = even[0].re+product.re; 
     ret[0].im = even[0].im+product.im; 
     ret[1].re = even[0].re-product.re; 
     ret[1].im = even[0].im-product.im; 
} 
else 
    for(int i = 0;i<deg/2-1;i+=2) 
    { 
     product = mul(odd+i,&w); 
     ret[i].re = even[i].re+product.re; 
     ret[i].im = even[i].im+product.im; 
     ret[i+deg/2].re = even[i].re-product.re; 
     ret[i+deg/2].im = even[i].im-product.im; 
     w = mul(&w,&wn); 
     product = mul(odd+i+1,&w); 
     ret[i+1].re = even[i+1].re+product.re; 
     ret[i+1].im = even[i+1].im+product.im; 
     ret[i+1+deg/2].re = even[i+1].re-product.re; 
     ret[i+1+deg/2].im = even[i+1].im-product.im; 
     w = mul(&w,&wn); 
    } 
delete[] even; 
delete[] odd; 
} 

但代码比顺序版本更慢,只有加快我能做的就是,当我删除任务,并让刚刚2个线程同时计算每个多项式。我明白,有很多内存操作,但仍然有,我可以/应该做些什么。比你。

回答

0

您应该先停止并行处理。尝试使用大小16,64,256。在大小2处停止会产生太多的小任务,而且开销相对较大。