2013-10-13 96 views
0

How can I reduce the time spent in a Python loop?

I have the following Python code:

import itertools
import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]  

D1 = [0.01,0.02,0.1,0.01]  
D2 = [0.1,0.3,0.01,0.4] 

Tp = np.sum(D1)  
Tn = np.sum(D2) 

T = []  
append2 = T.append 
E = []  
append3 = E.append 

for h1,h2 in itertools.izip(H1,H2):
    Err = []
    append1 = Err.append
    for v in h1:
        # indicator lists: 1 where the element is at least the threshold v
        L1 = [1 if i>=v else 0 for i in h1]
        L2 = [1 if i>=v else 0 for i in h2]

        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)

        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)

    # keep the threshold with the smallest error for this pair of rows
    b = np.argmin(Err)
    append2(h1[b])
    append3(Err[b])

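This is just sample code. I need to run the inner for loop nearly 20,000 times (here it runs only twice), but the inner loop takes so long that the code is impractical to use. The line profiler shows that the lines Sp = np.dot(D1,L1), Sn = np.dot(D2,L2) and b = np.argmin(Err) are the most time-consuming. How can I reduce the time taken by the code above?

Any help would be much appreciated.

Thanks!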

+0

What is your hardware? Depending on the machine, you could use multiprocessing (since the code is parallelizable), or even CUDA for GPU computing. – lucasg

+0

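@georgesl: Hi, I am already using multiprocessing. I need to run the code above for 5 different H1 and H2. Each H1 contains about 20,000 lists, i.e. the outer for loop runs 20,000 times. – user2766019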

+0

Go with CUDA, it is dead fast *I think it has Python bindings* –

Answers

1

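You need to keep the data in ndarray types. When you do a numpy operation on a list, it has to construct a new array every time. I modified your code to run a variable number of times and found that it took about 1 second for 10000 iterations. Changing the data types to ndarrays cut that by roughly a factor of two, and I think there are still some improvements to be made (the first version of this answer had a bug that made it run too fast).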

import itertools 
import numpy as np 
N = 10000 
H1 = [np.array([0.04,0.03,0.01,0.002])] * N 
H2 = [np.array([0.06,0.02,0.02,0.004])] * N 

D1 = np.array([0.01,0.02,0.1,0.01] ) 
D2 = np.array([0.1,0.3,0.01,0.4]) 

Tp = np.sum(D1)  
Tn = np.sum(D2) 

T = []  
append2 = T.append 
E = []  
append3 = E.append 

for h1,h2 in itertools.izip(H1,H2): 
    Err = []  
    append1 = Err.append 
    for v in h1:

        #L1 = [1 if i>=v else 0 for i in h1]
        #L2 = [1 if i>=v else 0 for i in h2]
        # >= (not >) keeps the same comparison as the list comprehensions above
        L1 = h1 >= v
        L2 = h2 >= v
        Sp = np.dot(D1,L1)
        Sn = np.dot(D2,L2)

        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        append1(err)

    b = np.argmin(Err)  
    append2(h1[b])  
    append3(Err[b]) 
+0

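Thanks, this works and cuts the time considerably. You said there are still some improvements to be made. What are they? I still need to bring the time down a little further. – user2766019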

+0

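The other way I can see to speed this up is to use broadcast array operations to remove the inner loop. For example, 'h1 >= h1.reshape(len(h1), 1)' would compute all the 'L1' vectors at once (I think). The dot product and np.min can operate along whichever axis you want. – Evan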

+0

Thanks! I will be glad to try this as well. – user2766019
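As a rough sketch of the broadcasting idea from the comments above, the inner loop for one pair of rows could be replaced as follows. It assumes h1, h2, D1 and D2 are 1-D NumPy arrays of equal length; best_threshold is a hypothetical helper name, not something from the original posts.

```python
import numpy as np

def best_threshold(h1, h2, D1, D2):
    """Return (threshold, error) for one pair of rows without a Python inner loop."""
    Tp, Tn = D1.sum(), D2.sum()
    v = h1[:, None]            # candidate thresholds as a column, shape (n, 1)
    L1 = h1 >= v               # L1[j, i] is True when h1[i] >= h1[j], shape (n, n)
    L2 = h2 >= v               # same thresholds applied to h2
    Sp = L1.dot(D1)            # one Sp per threshold, shape (n,)
    Sn = L2.dot(D2)            # one Sn per threshold, shape (n,)
    Err = np.minimum(Sp + Tn - Sn, Sn + Tp - Sp)
    b = Err.argmin()
    return h1[b], Err[b]

h1 = np.array([0.04, 0.03, 0.01, 0.002])
h2 = np.array([0.06, 0.02, 0.02, 0.004])
D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
threshold, error = best_threshold(h1, h2, D1, D2)
```

The outer loop over the rows of H1 and H2 stays as it is; only the work done per row changes.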

1

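There is some low-hanging fruit in the list comprehensions:

L1 = [1 if i>=v else 0 for i in h1] 
L2 = [1 if i>=v else 0 for i in h2] 

The above can be written as:

L1 = [i>=v for i in h1] 
L2 = [i>=v for i in h2] 

because bool is a subclass of int, so True and False are already 1 and 0, just wearing fancy clothes.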
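A small check of that claim, using one row from the question and an arbitrarily chosen threshold v (the variable names flags, ints and weighted are only for illustration):

```python
h1 = [0.04, 0.03, 0.01, 0.002]
D1 = [0.01, 0.02, 0.1, 0.01]
v = 0.03  # illustrative threshold, taken from h1

flags = [i >= v for i in h1]             # [True, True, False, False]
ints = [1 if i >= v else 0 for i in h1]  # [1, 1, 0, 0]

# bool is a subclass of int, so the two lists compare equal element by element
same = isinstance(True, int) and flags == ints

# booleans take part in arithmetic as 0 and 1, which is all np.dot needs
weighted = sum(d * f for d, f in zip(D1, flags))
```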

err = min(Sp+Tn-Sn, Sn+Tp-Sp)  
append1(err) 

You can combine the two lines above into append1(min(Sp+Tn-Sn, Sn+Tp-Sp)) to avoid the variable assignment and lookup.

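If you put the code in a function, every use of a local variable will be slightly faster. In addition, any global functions or methods you use (e.g. min, np.dot) can be turned into local variables by declaring them as default arguments in the function signature. np.dot is a particularly slow call (quite apart from how long the operation itself takes) because it involves an attribute lookup. This is similar to the optimization you have already made with the list append method.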
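A minimal sketch of that suggestion, assuming h1 and h2 are NumPy arrays. The function name row_errors and the parameter names _dot and _min are made up for this illustration; the point is only that np.dot and min are looked up once, when the function is defined, and are plain local names inside the loop.

```python
import numpy as np

def row_errors(h1, h2, D1, D2, Tp, Tn, _dot=np.dot, _min=min):
    """Errors for every threshold in h1, with np.dot and min bound as locals."""
    Err = []
    append = Err.append          # same trick as append1 in the question
    for v in h1:
        Sp = _dot(D1, h1 >= v)   # no global or attribute lookup here
        Sn = _dot(D2, h2 >= v)
        append(_min(Sp + Tn - Sn, Sn + Tp - Sp))
    return Err

D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
h1 = np.array([0.04, 0.03, 0.01, 0.002])
h2 = np.array([0.06, 0.02, 0.02, 0.004])
errors = row_errors(h1, h2, D1, D2, D1.sum(), D2.sum())
```

Any gain from this is likely to be small next to the cost of the NumPy calls themselves, so it is worth timing before and after.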

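Now, I imagine none of this will really affect performance much, because your question really seems to be "How can I make NumPy faster?" (which others are already covering for you), but these changes may have some effect and are worth making.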

3

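If you use numpy functions with numpy arrays instead of lists, you can get a considerable speedup. Most numpy functions convert lists to arrays internally, which adds a lot of overhead at runtime. Here is a simple example: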

In [16]: a = range(10) 

In [17]: b = range(10) 

In [18]: aa = np.array(a) 

In [19]: bb = np.array(b) 

In [20]: %timeit np.dot(a, b) 
10000 loops, best of 3: 54 us per loop 

In [21]: %timeit np.dot(aa, bb) 
100000 loops, best of 3: 3.4 us per loop 

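In this case numpy.dot runs about 16 times faster when it is called with arrays. Also, once you use numpy arrays you will be able to simplify some of your code, which should help it run faster as well. For example, if h1 is an array, L1 = [1 if i>=v else 0 for i in h1] can be written as h1 >= v, which returns an array and should also run faster. Below I have gone ahead and replaced your lists with arrays so you can see what it looks like.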

import numpy as np 

H1 = np.array([[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]) 
H2 = np.array([[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]) 

D1 = np.array([0.01,0.02,0.1,0.01]) 
D2 = np.array([0.1,0.3,0.01,0.4]) 

Tp = np.sum(D1)  
Tn = np.sum(D2) 

T = np.zeros(H1.shape[0]) 
E = np.zeros(H1.shape[0]) 

for i in range(len(H1)): 
    h1 = H1[i] 
    h2 = H2[i] 
    Err = np.zeros(len(h1)) 

    for j in range(len(h1)):
        v = h1[j]

        # >= matches the comparison used in the original list comprehensions
        L1 = h1 >= v
        L2 = h2 >= v

        Sp = np.dot(D1, L1)
        Sn = np.dot(D2, L2)

        err = min(Sp+Tn-Sn, Sn+Tp-Sp)
        Err[j] = err

    b = np.argmin(Err) 
    T[i] = h1[b] 
    E[i] = Err[b] 

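Once you are more comfortable with numpy arrays, you may want to look into broadcasting, at least for the expressions in your inner loop. For some applications, broadcasting can be much more efficient than Python loops. Good luck, I hope this helps.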
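For illustration only, here is one way the whole computation might be written with broadcasting, so that neither loop runs in Python. It assumes H1 and H2 are 2-D arrays of shape (m, n) and D1, D2 are 1-D arrays of length n; best_thresholds is a hypothetical name. The intermediate boolean arrays hold m*n*n elements, so for long rows it may be necessary to process H1 in chunks.

```python
import numpy as np

def best_thresholds(H1, H2, D1, D2):
    """Vectorized over all rows: returns (T, E) as arrays of length m."""
    Tp, Tn = D1.sum(), D2.sum()
    V = H1[:, :, None]                 # thresholds, shape (m, n, 1)
    L1 = H1[:, None, :] >= V           # L1[r, j, i] = H1[r, i] >= H1[r, j]
    L2 = H2[:, None, :] >= V           # same thresholds applied to H2
    Sp = L1.dot(D1)                    # sum over the last axis, shape (m, n)
    Sn = L2.dot(D2)
    Err = np.minimum(Sp + Tn - Sn, Sn + Tp - Sp)
    b = Err.argmin(axis=1)             # best threshold index per row
    rows = np.arange(H1.shape[0])
    return H1[rows, b], Err[rows, b]

H1 = np.array([[0.04, 0.03, 0.01, 0.002], [0.02, 0.04, 0.001, 0.5]])
H2 = np.array([[0.06, 0.02, 0.02, 0.004], [0.8, 0.09, 0.6, 0.1]])
D1 = np.array([0.01, 0.02, 0.1, 0.01])
D2 = np.array([0.1, 0.3, 0.01, 0.4])
T, E = best_thresholds(H1, H2, D1, D2)
```

T and E play the same role as the lists T and E in the question, but come back as arrays.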

0

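If I have correctly understood what np.dot() does on two one-dimensional lists, then it seems to me that the following code should do the same as yours.
Could you please test its speed?

The principle is to work with indices rather than with the elements of the lists, and to exploit the fact that a list used as a default argument value is bound once, when the function is defined, so updating it in place is visible inside the function.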

import numpy as np

H1 = [[0.04,0.03,0.01,0.002],[0.02,0.04,0.001,0.5]]
H2 = [[0.06,0.02,0.02,0.004],[0.8,0.09,0.6,0.1]]  

D1 = [0.01,0.02,0.1,0.01]  
D2 = [0.1,0.3,0.01,0.4] 

Tp = np.sum(D1)  
Tn = np.sum(D2) 

T,E = [],[]  
append2 = T.append  
append3 = E.append 

ONE,TWO = [],[] 

def zoui(v, ONE=ONE,TWO=TWO,
         D1=D1,D2=D2,Tp=Tp,Tn=Tn,tu0123=(0,1,2,3)):
    # diff is Sp - Sn for the threshold v
    diff = sum(D1[i] if ONE[i]>=v else 0 for i in tu0123)\
           -sum(D2[i] if TWO[i]>=v else 0 for i in tu0123)
    #or maybe
    #diff = sum(D1[i] * (ONE[i]>=v) for i in tu0123)\
    #       -sum(D2[i] * (TWO[i]>=v) for i in tu0123)

    return min(Tn+diff,Tp-diff)

for n in xrange(len(H1)): 
    ONE[:] = H1[n] 
    TWO[:] = H2[n] 
    Err = map(zoui,ONE) 
    b = np.argmin(Err)  
    append2(ONE[b])  
    append3(Err[b])