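I'm testing some code (trying to make it faster, but also trying to understand the differences). I have a loop that builds a table in memory. I then tried to run it with multiprocessing, but when I do, the memory usage looks odd. When I run it on its own, the table keeps growing until it takes up all the memory on the system, but with multiprocessing it stays low the whole time, which makes me question what it is actually doing. I tried to quickly recreate the non-multiprocessing code. Does sharing items in multiprocessing have a memory limit?
Here is some code (just add or remove items from the data variable to make it run faster or slower and watch the system; the multiprocessing version is at the top and the non-multiprocessing version is at the bottom):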
from multiprocessing import Pool
from multiprocessing.managers import BaseManager, DictProxy
from collections import defaultdict

class MyManager(BaseManager):
    pass

MyManager.register('defaultdict', defaultdict, DictProxy)

def test(i, x, T):
    target_sum = 1000
    # T[x, i] is True if 'x' can be solved
    # by a linear combination of data[:i+1]
    #T = defaultdict(bool) # all values are False by default
    T[0, 0] = True # base case
    for s in range(target_sum + 1): # set the range one higher than sum to include sum itself
        #print s
        for c in range(s/x + 1):
            if T[s - c * x, i]:
                T[s, i + 1] = True

data = [2,5,8,10,12,50]
pool = Pool(processes=2)
mgr = MyManager()
mgr.start()
T = mgr.defaultdict(bool)
T[0, 0] = True
for i, x in enumerate(data): # i is index, x is data[i]
    pool.apply_async(test, (i, x, T))
pool.close()
pool.join()
pool.terminate()

print 'size of Table(with multiprocesing) is:', len(T)
count_of_true = []
for x in T.items():
    if T[x] == True:
        count_of_true.append(x)
print 'total number of true(with multiprocesing) is ', len(count_of_true)

# now let's try without multiprocessing
target_sum = 100
# T[x, i] is True if 'x' can be solved
# by a linear combination of data[:i+1]
T1 = defaultdict(bool) # all values are False by default
T1[0, 0] = True # base case
for i, x in enumerate(data): # i is index, x is data[i]
    for s in range(target_sum + 1): # set the range one higher than sum to include sum itself
        for c in range(s/x + 1):
            if T1[s - c * x, i]:
                T1[s, i + 1] = True

print 'size of Table(without multiprocesing) is ', len(T1)
count = []
for x in T1:
    if T1[x] == True:
        count.append(x)
print 'total number of true(without multiprocessing) is ', len(count)
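One detail that may be relevant to how the table grows: indexing a defaultdict with a missing key inserts that key with the default value, so the `if T1[...]` checks add entries even when nothing is set to True. The following is a minimal sketch in Python 3 (the code above is Python 2); the names `table` and `probe` are made up for illustration and are not part of the original code.

```python
from collections import defaultdict

table = defaultdict(bool)
table[0, 0] = True          # the only key written explicitly

# Reading missing keys with [] inserts them with the default value False.
for s in range(1, 6):
    if table[s, 0]:
        pass

size_after_indexing = len(table)    # 1 explicit key + 5 keys inserted by the reads

# dict.get looks a key up without inserting it.
probe = defaultdict(bool)
probe[0, 0] = True
for s in range(1, 6):
    if probe.get((s, 0), False):
        pass

size_after_get = len(probe)         # still only the explicit key
print(size_after_indexing, size_after_get)
```

Whether this accounts for the memory difference described here is not established by the sketch; it only shows that lookups alone can enlarge the table.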
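As an experiment, I put the two pieces of code into two separate files and ran them side by side. The two multiprocessing processes take about 20%, and each uses only 0.5% of memory. The single process (without multiprocessing) uses 75% of a core and up to 50% of memory.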
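To put a number on the single-process table from inside Python rather than from a system monitor, one option is the standard-library `tracemalloc` module. This is only a sketch under some assumptions: it is a Python 3 port of the loop above (hence `//` for integer division), `build_table` is a hypothetical helper name, and `tracemalloc` only counts Python allocations made in the current process, so it would not see memory held by a manager or worker process.

```python
import tracemalloc
from collections import defaultdict


def build_table(data, target_sum):
    """Python 3 port of the single-process loop from the question."""
    T = defaultdict(bool)   # all values are False by default
    T[0, 0] = True          # base case
    for i, x in enumerate(data):
        for s in range(target_sum + 1):
            for c in range(s // x + 1):
                if T[s - c * x, i]:
                    T[s, i + 1] = True
    return T


tracemalloc.start()
table = build_table([2, 5, 8, 10, 12, 50], 100)
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print('entries:', len(table))
print('current bytes:', current_bytes, 'peak bytes:', peak_bytes)
```

Running the same measurement with a larger `target_sum` or more items in `data` should show how quickly the table grows in one process.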
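You wrote: "When I run it on its own...". Are you talking about setting Pool(processes=1)? – itsafire 2012-02-21 15:21:29
Not exactly. In my code above there are two parts: one wrapped in a multiprocessing pool, and the other running on its own (without a pool). – Lostsoul 2012-02-21 15:23:03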