2012-12-31

I am new to python and I am trying to run two tasks at the same time. The tasks just fetch pages on a web server, and one may finish before the other. I only want to display the results once all the requests have been served. It is easy in a linux shell, but I can't manage it in python, and all the howtos look like black magic to a beginner like me. They all seem overly complicated compared to the simplicity of the bash script below.

Here is the bash script I would like to emulate in python:

# First request (in background). Result stored in file /tmp/p1 
wget -q -O /tmp/p1 "http://ursule/test/test.php?p=1&w=5" & 
PID_1=$! 

# Second request. Result stored in file /tmp/p2 
wget -q -O /tmp/p2 "http://ursule/test/test.php?p=2&w=2" & 
PID_2=$! 

# Wait for the two processes to terminate before displaying the result 
wait $PID_1 && wait $PID_2 && cat /tmp/p1 /tmp/p2 

The test.php script is simply:

<?php 
printf('Process %s (sleep %s) started at %s ', $_GET['p'], $_GET['w'], date("H:i:s")); 
sleep($_GET['w']); 
printf('finished at %s', date("H:i:s")); 
?> 

The bash script returns the following:

$ ./multiThread.sh 
Process 1 (sleep 5) started at 15:12:59 finished at 15:13:04 
Process 2 (sleep 2) started at 15:12:59 finished at 15:13:01 

What I have tried so far in python 3:

#!/usr/bin/python3.2 

import urllib.request, threading 

def wget (address): 
    url = urllib.request.urlopen(address) 
    mybytes = url.read() 
    mystr = mybytes.decode("latin_1") 
    print(mystr) 
    url.close() 

thread1 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=5",)) 
thread2 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=2",)) 

thread1.run() 
thread2.run() 

This does not work as expected and returns:

$ ./c.py 
Process 1 (sleep 5) started at 15:12:58 finished at 15:13:03 
Process 1 (sleep 2) started at 15:13:03 finished at 15:13:05 

You want 'thread1.start(); thread2.start()' and then 'join'. See http://docs.python.org/2/library/threading.html for basic information about the threading module. Note that threads will not replicate the behaviour you have with Bash; for that you would need multiple processes, and you should check the multiprocessing module http://docs.python.org/2/library/multiprocessing.html – mmgp
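A minimal sketch of the start/join pattern the comment describes. The network fetch is replaced here by a hypothetical sleep-based worker (`fake_wget`) so the snippet is self-contained; the real `urllib.request.urlopen` call would go where the `time.sleep` is:

```python
import threading
import time

results = []                 # filled in by the workers
lock = threading.Lock()      # protects the shared list

def fake_wget(name, delay):
    """Stand-in for the real wget(): sleeps instead of fetching a page."""
    time.sleep(delay)
    with lock:
        results.append("%s finished after %ss" % (name, delay))

t1 = threading.Thread(target=fake_wget, args=("p1", 0.2))
t2 = threading.Thread(target=fake_wget, args=("p2", 0.1))

t1.start()   # start() runs the target in a new thread;
t2.start()   # run(), by contrast, executes it in the calling thread

t1.join()    # block until both threads are done
t2.join()

print(results)   # p2 comes first because it sleeps for less time
```

Calling `run()` directly, as in the question, never creates a second thread, which is why the requests were served one after the other.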


It seems to work fine with join. I will have a look at multiprocessing. Thanks for putting me on the right track. – ripat

Answers


Since the tasks are mutually independent, it would be better to use the multiprocessing module rather than threads. You may also want to read more about the GIL (http://wiki.python.org/moin/GlobalInterpreterLock).


Following your advice I dived into the doc pages about multithreading and multiprocessing and, after running a couple of benchmarks, I came to the conclusion that multiprocessing is better suited for the job: it scales up much better as the number of threads/processes increases. The other problem I faced was how to store the results of all these processes. Using multiprocessing.Queue did the trick. Here is the solution I came up with:

This code sends concurrent http requests to my test rig, which pauses for one second before sending the answer back (see the php script above).

import urllib.request 
from multiprocessing import Process, Queue 

# function wget arg(queue, address) 
def wget (resultQueue, address): 
    url = urllib.request.urlopen(address) 
    mybytes = url.read() 
    url.close() 
    resultQueue.put(mybytes.decode("latin_1")) 

numberOfProcesses = 20 

# initialisation 
proc = [] 
results = [] 
resultQueue = Queue() 

# creation of the processes and their result queue 
for i in range(numberOfProcesses): 
    # The url just passes the process number (p) to my testing web-server 
    proc.append(Process(target=wget, args=(resultQueue, "http://ursule/test/test.php?p="+str(i)+"&w=1",))) 
    proc[i].start() 

# Drain the queue first, then join: joining a process that still has 
# data buffered in the queue can deadlock 
for i in range(numberOfProcesses): 
    results.append(resultQueue.get()) 
for p in proc: 
    p.join() 

# display results 
for result in results: 
    print(result)
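For comparison, the same fan-out/collect pattern can also be written with concurrent.futures, available since Python 3.2. The sketch below uses a ThreadPoolExecutor, which is usually sufficient for I/O-bound downloads (ProcessPoolExecutor has the same interface for CPU-bound work); the fetch is again simulated with a hypothetical sleep-based worker so the snippet is self-contained:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    """Stand-in for wget(): sleep briefly, then return a fake page body."""
    time.sleep(0.05)
    return "Process %d done" % i

with ThreadPoolExecutor(max_workers=20) as executor:
    # map() yields the results in submission order, each one
    # becoming available as soon as its task completes
    results = list(executor.map(fetch, range(20)))

for result in results:
    print(result)
```

The executor handles the start/join bookkeeping and result collection, so no explicit queue is needed.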