2016-02-21 116 views

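Is there a way to multithread the function below so that it processes only 5 URLs from the list at a time? Please see the code below. It is Python 2.7.

(Tags: multithreading, python)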

import requests, csv, time, json, threading 
from lxml import html 
from csv import DictWriter 

All_links = ['http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.343097&longitude=-71.123046&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.398588&longitude=-71.24505&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.394319&longitude=-71.218049&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.365396&longitude=-71.23165&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.356719&longitude=-71.250479&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.385096&longitude=-71.208399&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.334146&longitude=-71.183298&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.374296&longitude=-71.182371&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA'] 

target = open('completedlinks.txt','ab') 
def get_data(each): 
    each = each.strip('\n') 
    r = requests.get(each) 
    source = json.loads(r.content) 
    the_file = open("output.csv", "ab") 
    writer = DictWriter(the_file, source[1].keys()) 
    writer.writeheader() 
    writer.writerows(source) 
    the_file.close() 
    target.write(each+'\n') 
    print each+"\n--------------------------" 


for each in All_links:
    try:
        get_data(each)
    except:
        pass

Answers

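Take a look at the multiprocessing package. It provides worker pools that can handle this task: Pool runs the work in separate processes, and multiprocessing.pool.ThreadPool offers the same interface backed by threads.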
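As a minimal sketch of the pool idea, assuming a thread-based pool is acceptable for this I/O-bound work: the pool size alone caps how many calls run at once. Here fake_fetch and the example.com URLs are hypothetical stand-ins for get_data and All_links, so the sketch runs without network access; the counters exist only to show that the cap holds.

```python
import threading
import time
from multiprocessing.pool import ThreadPool

MAX_WORKERS = 5  # at most this many calls run at the same time

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous calls observed


def fake_fetch(url):
    """Hypothetical stand-in for get_data(); sleeps instead of doing a request."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)  # simulate waiting on the network
    with _lock:
        _active -= 1
    return url.upper()


urls = ["http://example.com/%d" % i for i in range(12)]  # placeholder URLs

pool = ThreadPool(MAX_WORKERS)
try:
    # map() blocks until every URL is processed and keeps the input order
    results = pool.map(fake_fetch, urls)
finally:
    pool.close()
    pool.join()
```

With a fixed-size pool there is no need to split the list into chunks by hand: a worker picks up the next URL as soon as it finishes the previous one.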

Update: adding something like this should work:

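from multiprocessing import Pool

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def threadit(threads, links):
    # Handle the links one chunk at a time, so at most `threads` run at once
    for part in chunks(links, threads):
        pool = Pool(threads)
        for link in part:
            # get_data is the function defined in the question
            pool.apply_async(get_data, args=(link,))
        pool.close()
        pool.join()  # wait for this chunk before starting the next one

threadit(5, All_links)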
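One thing to keep in mind with apply_async: an exception raised inside the worker is not reported unless you call get() on the returned result object. A small sketch of collecting those results, assuming a thread pool and a hypothetical flaky_fetch in place of get_data (the example.com URLs are placeholders as well):

```python
from multiprocessing.pool import ThreadPool


def flaky_fetch(url):
    """Hypothetical stand-in for get_data(); fails for one particular URL."""
    if url.endswith("/3"):
        raise ValueError("bad response for " + url)
    return len(url)


urls = ["http://example.com/%d" % i for i in range(6)]  # placeholder URLs

pool = ThreadPool(5)
# Keep each AsyncResult next to its URL so failures can be traced back
pending = [(url, pool.apply_async(flaky_fetch, args=(url,))) for url in urls]
pool.close()
pool.join()

succeeded, failed = [], []
for url, async_result in pending:
    try:
        async_result.get()  # re-raises whatever the worker raised
        succeeded.append(url)
    except Exception as exc:
        failed.append((url, str(exc)))
```

After this loop, failed holds the URLs worth retrying, which is more informative than a bare except that discards the error.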

I can't figure out how to limit the number of threads to "N". Could you please show how, using the code above?
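One possible way to cap the active threads at N with the plain threading module is a semaphore. This is only a sketch under assumptions: N, worker and the example.com URLs are hypothetical names, and the sleep stands in for the real get_data call.

```python
import threading
import time

N = 5  # hypothetical limit on simultaneously active workers
slots = threading.BoundedSemaphore(N)
lock = threading.Lock()
running = 0
peak = 0   # highest number of workers seen inside the guarded section
done = []


def worker(url):
    global running, peak
    with slots:  # blocks while N workers are already inside
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.02)  # stand-in for get_data(url)
        with lock:
            running -= 1
            done.append(url)


urls = ["http://example.com/%d" % i for i in range(12)]  # placeholder URLs

threads = [threading.Thread(target=worker, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that this starts one thread per URL and merely makes the extra ones wait; a fixed-size pool avoids creating the idle threads in the first place.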