I'm using Python. I have 100 zip files, and each zip file contains more than 100 XML files. From those XML files I create CSV files. Python, multiprocessing: how can I optimize this code and make it faster?
import csv
import zipfile
from multiprocessing import Process
from xml.etree.ElementTree import fromstring

def parse_xml_for_csv1(data, writer1):
    root = fromstring(data)
    for node in root.iter('name'):
        # writerow expects a sequence; a bare string would be split into characters
        writer1.writerow([node.get('value')])

def create_csv1():
    with open('output1.csv', 'w', newline='') as f1:
        writer1 = csv.writer(f1)
        for i in range(1, 101):  # xml1.zip .. xml100.zip
            z = zipfile.ZipFile('xml' + str(i) + '.zip')
            # z.namelist() contains more than 100 xml files
            for finfo in z.namelist():
                data = z.read(finfo)
                parse_xml_for_csv1(data, writer1)

def create_csv2():
    with open('output2.csv', 'w', newline='') as f2:
        writer2 = csv.writer(f2)
        for i in range(1, 101):
            ...

if __name__ == "__main__":
    p1 = Process(target=create_csv1)
    p2 = Process(target=create_csv2)
    p1.start()
    p2.start()
    p1.join()
    p2.join()
Please tell me: how can I optimize my code and make it faster?
How big is each uncompressed XML file? And the CSVs you are writing? – goncalopp
goncalopp, the XML files are small (about 10 lines each). I only need 2 CSV files. – Olga
I would use lxml for the parsing and do as much of the work as possible at the C level: http://lxml.de/FAQ.html#id1 –
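Beyond swapping the parser, the structure itself can be parallelized further than two processes: since the XML files are tiny, the per-zip work is the natural unit to distribute. A minimal sketch (assuming the `xmlN.zip` naming from the question, and that one CSV column per `name` node is wanted) uses a `multiprocessing.Pool` so all cores parse zips concurrently, while the parent process keeps the single `csv.writer`, which cannot safely be shared across processes:

```python
import csv
import zipfile
from multiprocessing import Pool
from xml.etree.ElementTree import fromstring

def parse_zip(zip_path):
    # Parse every XML member of one zip file and return the extracted rows.
    rows = []
    with zipfile.ZipFile(zip_path) as z:
        for name in z.namelist():
            root = fromstring(z.read(name))
            for node in root.iter('name'):
                rows.append([node.get('value')])
    return rows

def create_csv(zip_paths, out_path):
    # Workers parse zip files in parallel; only the parent writes the CSV,
    # and imap preserves the input order of the zip files.
    with Pool() as pool, open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        for rows in pool.imap(parse_zip, zip_paths):
            writer.writerows(rows)

if __name__ == '__main__':
    # Hypothetical usage with the question's file layout:
    # create_csv(['xml%d.zip' % i for i in range(1, 101)], 'output1.csv')
    pass
```

The same `create_csv` can be called a second time with a different parse function for `output2.csv`, so one pool pattern covers both outputs instead of one hand-rolled `Process` per CSV.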