Python: streaming data through subprocesses without deadlock?

I'm writing a small script to shuffle a large amount of data around. It goes something like this:

from subprocess import Popen, PIPE

# input_files is assumed to be a list of input paths defined earlier in the script.
linecount = 0
outproc = None
for input_file in input_files:
    p = Popen('process_input "%s" | more_input_processing' % (input_file,),
              shell=True, stdout=PIPE)
    for line in p.stdout:
        # Start a new output process every million lines.
        if linecount % 1000000 == 0:
            outfile = "output%03d" % (linecount // 1000000,)
            if outproc:
                outproc.stdin.close()
                result = outproc.wait()  # <-- deadlock here
                assert result == 0, "outproc exited with %s" % (result,)
            outproc = Popen('handle_output "%s"' % (outfile,),
                            shell=True, stdin=PIPE)
        linecount += 1
        outproc.stdin.write(line)
    p.stdout.close()
    result = p.wait()
    assert result == 0, "p exited with %s" % (result,)

As the documentation warns, though, I hit a deadlock when I try to wait for outproc (see the comment in the code).

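The "solution" the documentation proposes is to use .communicate()... but that would mean reading all of the input into memory before flushing it, which is not desirable.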

So, how can I stream data between subprocesses without deadlocking?

2010-12-05 David Wolever

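OK, so if I don't actually wait on the subprocesses (i.e., remove all the calls to `.wait()`), everything seems to work, which is fine for this script (it's only a one-off). It would be nice to figure out how to make it work properly, though... – 2010-12-05 22:35:25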
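
One possible explanation, which the question itself does not confirm: in Python 2 on POSIX, `Popen` defaults to `close_fds=False`, so a `p` started while `outproc` is still alive inherits a copy of the write end of `outproc`'s stdin pipe. Closing `outproc.stdin` in the parent then never delivers EOF, and `outproc.wait()` blocks while `p` in turn blocks on its unread output. The following is a minimal sketch under that assumption, not a definitive fix. The function name `stream_in_batches`, its parameters, and the small Python child commands are hypothetical stand-ins for `process_input` and `handle_output`.

```python
import subprocess
import sys


def stream_in_batches(producer_cmds, make_consumer_cmd, batch_size):
    """Feed lines from each producer into consumers, batch_size lines apiece.

    Returns (total_lines, consumer_exit_codes). All names are illustrative.
    """
    consumer = None
    exit_codes = []
    linecount = 0

    def finish(proc):
        # Closing stdin signals EOF, but only if no other process still
        # holds a copy of the write end of this pipe.
        proc.stdin.close()
        exit_codes.append(proc.wait())

    for cmd in producer_cmds:
        # close_fds=True keeps this child from inheriting consumer.stdin.
        producer = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                    close_fds=True)
        for line in producer.stdout:
            if linecount % batch_size == 0:
                if consumer is not None:
                    finish(consumer)
                consumer = subprocess.Popen(
                    make_consumer_cmd(linecount // batch_size),
                    stdin=subprocess.PIPE, close_fds=True)
            consumer.stdin.write(line)
            linecount += 1
        producer.stdout.close()
        if producer.wait() != 0:
            raise RuntimeError("producer exited with %s" % producer.returncode)

    if consumer is not None:
        finish(consumer)
    return linecount, exit_codes


if __name__ == "__main__":
    # Hypothetical stand-ins: each producer prints five lines, and each
    # consumer simply drains its stdin.
    producers = [[sys.executable, "-c", "for i in range(5): print(i)"]] * 2
    drain = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    print(stream_in_batches(producers, lambda n: drain, batch_size=4))
```

Only one line is held in memory at a time, so nothing is buffered the way `.communicate()` would buffer it. On Python 3.2 and later, `close_fds=True` is already the default on POSIX, so passing it explicitly mainly matters for older interpreters such as the one the question's `xreadlines()` call implies.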