打印多线程子进程

美好的一天！打印多线程子进程

我有一个python脚本，它会创建一个文件列表，并在multiprocess.Pool.map和线程函数处理它。线程函数使用外部可执行文件并通过subprocess.check_call调用它。这个outter可执行文件向stdout输出一些信息。

所以我有问题，阅读此输出 - 有时它搞砸，我无法从中得到任何有用的信息。我已经阅读了python中的打印和多线程，但我认为这不完全是我的问题，因为我没有明确地在我的脚本中调用print函数。

我该如何解决这个问题？谢谢。

另外，我注意到，如果我重定向输出从我的脚本文件，输出不乱的。

[更新]：

如果我运行脚本，这工作得很好：蟒蛇mp.py> mp.log

import time, argparse, threading, sys 
from os import getenv 
from multiprocessing import Pool 

def f(x): 
    cube = x*x*x 
    print '|Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut %d|'%(cube) 
    return cube 

if __name__ == '__main__': 

    #file = open('log.txt', 'w+') 
    parser = argparse.ArgumentParser(description='cube', usage='%(prog)s [options] -n') 
    parser.add_argument('-n', action='store', help='number', dest='n', default='10000', metavar = '') 
    args = parser.parse_args() 

    pool = Pool() 

    start = time.time() 
    result = pool.map(f, range(int(args.n))) 
    end = time.time() 
    print (end - start) 
    #file.close()

来源

2013-07-19 kvv

的原因是不同的进程打印到同一个终端，让你得到一个线程的一条线，不是一条线第二个线程，比第一个线程的另一行（或至少多数民众赞成在我认为你的问题看起来像关于“凌乱的输出”） – usethedeathstar

我该如何解决这个问题？锁是毫无用处的。我也尝试用sys.stdout.write替换所有打印的expr，但它也没有帮助。 – kvv

在这种情况下，我想解决的办法是让你的外部可执行文件打印输出到为每个线程log.txt的，这样它会工作 – usethedeathstar

为了避免多个并发子过程的混合输出，可以将输出重定向

from multiprocessing.dummy import Pool # use threads 
from subprocess import call 

def run(i): 
    with open('log%d.txt' % i, 'wb') as file: 
     return call(["cmd", str(i)], stdout=file) 

return_codes = Pool(4).map(run, range(10)) # run 10 subprocesses, 4 at a time

或收集输出，并从单一线程代码打印：

每个子到不同的文件的0

from functools import partial 
from multiprocessing.dummy import Pool, Queue, Process # use threads 
from subprocess import Popen, PIPE 

def run(i, output): 
    p = Popen(["cmd", str(i)], stdout=PIPE, bufsize=1) 
    for line in iter(p.stdout.readline, b''): 
     output((p.pid, line)) # collect the output 
    p.stdout.close() 
    return p.wait() 

def print_output(q): 
    for pid, line in iter(q.get, None): 
     print pid, line.rstrip() 

q = Queue() 
Process(target=print_output, args=[q]).start() # start printing thread 
return_codes = Pool(4).map(partial(run, output=q.put_nowait), 
          range(10)) # run 10 subprocesses, 4 at a time 
q.put(None) # exit printing thread

或者你可以使用锁：

from __future__ import print_function 
from multiprocessing.dummy import Pool, Lock # use threads 
from subprocess import Popen, PIPE 

def run(i, lock=Lock()): 
    p = Popen(["cmd", str(i)], stdout=PIPE, bufsize=1) 
    for line in iter(p.stdout.readline, b''): 
     with lock: 
      print(p.pid, line.rstrip()) 
    p.stdout.close() 
    return p.wait() 

return_codes = Pool(4).map(run, range(10)) # run 10 subprocesses, 4 at a time

注：print()函数用于从问题解决该问题：Why a script that uses threads prints extra lines occasionally?

为了避免混合不同的子进程的线，你可以一次收集大于一行的单位，具体取决于实际输出量。

来源

2013-07-23 16:24:42 jfs

非常感谢，锁解决方案很好！我不知道线程函数可以把锁作为参数！ – kvv

另一种合理的通用解决方案，还采用独有的文件：

import time, argparse, threading, sys 
from os import getenv, getcwd, getpid 
from os.path import join 
from multiprocessing import Pool, cpu_count 

logger = None # Will be set by init() to give a unique logger for each process in the pool 
def init(*initargs): 
    global logger 
    print(initargs) 
    lpath = getcwd() if initargs is None or len(initargs) == 0 else initargs[0] 
    name = 'log{!s}'.format(str(getpid())) 
    logger = open(join(lpath, name), mode='wt') # Get logger with unique name 


def f(x): 
    global logger 
    cube = x*x*x 
    logger.write('|Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut {}|\n'.format(cube)) 
    logger.flush() 
    return cube 

if __name__ == '__main__': 

    #file = open('log.txt', 'w+') 
    parser = argparse.ArgumentParser(description='cube', usage='%(prog)s [options] -n') 
    parser.add_argument('-n', action='store', help='number', dest='n', default='10000', metavar = '') 
    args = parser.parse_args() 

    pool = Pool(cpu_count(), init) 

    start = time.time() 
    result = pool.map(f, range(int(args.n))) 
    end = time.time() 
    print (end - start) 
    #file.close()

来源

2013-07-23 16:39:58 Jonathan

打印多线程子进程

回答

相关问题