2016-11-06 46 views
6

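Python multiprocessing - catching signals to restart child processes or shut down the parent process

I'm using the multiprocessing library to spawn two child processes. I want to make sure that, as long as the parent process is alive, the children are restarted automatically if they die (after receiving SIGKILL or SIGTERM). On the other hand, if the parent process receives SIGTERM/SIGINT, I want it to terminate all child processes and then exit.

This is how I approached the problem: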

import sys 
import time 
from signal import signal, SIGINT, SIGTERM, SIGQUIT, SIGCHLD, SIG_IGN 
from functools import partial 
import multiprocessing 
import setproctitle 

class HelloWorld(multiprocessing.Process): 
    def __init__(self): 
     super(HelloWorld, self).__init__() 

     # ignore, let parent handle it 
     signal(SIGTERM, SIG_IGN) 

    def run(self): 

     setproctitle.setproctitle("helloProcess") 

     while True: 
      print "Hello World" 
      time.sleep(1) 

class Counter(multiprocessing.Process): 
    def __init__(self): 
     super(Counter, self).__init__() 

     self.counter = 1 

     # ignore, let parent handle it 
     signal(SIGTERM, SIG_IGN) 

    def run(self): 

     setproctitle.setproctitle("counterProcess") 

     while True: 
      print self.counter 
      time.sleep(1) 
      self.counter += 1 


def signal_handler(helloProcess, counterProcess, signum, frame): 

    print multiprocessing.active_children() 
    print "helloProcess: ", helloProcess 
    print "counterProcess: ", counterProcess 

    if signum == 17: 

     print "helloProcess: ", helloProcess.is_alive() 

     if not helloProcess.is_alive(): 
      print "Restarting helloProcess" 

      helloProcess = HelloWorld() 
      helloProcess.start() 

     print "counterProcess: ", counterProcess.is_alive() 

     if not counterProcess.is_alive(): 
      print "Restarting counterProcess" 

      counterProcess = Counter() 
      counterProcess.start() 

    else: 

     if helloProcess.is_alive(): 
      print "Stopping helloProcess" 
      helloProcess.terminate() 

     if counterProcess.is_alive(): 
      print "Stopping counterProcess" 
      counterProcess.terminate() 

     sys.exit(0) 



if __name__ == '__main__': 

    helloProcess = HelloWorld() 
    helloProcess.start() 

    counterProcess = Counter() 
    counterProcess.start() 

    for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
     signal(signame, partial(signal_handler, helloProcess, counterProcess)) 

    multiprocessing.active_children() 

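If I send SIGKILL to counterProcess, it restarts correctly. However, sending SIGKILL to helloProcess also restarts counterProcess instead of helloProcess?

If I send SIGTERM to the parent process, the parent exits, but the child processes become orphans and keep running. How do I correct this behavior?

Answers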

1

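To recreate dead children from a signal.SIGCHLD handler, the mother must call one of the os.wait functions, because Process.is_alive doesn't work there.
Although this is possible, it is complicated, because signal.SIGCHLD is delivered to the mother whenever the status of one of its children changes, e.g. when the child receives signal.SIGSTOP, signal.SIGCONT or any terminating signal.
So the signal.SIGCHLD handler must distinguish between these states of the child. Merely recreating children whenever signal.SIGCHLD is delivered may create more children than necessary.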
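As a rough illustration of how those child states can be told apart, here is a minimal sketch that decodes the status value returned by os.waitpid. It is written for Python 3 on a POSIX system (it relies on os.fork), and the helper names describe_status and spawn_and_reap are hypothetical, introduced only for this example.

```python
import os
import signal
import time


def describe_status(status):
    """Translate a raw wait status into a short description."""
    if os.WIFSTOPPED(status):
        return "stopped"            # e.g. the child received SIGSTOP
    if os.WIFCONTINUED(status):
        return "continued"          # e.g. the child received SIGCONT
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown"


def spawn_and_reap(exit_code=None, kill_signal=None):
    """Fork a child, optionally signal it, and return its wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, or sleep until the parent signals it.
        if exit_code is not None:
            os._exit(exit_code)
        time.sleep(30)
        os._exit(0)
    # Parent: optionally signal the child, then block until it is reaped.
    if kill_signal is not None:
        os.kill(pid, kill_signal)
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    print(describe_status(spawn_and_reap(exit_code=3)))
    print(describe_status(spawn_and_reap(kill_signal=signal.SIGKILL)))
```

Only the last two cases (killed by a signal, or exited) mean the child is gone and may need to be recreated; a stopped or continued child is still alive.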

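The code below uses os.waitpid with os.WNOHANG to make it non-blocking, and with os.WUNTRACED and os.WCONTINUED to learn whether signal.SIGCHLD was caused by signal.SIGSTOP or signal.SIGCONT.
os.waitpid doesn't work, i.e. it returns (0, 0), if any of the Process instances is printed, i.e. if str(Process()) is evaluated, before you call os.waitpid.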

import sys 
import time 
from signal import signal, pause, SIGINT, SIGTERM, SIGQUIT, SIGCHLD, SIG_DFL 
import multiprocessing 
import os 

class HelloWorld(multiprocessing.Process): 
    def run(self): 
     # reset SIGTERM to default for Process.terminate to work 
     signal(SIGTERM, SIG_DFL) 
     while True: 
      print "Hello World" 
      time.sleep(1) 

class Counter(multiprocessing.Process): 
    def __init__(self): 
     super(Counter, self).__init__() 
     self.counter = 1 

    def run(self): 
     # reset SIGTERM to default for Process.terminate to work 
     signal(SIGTERM, SIG_DFL) 
     while True: 
      print self.counter 
      time.sleep(1) 
      self.counter += 1 


def signal_handler(signum, _): 
    global helloProcess, counterProcess 

    if signum == SIGCHLD: 
     pid, status = os.waitpid(-1, os.WNOHANG|os.WUNTRACED|os.WCONTINUED) 
     if os.WIFCONTINUED(status) or os.WIFSTOPPED(status): 
      return 
     if os.WIFSIGNALED(status) or os.WIFEXITED(status): 
      if helloProcess.pid == pid: 
       print("Restarting helloProcess") 
       helloProcess = HelloWorld() 
       helloProcess.start() 

      elif counterProcess.pid == pid: 
       print("Restarting counterProcess") 
       counterProcess = Counter() 
       counterProcess.start() 

    else: 
     # mother shouldn't be notified when it terminates children 
     signal(SIGCHLD, SIG_DFL) 
     if helloProcess.is_alive(): 
      print("Stopping helloProcess") 
      helloProcess.terminate() 

     if counterProcess.is_alive(): 
      print("Stopping counterProcess") 
      counterProcess.terminate() 

     sys.exit(0) 

if __name__ == '__main__': 

    helloProcess = HelloWorld() 
    helloProcess.start() 

    counterProcess = Counter() 
    counterProcess.start() 

    for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
     signal(signame, signal_handler) 

    while True: 
     pause() 

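The following code recreates dead children without using signal.SIGCHLD, so it is simpler than the previous version.
After creating the two children, the mother process installs a signal handler named term_child for SIGINT, SIGTERM and SIGQUIT. When invoked, term_child terminates and joins each child.

The mother process keeps checking whether the children are alive and recreates them, if necessary, in the while loop.

Because every child inherits the mother's signal handlers, the SIGTERM handler has to be reset to its default in the child for Process.terminate to work.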

import sys 
import time 
from signal import signal, SIGINT, SIGTERM, SIGQUIT, SIG_DFL
import multiprocessing 

class HelloWorld(multiprocessing.Process):  
    def run(self): 
     signal(SIGTERM, SIG_DFL) 
     while True: 
      print "Hello World" 
      time.sleep(1) 

class Counter(multiprocessing.Process): 
    def __init__(self): 
     super(Counter, self).__init__() 
     self.counter = 1 

    def run(self): 
     signal(SIGTERM, SIG_DFL) 
     while True: 
      print self.counter 
      time.sleep(1) 
      self.counter += 1 

def term_child(_, __): 
    for child in children: 
     child.terminate() 
     child.join() 
    sys.exit(0) 

if __name__ == '__main__': 

    children = [HelloWorld(), Counter()] 
    for child in children: 
     child.start() 

    for signame in (SIGINT, SIGTERM, SIGQUIT): 
     signal(signame, term_child) 

    while True: 
     for i, child in enumerate(children): 
      if not child.is_alive(): 
       children[i] = type(child)() 
       children[i].start() 
     time.sleep(1) 
5

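There are several issues with the code, so I'll go over them sequentially.

If I send SIGKILL to counterProcess, it restarts correctly. However, sending SIGKILL to helloProcess also restarts counterProcess instead of helloProcess?

This peculiar behavior is most likely due to the lack of a blocking call in the main process, since multiprocessing.active_children() doesn't really act as one. I can't fully explain the exact reason for the program's behavior, but adding a blocking call to the __main__ block, for example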

while True: 
    time.sleep(1) 

solves the issue.

Another, quite serious, issue is the way you pass objects into the handler:

helloProcess = HelloWorld() 
... 
partial(signal_handler, helloProcess, counterProcess) 

This becomes obsolete once you create new objects inside the handler:

if not helloProcess.is_alive(): 
    print "Restarting helloProcess" 

    helloProcess = HelloWorld() 
    helloProcess.start() 

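Note that the two places use different names for the HelloWorld() objects. The partial object is bound to the name in the __main__ block, while the object created in the callback is bound to a name in the callback's local scope. Therefore, assigning a new object to the local name does not affect the object the callback is bound to (it is still bound to the object created in the __main__ scope).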
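The scoping point above can be seen without any processes at all. The following is a small Python 3 sketch; the Worker class, the handler names and the registry dictionary are hypothetical stand-ins used only to illustrate the binding behavior, not part of the original code.

```python
from functools import partial


class Worker(object):
    """Stand-in for a Process subclass; it only carries a label."""
    def __init__(self, label):
        self.label = label


def handler_rebinding_local(worker, signum, frame):
    # This assignment only changes the local name `worker`.
    # The partial object keeps pointing at the object it was built with.
    worker = Worker("replacement")
    return worker


def handler_updating_registry(registry, signum, frame):
    # Mutating a shared container is visible on the next call,
    # because the container itself is never rebound.
    registry["hello"] = Worker("replacement")


original = Worker("original")
callback = partial(handler_rebinding_local, original)
callback(17, None)
print(callback.args[0].label)    # still the original object

registry = {"hello": Worker("original")}
shared_callback = partial(handler_updating_registry, registry)
shared_callback(17, None)
print(registry["hello"].label)   # the replacement is now visible
```

Passing a mutable container, as in the second handler, is one possible way to avoid the stale reference; it plays the same role as the module-level globals used in the other answer.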

You can fix this by rebinding your signal callback to the new objects, in the same way, inside the callback scope:

def signal_handler(...): 
    ... 
    for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
     signal(signame, partial(signal_handler, helloProcess, counterProcess)) 
    ... 

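However, this leads to another trap, because now every child process inherits the callback from the parent and invokes it each time it receives a signal. To fix this, you can temporarily set the signal handlers to their defaults right before creating a child process: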

for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
    signal(signame, SIG_DFL) 
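One way to keep the "reset, spawn, restore" sequence in a single place is a small context manager. This is only a sketch for Python 3, assuming it is called from the main thread (where signal.signal is allowed); the name default_handlers is hypothetical and not part of the signal module.

```python
import signal
from contextlib import contextmanager


@contextmanager
def default_handlers(signums):
    """Temporarily reset the given signals to SIG_DFL, then restore them."""
    # Remember what is currently installed for each signal.
    saved = {signum: signal.getsignal(signum) for signum in signums}
    try:
        for signum in signums:
            signal.signal(signum, signal.SIG_DFL)
        yield
    finally:
        for signum, handler in saved.items():
            # getsignal() returns None for handlers not installed from
            # Python; those cannot be restored with signal.signal().
            if handler is not None:
                signal.signal(signum, handler)


# Intended use (HelloWorld is the Process subclass from the question):
#
#     with default_handlers([signal.SIGINT, signal.SIGTERM, signal.SIGCHLD]):
#         child = HelloWorld()
#         child.start()
```

With a fork-based start, a child created inside the with block starts out with the default dispositions, while the parent gets its own handlers back as soon as the block exits.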

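Finally, you may want to suppress any signals coming from your child processes before terminating them, otherwise they would trigger the callback again: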

signal(SIGCHLD, SIG_IGN) 

Note that you may want to redesign the architecture of your application and make use of some of the features that multiprocessing provides.

Final code:

import sys 
import time 
from signal import signal, SIGINT, SIGTERM, SIGQUIT, SIGCHLD, SIG_IGN, SIG_DFL 
from functools import partial 
import multiprocessing 
#import setproctitle 

class HelloWorld(multiprocessing.Process): 
    def __init__(self): 
     super(HelloWorld, self).__init__() 

     # ignore, let parent handle it 
     #signal(SIGTERM, SIG_IGN) 

    def run(self): 

     #setproctitle.setproctitle("helloProcess") 

     while True: 
      print "Hello World" 
      time.sleep(1) 

class Counter(multiprocessing.Process): 
    def __init__(self): 
     super(Counter, self).__init__() 

     self.counter = 1 

     # ignore, let parent handle it 
     #signal(SIGTERM, SIG_IGN) 

    def run(self): 

     #setproctitle.setproctitle("counterProcess") 

     while True: 
      print self.counter 
      time.sleep(1) 
      self.counter += 1 


def signal_handler(helloProcess, counterProcess, signum, frame): 

    print multiprocessing.active_children() 
    print "helloProcess: ", helloProcess 
    print "counterProcess: ", counterProcess 

    print "current_process: ", multiprocessing.current_process() 

    if signum == SIGCHLD:

     # Since each new child inherits current signal handler, 
     # temporarily set it to default before spawning new child. 
     for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
      signal(signame, SIG_DFL) 

     print "helloProcess: ", helloProcess.is_alive() 

     if not helloProcess.is_alive(): 
      print "Restarting helloProcess" 

      helloProcess = HelloWorld() 
      helloProcess.start() 

     print "counterProcess: ", counterProcess.is_alive() 

     if not counterProcess.is_alive(): 
      print "Restarting counterProcess" 

      counterProcess = Counter() 
      counterProcess.start() 

     # After new children are spawned, revert to old signal handling policy. 
     for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
      signal(signame, partial(signal_handler, helloProcess, counterProcess)) 


    else: 

     # Ignore any signal that child communicates before quit 
     signal(SIGCHLD, SIG_IGN) 

     if helloProcess.is_alive(): 
      print "Stopping helloProcess" 
      helloProcess.terminate() 

     if counterProcess.is_alive(): 
      print "Stopping counterProcess" 
      counterProcess.terminate() 

     sys.exit(0) 



if __name__ == '__main__': 

    helloProcess = HelloWorld() 
    helloProcess.start() 

    counterProcess = Counter() 
    counterProcess.start() 

    for signame in [SIGINT, SIGTERM, SIGQUIT, SIGCHLD]: 
     signal(signame, partial(signal_handler, helloProcess, counterProcess)) 

    while True: 
     print multiprocessing.active_children() 
     time.sleep(1) 
+0

A signal.SIGCHLD handler and multiprocessing.Process don't work well together. Inside a signal.SIGCHLD handler, Process.is_alive returns True even after the child has terminated.