2010-06-30 288 views
10

Tracing a deadlock in Ruby

I use BrB to share a data source among various worker processes in Ruby 1.9, which I fork with Process#fork as follows:

Thread.abort_on_exception = true 

fork do 
    puts "Initializing data source process... (PID: #{Process.pid})" 
    data = DataSource.new(files) 

    BrB::Service.start_service(:object => data, :verbose => false, :host => host, :port => port) 
    EM.reactor_thread.join 
end

The workers are forked as follows:

8.times do |t|
    fork do
        data = BrB::Tunnel.create(nil, "brb://#{host}:#{port}", :verbose => false)

        puts "Launching #{threads_num} worker threads... (PID: #{Process.pid})"

        threads = []
        threads_num.times { |i|
            threads << Thread.new {
                while true
                    begin
                        worker = Worker.new(data, config)

                    rescue OutOfTargetsError
                        break

                    rescue Exception => e
                        puts "An unexpected exception was caught: #{e.class} => #{e}"
                        sleep 5

                    end
                end
            }
        }
        threads.each { |t| t.join }

        data.stop_service
        EM.stop
    end
end

This works perfectly, but after about 10 minutes of running I get the following error:

bootstrap.rb:47:in `join': deadlock detected (fatal)
     from bootstrap.rb:47:in `block in <main>'
     from bootstrap.rb:39:in `fork'
     from bootstrap.rb:39:in `<main>'

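This error doesn't tell me where the deadlock actually occurs; it only points to the join on the EventMachine thread.

How can I trace where the program locks up?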

+0

Have you tried putting 'Thread.exit' before the end of the block? – glebm 2012-12-09 07:25:21

Answers

5

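It is locking on the join in the parent thread, so that information is accurate. To trace where it locks in the child thread, try wrapping the thread's work in a timeout block. You will need to temporarily remove the catch-all rescue so that the timeout exception can propagate.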
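As a rough sketch of the timeout idea, the work can be wrapped in Timeout.timeout from Ruby's standard library, which raises Timeout::Error when the limit is exceeded. The helper name run_with_timeout and the limits used here are assumptions for illustration, not part of the original code; in the real program the block would contain the Worker.new(data, config) call.

```ruby
require 'timeout'

# Hypothetical helper (not from the original code): runs the given block
# under a time limit and reports the backtrace if the limit is exceeded.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error => e
  warn "Timed out after #{seconds}s in #{Thread.current.inspect}"
  warn e.backtrace.join("\n")
  :timed_out
end

# Stand-in for the real work: sleeps longer than the limit allows.
result = run_with_timeout(0.2) { sleep 1; :finished }
```

Note that a `rescue Exception` inside the block would swallow the Timeout::Error as well, which is why the catch-all rescue has to come out while debugging.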

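Currently the parent thread tries to join all the threads in order, blocking until each one completes. However, each thread only finishes on an OutOfTargetsError. You might avoid the deadlock by using short-lived threads and moving the while loop into the parent. No guarantees, but maybe something like this?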

8.times do |t|
    fork do
        running = true
        Signal.trap("INT") do
            puts "Interrupt signal received, waiting for threads to finish..."
            running = false
        end

        data = BrB::Tunnel.create(nil, "brb://#{host}:#{port}", :verbose => false)

        puts "Launching max #{threads_num} worker threads... (PID: #{Process.pid})"

        threads = []
        while running
            # Start new threads until we have threads_num running
            until threads.length >= threads_num do
                threads << Thread.new {
                    begin
                        worker = Worker.new(data, config)
                    rescue OutOfTargetsError
                    rescue Exception => e
                        puts "An unexpected exception was caught: #{e.class} => #{e}"
                        sleep 5
                    end
                }
            end

            # Make sure the parent process doesn't spin too much
            sleep 1

            # Join finished threads
            finished_threads = threads.reject(&:status)
            threads -= finished_threads
            finished_threads.each(&:join)
        end

        data.stop_service
        EM.stop
    end
end
+0

Hey man, any luck with this approach? – captainpete 2011-05-05 22:58:13

2

I had the same problem and solved it using this code snippet.

# Wait for all threads (other than the current thread and
# main thread) to stop running.
# Assumes that no new threads are started while waiting
def join_all
    main    = Thread.main     # The main thread
    current = Thread.current  # The current thread
    all     = Thread.list     # All threads still running
    # Now call join on each thread
    all.each { |t| t.join unless t == current or t == main }
end

Source: The Ruby Programming Language, O'Reilly (2008)
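A possible variation on the snippet above, offered only as a sketch: Thread#join accepts an optional limit in seconds and returns nil if the thread has not finished by then, so the threads that are still stuck can be collected and their backtraces printed. The name join_all_with_limit and the 0.1 second limit are assumptions made for this example and do not come from the book.

```ruby
# Hypothetical variant of join_all: joins every thread except the current
# and main threads, waiting at most `limit` seconds for each one, and
# returns the threads that were still running after their wait.
def join_all_with_limit(limit)
  others = Thread.list.reject { |t| t == Thread.current || t == Thread.main }
  stuck = others.reject { |t| t.join(limit) } # join returns nil on timeout
  stuck.each do |t|
    warn "Thread still running: #{t.inspect}"
    warn((t.backtrace || []).join("\n"))
  end
  stuck
end

quick = Thread.new { :done }
slow  = Thread.new { sleep 30 } # stands in for a worker that never returns
stuck = join_all_with_limit(0.1)
slow.kill
```

The printed backtrace of each stuck thread shows the line it is currently blocked on, which is usually more informative than the location of the join call in the parent.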