Waiting for completion

To maximize CPU usage (I run things on a Debian Lenny instance in EC2) I have a simple script that launches jobs in parallel:

#!/bin/bash 

for i in apache-200901*.log; do echo "Processing $i ..."; do_something_important; done & 
for i in apache-200902*.log; do echo "Processing $i ..."; do_something_important; done & 
for i in apache-200903*.log; do echo "Processing $i ..."; do_something_important; done & 
for i in apache-200904*.log; do echo "Processing $i ..."; do_something_important; done & 
...

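I'm quite happy with this working solution, but I can't figure out how to write further code that only runs once all of the loops have finished.

Is there a way to control this?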


2009-07-15 mark

There is a bash builtin command for this.

wait [n ...]
     Wait for each specified process and return its termination
     status. Each n may be a process ID or a job specification; if a
     job spec is given, all processes in that job's pipeline are
     waited for. If n is not given, all currently active child
     processes are waited for, and the return status is zero. If n
     specifies a non-existent process or job, the return status is
     127. Otherwise, the return status is the exit status of the
     last process or job waited for.
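
As a minimal sketch of how the excerpt above applies to the question: the loops are backgrounded, their PIDs are recorded, and wait is called once per PID to collect each exit status. The do_something_important function and the month list here are hypothetical stand-ins, not the asker's real code.

```bash
#!/bin/bash
# Hypothetical stand-in for the real per-file work; fails for one "month".
do_something_important() { sleep 1; [ "$1" != "200902" ]; }

pids=()
for month in 200901 200902 200903; do
    # Each background subshell plays the role of one "for ...; done &" loop.
    ( echo "Processing $month ..."; do_something_important "$month" ) &
    pids+=("$!")                 # remember the PID of the job just started
done

failed=0
for pid in "${pids[@]}"; do
    # "wait PID" returns the exit status of that particular process.
    wait "$pid" || failed=$((failed + 1))
done
echo "All loops finished, $failed failed"

# With no arguments, wait blocks until every remaining child has exited
# and returns zero regardless of how the children exited.
sleep 1 &
wait
plain_wait_status=$?
echo "plain wait returned $plain_wait_status"
```

If the individual exit statuses do not matter, a single bare wait after the last backgrounded loop is enough; the per-PID form is only needed to find out which jobs failed.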


2009-07-15 13:48:30 eduffy

That solved my problem quickly, great, thanks! – mark 2009-07-15 14:03:17

Using GNU Parallel will make your script even shorter and possibly more efficient:

parallel 'echo "Processing "{}" ..."; do_something_important {}' ::: apache-*.log

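This runs one job per CPU core and keeps doing so until all the files have been processed.

Your solution basically splits the jobs into groups before running them. Here are 32 jobs in 4 groups: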

Simple scheduling

GNU Parallel instead spawns a new process whenever one finishes, which keeps the CPUs busy and therefore saves time:

GNU Parallel scheduling

To learn more:

Watch the intro videos for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.


2014-03-28 19:31:36

Here is my crude solution:

function run_task {
     cmd=$1          # command line to run (word-split on purpose below)
     output=$2       # file that receives the command's stdout
     concurrency=$3  # limit on the number of background jobs
     if [ -f "${output}.done" ]; then
       # experiment already run
       echo "Command already run: $cmd. Found output $output"
       return
     fi
     count=$(jobs -p | wc -l)
     echo "New active task #$count: $cmd > $output"
     $cmd > "$output" && touch "$output.done" &
     # count was sampled before the launch, hence >= here and > in the loop
     stop=$((count >= concurrency))
     while [ "$stop" -eq 1 ]; do
       echo "Waiting for $count worker threads..."
       sleep 1
       count=$(jobs -p | wc -l)
       stop=$((count > concurrency))
     done
}

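The idea is to use "jobs" to see how many children are still running in the background and to wait until that number drops (a child exits). Once a child has exited, the next task can be started.

As you can see, there is also some extra logic to avoid running the same experiment/command more than once. It does the job for me. However, this logic could be skipped or improved further (e.g., by checking file creation timestamps, input parameters, etc.).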
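
For comparison, here is a rough sketch of the same throttling idea that blocks on wait -n instead of polling with sleep. It assumes bash 4.3 or newer (where wait -n is available); max_jobs, outdir, and the dummy tasks are hypothetical names chosen for this illustration, and the "already run" check is left out.

```bash
#!/bin/bash
# Assumes bash 4.3+ for "wait -n".
max_jobs=2             # hypothetical concurrency limit
outdir=$(mktemp -d)    # scratch directory used only by this demo

for n in 1 2 3 4 5; do
    # While the number of running jobs is at the limit, block until one exits.
    while [ "$(jobs -pr | wc -l)" -ge "$max_jobs" ]; do
        wait -n
    done
    # Dummy task: pretend to work, then write an output file.
    ( sleep 1; echo "task $n" > "$outdir/$n.out" ) &
done

wait                   # wait for the jobs that are still running
finished=$(ls "$outdir" | wc -l)
echo "$finished tasks finished"
rm -rf "$outdir"
```

Here a new task starts as soon as a slot frees up rather than after the next one-second poll, at the cost of requiring a newer bash.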


2015-05-18 16:28:33 Radu

I had to do this recently and ended up with the following solution:

while true; do 
    wait -n || { 
    code="$?" 
    ([[ $code = "127" ]] && exit 0 || exit "$code") 
    break 
    } 
done;

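Here is how it works:

wait -n returns as soon as one of the (potentially many) background jobs exits. As long as that job succeeded it evaluates to true and the loop goes on, until one of the following happens:

Exit code 127: there are no background jobs left to wait for, so the last one has exited successfully. In that case we ignore the exit code and exit the subshell with code 0.
Any of the background jobs failed. We simply exit the subshell with that job's exit code.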

Combined with set -e, this guarantees that the script terminates early and passes through the exit code of any failed background job.
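
To see the early termination in action, the loop can be dropped into a small throwaway script. This is only a sketch: the two background jobs, the exit code 3, and the temp-file wrapper are made up for the demonstration, and bash 4.3 or newer is assumed for wait -n. The script runs in a child bash so that its set -e exit does not terminate the calling shell.

```bash
#!/bin/bash
# Write the demo script to a temp file and run it in a child bash process.
demo=$(mktemp)
cat > "$demo" <<'EOF'
set -e
(sleep 1; exit 0) &      # hypothetical job that succeeds
(sleep 2; exit 3) &      # hypothetical job that fails with code 3
while true; do
    wait -n || {
        code="$?"
        ([[ $code = "127" ]] && exit 0 || exit "$code")
        break
    }
done
echo "all background jobs succeeded"   # not reached when a job fails
EOF

status=0
bash "$demo" || status=$?
echo "script exited with status $status"
rm -f "$demo"
```

With these two jobs the child script should stop at the failing job and report its exit code, so the final echo in the demo script is skipped.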


2017-05-04 07:41:18
