2015-08-31 54 views
2

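awk does not always print the header

I want to use awk to conditionally print a subset of the rows of a CSV, and the header is not always printed. It would be better if it were never printed, but the behavior is inconsistent.

For example: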

$ cat faults_main_dp_1_faultDate.csv | parallel -k -q --block 100M --pipe awk -F , '$299 > 30 {print $1 "," $5 "," $6 "," $7 "," $299}' | head | csvlook 
|-------------+-------------------+---------------------+----------------------+---------------------| 
| faultDate | assetId   | faultActiveTime  | faultActiveLatitude | ambienf    | 
|-------------+-------------------+---------------------+----------------------+---------------------| 
| 2015-03-08 | LOCOMOTIVE_ALL618 | 2015-03-08T10:52:40 | -0.40950949999999997 | 31.334039688110398 | 
| 2015-03-07 | LOCOMOTIVE_ALL618 | 2015-03-07T15:34:16 | -0.3867394   | 32.2283515930176 | 
| 2015-03-07 | LOCOMOTIVE_ALL619 | 2015-03-07T17:42:19 | -0.380149841   | 32.7088813781738 | 
| 2015-03-06 | LOCOMOTIVE_ALL618 | 2015-03-06T15:33:15 | -0.354323447   | 33.337738037109396 | 
| 2015-03-05 | LOCOMOTIVE_ALL618 | 2015-03-05T17:38:20 | -0.31172340000000004 | 32.225231170654304 | 
| 2015-03-05 | LOCOMOTIVE_ALL618 | 2015-03-05T15:24:19 | -0.302686065   | 30.0030345916748 | 
| 2015-04-20 | LOCOMOTIVE_ALL622 | 2015-04-20T19:41:22 | -0.379977226   | 31.8880805969238 | 
| 2015-04-18 | LOCOMOTIVE_ALL618 | 2015-04-18T06:59:32 | -0.38011753600000003 | 31.6899147033691 | 
| 2015-04-09 | LOCOMOTIVE_ALL623 | 2015-04-09T18:38:09 | -0.383524776   | 31.0484771728516 | 
|-------------+-------------------+---------------------+----------------------+---------------------| 

But:

$ cat faults_main_dp_1_faultDate.csv | parallel -k -q --block 100M --pipe awk -F , '$1 < "2014-03-01" {print $1 "," $5 "," $7 "," $299}' | head -n 5 | csvlook 
|-------------+-------------------+----------------------+---------------------| 
| 2014-02-28 | LOCOMOTIVE_ALL619 | -0.369633675   |      | 
|-------------+-------------------+----------------------+---------------------| 
| 2014-02-28 | LOCOMOTIVE_ALL619 | -0.375370562   |      | 
| 2014-02-28 | LOCOMOTIVE_ALL620 | -0.291365266   | 23.3389568328857 | 
| 2014-02-27 | LOCOMOTIVE_ALL618 | -0.38014966200000005 | 30.008481979370103 | 
| 2014-02-27 | LOCOMOTIVE_ALL618 | -0.38014966200000005 | 31.7841949462891 | 
|-------------+-------------------+----------------------+---------------------| 

Or:

$ cat faults_main_dp_1_faultDate.csv | parallel -k -q --block 100M --pipe awk -F , '$5 == "LOCOMOTIVE_ALL623" {print $1 "," $5 "," $7 "," $299}' | tail | csvlook 

|-------------+-------------------+----------------------+-------------------| 
| 2015-07-09 | LOCOMOTIVE_ALL623 | -0.30150732399999997 | 25.9456386566162 | 
|-------------+-------------------+----------------------+-------------------| 
| 2015-06-14 | LOCOMOTIVE_ALL623 | -0.3295847   | 34.0456199645996 | 
| 2014-08-13 | LOCOMOTIVE_ALL623 | -0.41220685799999995 |     | 
| 2014-10-20 | LOCOMOTIVE_ALL623 | -0.415138245   |     | 
| 2015-08-21 | LOCOMOTIVE_ALL623 | -0.38848757700000003 | 30.0110931396484 | 
| 2015-08-25 | LOCOMOTIVE_ALL623 | -0.41773062899999996 |     | 
| 2015-04-21 | LOCOMOTIVE_ALL623 | -0.4055466   | 36.1775779724121 | 
| 2014-02-18 | LOCOMOTIVE_ALL623 | -0.418272376   |     | 
| 2013-12-24 | LOCOMOTIVE_ALL623 | -0.3781222   | 21.1109352111816 | 
| 2015-03-13 | LOCOMOTIVE_ALL623 | -0.35584770000000004 |     | 
|-------------+-------------------+----------------------+-------------------| 
+2

Since you are using `parallel`, you cannot guarantee that the first line of the file will be output first. –

+0

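Correct. So what I do is grab the header first (I am importing this into R), then get the main CSV without the header, and pass the header to the function that creates the data frame. It would be nice if there were a way to grab the header first, print it, and then run the rest of the call. –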
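One way to do what this comment asks for is to take the header with `head -n 1` and send only the remaining lines through the filter. The sketch below is a minimal illustration, not the asker's actual pipeline: the three-column `sample` data is made up, and plain `awk` stands in for the `parallel --pipe awk` call.

```shell
# Hypothetical sample data standing in for the real CSV file.
sample='faultDate,assetId,ambient
2014-02-28,LOCOMOTIVE_ALL619,23.3
2015-03-08,LOCOMOTIVE_ALL618,31.3'

# Emit the header once, then filter only the data lines (line 2 onward).
split_out=$(
  printf '%s\n' "$sample" | head -n 1
  printf '%s\n' "$sample" | tail -n +2 | awk -F, '$1 < "2014-03-01"'
)

printf '%s\n' "$split_out"
```

With a real file, the same idea would read the file twice (`head -n 1 file.csv`, then `tail -n +2 file.csv`), and the `tail` stream could presumably be piped into the `parallel` command from the question, since the header has already been removed from it.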

Answers

1

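Each condition is applied to every line, including the header. If the header does not satisfy the condition, you will not see it in the output. However, you can do this: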

awk -F, 'NR==1{print;next} your condition here{your action here}' 

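This prints the first line unconditionally and then moves on to the next line, where your existing condition{action} applies. If you do not need the full header line but only certain fields, you need to specify them as well: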

awk -F, -vOFS=, 'NR==1{print $1,$5,$7,$299;next} your condition here{your action here}' 
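To make the pattern concrete, here is a minimal sketch run on made-up data. The `sample` contents and the use of fields 1 to 3 are assumptions for illustration; the real file uses fields such as `$5` and `$299`.

```shell
# Hypothetical sample data standing in for the real CSV file.
sample='faultDate,assetId,ambient
2015-03-08,LOCOMOTIVE_ALL618,31.3
2015-03-07,LOCOMOTIVE_ALL623,29.9
2015-03-06,LOCOMOTIVE_ALL623,33.3'

# NR == 1 prints the chosen header fields and skips the condition;
# every later line is printed only if it matches the condition.
result=$(printf '%s\n' "$sample" | awk -F, -v OFS=, '
  NR == 1 { print $1, $2, $3; next }
  $2 == "LOCOMOTIVE_ALL623" { print $1, $2, $3 }
')

printf '%s\n' "$result"
```

Note that under `parallel --pipe` each block is handed to a separate `awk` process, so `NR == 1` would match the first line of every block rather than only the header of the file.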
+0

Thanks. But why does the first condition print the header? How does `"ambienf" > 30` evaluate to true? –

+0

`awk` can convert the number to a string for the comparison. For example, `echo "apple" | awk '$1 > 1'` will print apple. – karakfa
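The difference can be seen in a small sketch. The header name `ambienf` comes from the question's output, while the two numeric rows are made up. Adding `0` to the field is one common way to force a numeric comparison, so that a non-numeric header is treated as 0 and never passes `> 30`.

```shell
# Header name taken from the question's output; the numeric rows are made up.
rows='ambienf
31.3
29.9'

# Plain comparison: "ambienf" is not numeric, so awk compares it as a
# string against "30", and a letter sorts after a digit.
loose=$(printf '%s\n' "$rows" | awk '$1 > 30')

# Adding 0 forces a numeric comparison; "ambienf" is treated as 0.
strict=$(printf '%s\n' "$rows" | awk '$1 + 0 > 30')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

By the same string rule, `"faultDate" < "2014-03-01"` and `"assetId" == "LOCOMOTIVE_ALL623"` are both false, which would explain why the header is missing from the other two commands in the question.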

+0

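Thanks again. I have a question about your solution: I think it tries to print all of the columns, but I only need to print the [names] of the columns I specified. I tried the following, but it neither did the comparison nor printed the header correctly: `awk -F, 'NR == 1 {print $1 "," $5 "," $7 "," $299; next} $5 == "LOCOMOTIVE_ALL623" {print $1 "," $5 "," $7 "," $299}'`. –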