2017-03-29 64 views
2

AWK syntax error when using an IF statement

I am very new to AWK, although I have used the command prompt/terminal before.

I have the following script, which I am using to create subsets of my data based on country and state codes. However, I get a syntax error.

BEGIN{ 
    FS = "\t" 
    OFS = "\t" 
    } 

# Subset data from the states you need for all years 
if ($5 == "IN-GA" || $5 == "IN-DD" || $5 == "IN-DN" || $5 == "IN-KA" || $5 == "IN-KL" || $5 == "IN-MH" || $5 == "IN-TN" || $5 == "IN-GJ"){ 
     if (substr($17, 1, 4) == "2000"){ 
      print $5, $12, $13, $14, $15, $16, $17, $22, $23, $24, $25, $26, $28 > "Y2000_India_sampling_output.txt" 
     } 
    } 

In Cygwin, I point gawk at the script with the command below, and the syntax error appears immediately:

$ gawk -f sampling_India.awk sampling_relFeb-2017.txt 
gawk: sampling_India.awk:20: gawk if ($5 == "IN-GA" || $5 == "IN-DD" || $5 == "IN-DN" || $5 == "IN-KA" || $5 == "IN-KL" || $5 == "IN-MH" || $5 == "IN-TN" || $5 == "IN-GJ"){ 
gawk: sampling_India.awk:20:  ^syntax error 

Any ideas?

Answers

2

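Your if statement is not enclosed in a {...} action block, and awk only allows statements such as if inside one.

Write it like this instead: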

BEGIN { 
    FS = OFS = "\t" 
} 
# Subset data from the states you need for all years 
$5 ~ /^IN-(GA|DD|DN|KA|KL|MH|TN|GJ)$/ && substr($17, 1, 4) == "2000" { 
    print $5, $12, $13, $14, $15, $16, $17, $22, $23, $24, $25, $26, $28 > "Y2000_India_sampling_output.txt" 
} 

Note how using a regular expression lets you combine the multiple == conditions into a single condition.
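As a rough illustration of the point above, the sketch below runs the same filter two ways: once as an if statement wrapped in an action block, and once as a pattern placed in front of the action. The three-column sample file (state code, date, count), the shortened list of states, and the file names sample.tsv, if_form.txt and pattern_form.txt are assumptions made for the example; the real data has many more fields and uses $5 and $17.

```shell
# Hypothetical three-column sample: state code, date, count.
# (Assumption for illustration; the original file uses $5 and $17.)
printf 'IN-GA\t2000-01-15\t3\nIN-UP\t2000-02-01\t5\nIN-KL\t2001-03-09\t7\nIN-MH\t2000-06-30\t2\n' > sample.tsv

# Statement form: the if is legal because it sits inside an action block { ... }.
awk 'BEGIN { FS = OFS = "\t" }
{
    if ($1 == "IN-GA" || $1 == "IN-KL" || $1 == "IN-MH") {
        if (substr($2, 1, 4) == "2000") {
            print $1, $2, $3
        }
    }
}' sample.tsv > if_form.txt

# Pattern form: the same test written as a pattern in front of the action.
awk 'BEGIN { FS = OFS = "\t" }
$1 ~ /^IN-(GA|KL|MH)$/ && substr($2, 1, 4) == "2000" { print $1, $2, $3 }
' sample.tsv > pattern_form.txt

# Both forms should select the same rows.
cmp if_form.txt pattern_form.txt && cat if_form.txt
```

With this sample, both forms keep only the IN-GA and IN-MH rows dated 2000. The pattern form is usually shorter, while the if form can be easier to extend when different branches need different actions.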

+1

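Thanks @anubhava. That works! I'm curious: if I don't want to restrict it to the year 2000 and I delete `&& substr($17, 1, 4) == "2000"`, should I get all the data for the relevant states, regardless of year? –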

+0

Yes, that is correct – anubhava
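A small sketch of that variant, assuming a hypothetical three-column sample (state code, date, count) and a shortened state list rather than the real file layout: once the year test is removed, the pattern matches on the state code alone, so rows from every year pass through. The names sample.tsv and all_years.txt are made up for the example.

```shell
# Hypothetical three-column sample: state code, date, count.
# (Assumption for illustration; the original file uses $5 and $17.)
printf 'IN-GA\t2000-01-15\t3\nIN-UP\t2000-02-01\t5\nIN-KL\t2001-03-09\t7\nIN-MH\t2000-06-30\t2\n' > sample.tsv

# No year test: any row whose state code matches is printed, whatever its date.
awk 'BEGIN { FS = OFS = "\t" }
$1 ~ /^IN-(GA|KL|MH)$/ { print $1, $2, $3 }
' sample.tsv > all_years.txt

cat all_years.txt
```

Here the IN-KL row from 2001 is kept alongside the two rows from 2000, and only the IN-UP row is dropped. If the script is changed this way, the output file name in the original (which mentions Y2000) would probably be worth renaming too.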