What is a short way in Linux to extract a pattern string that is followed later by another pattern string?

Suppose we have one line of text stored in a file:

// In the actual file this will be one line 
{unrelated_text1,ID:13, unrelated_text2,TIMESTAMP:1476280500,unrelated_text3}, 
{other_unrelated_text1,other_unrelated_text2,ID:25,TIMESTAMP:1476280600}, 
{ID:30,more_unrelated_text1,TIMESTAMP:1476280700}, 
{ID:40,final_unrelated_text}

For this particular input, I want to extract these 3 entries:

// The details, such as whether to put { character in front or not do not matter. 
// Any form of output which extracts only these 3 entries and groups them in a 
// visually nice way will do the job. 
{ID:13, TIMESTAMP:1476280500} 
{ID:25, TIMESTAMP:1476280600} 
{ID:30, TIMESTAMP:1476280700} 
// I do not want the last entry, because it does not contain timestamp field.

The closest command I have found so far is

grep -Po '{ID:[0-9]+(.+?)}' input_file

It gives the output

{unrelated_text1,ID:13,unrelated_text2,TIMESTAMP:1476280500,unrelated_text3} 
{other_unrelated_text1,other_unrelated_text2,ID:25,TIMESTAMP:1476280600} 
{ID:30,more_unrelated_text1,TIMESTAMP:1476280700} 
{ID:40,final_unrelated_text}

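The next improvement I am looking for is how to remove the unrelated_text from each entry and drop the last entry.

Question: what is the simplest way to do this in Linux?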

2016-11-19 mercury0114

With GNU awk for multi-character RS, RT, and word boundaries:

$ awk -v RS='\\<(ID|TIMESTAMP):[0-9]+' 'NR%2{id=RT;next} RT{printf "{%s, %s}\n", id, RT}' file 
{ID:13, TIMESTAMP:1476280500} 
{ID:25, TIMESTAMP:1476280600} 
{ID:30, TIMESTAMP:1476280700}

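The above will work whether the input is on one line or several, and regardless of what other text the file contains. It does rely on each ID appearing before its related TIMESTAMP, but that is not hard to change if necessary.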
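
The one-liner above pairs matches by record parity (NR%2), so an ID with no TIMESTAMP in the middle of the input, rather than at the end as in the sample, would shift the pairing. As a hedged alternative, here is a minimal sketch that treats each {...} group on its own. It assumes groups are never nested, fields inside a group are comma-separated, and `grep -o` is available (GNU and BSD grep both have it). The function name `extract_pairs` is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: print "{ID:n, TIMESTAMP:n}" for every {...} group
# in the file "$1" that contains both fields, in either order.
extract_pairs() {
  # Step 1: put each {...} group on its own line (assumes no nested braces).
  grep -o '{[^}]*}' "$1" |
  # Step 2: scan the comma-separated fields of each group.
  awk '{
    id = ts = ""
    gsub(/[{} ]/, "")                 # drop braces and spaces
    n = split($0, f, ",")
    for (i = 1; i <= n; i++) {
      if (f[i] ~ /^ID:[0-9]+$/)             id = f[i]
      else if (f[i] ~ /^TIMESTAMP:[0-9]+$/) ts = f[i]
    }
    # Print only groups that carry both an ID and a TIMESTAMP.
    if (id != "" && ts != "") printf "{%s, %s}\n", id, ts
  }'
}
```

Calling `extract_pairs input_file` on the sample input should print the three wanted entries and skip the ID:40 group, since that group has no TIMESTAMP field. The trade-off is that this version depends on the braces and commas of the sample format, whereas the gawk one-liner ignores everything except the two patterns.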

2016-11-20 16:12:45
