sed: delete lines between patterns

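This is about using sed to delete the lines between two patterns, without deleting the lines that contain the patterns themselves.

If the second pattern occurs two or more times, I want the lines to be deleted up to the last occurrence of the second pattern.

How can I do this?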

2015-06-27 pradeep v

Can you give an example? – Alp

Show some sample input and the output you want for that sample input. – Cyrus

If you run it through https://regex101.com/ – Nassim

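The main thing to realize is that sed operates on individual lines, not on the whole file at once, which means that without special treatment it cannot get multi-line matches from a regex. To operate on the whole file at once, you first have to read the whole file into memory. There are many ways to do this; one of them is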

sed '1h; 1!H; $!d; x; s/regex/replacement/' filename

This works as follows:

1h # When processing the first line, copy it to the hold buffer. 
1!H # When processing a line that's not the first, append it to the hold buffer. 
$!d # When processing a line that's not the last, stop working here. 
x # If we get here, we just appended the last line to the hold buffer, so 
    # swap hold buffer and pattern space. Now the whole file is in the pattern 
    # space, where we can apply regexes to it.
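
As a quick check of the idiom, here is a minimal sketch on a made-up three-line input (the input and the variable name joined are illustrative, not from the question). Replacing every newline with a comma only works if the whole input really is in the pattern space:

```shell
# Hypothetical three-line input. After `x`, the pattern space holds "a\nb\nc",
# so a substitution can see (and replace) the embedded newlines.
joined=$(printf 'a\nb\nc\n' | sed '1h; 1!H; $!d; x; s/\n/,/g')
echo "$joined"
```

This should print a,b,c on a single line.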

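I like to use this one because it doesn't involve jump labels. Some seds (notably BSD sed, which ships with the *BSDs and Mac OS X) can be a bit finicky where those are involved.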

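So, all that remains is to formulate a multi-line regex. Since you didn't specify the delimiter patterns, I'll assume that you want to delete the lines between the first line that contains START and the last line that contains END. This can be done with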

sed '1h; 1!H; $!d; x; s/\(START[^\n]*\).*\(\n[^\n]*END\)/\1\2/' filename

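The regex doesn't contain anything spectacular; mainly you have to be careful to use [^\n] in the right places so that the greedy matches don't run past the end of a line.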
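
To see the command in action, here is a small sketch on a made-up input with one START line and two END lines (the sample text and the variable names are assumptions for illustration). It also assumes GNU sed, which treats \n inside a bracket expression as a newline; other seds may read [^\n] differently.

```shell
# Hypothetical sample input with one START line and two END lines.
input='keep 1
START here
drop 1
END first
drop 2
END last
keep 2'

# Assumes GNU sed, where \n inside a bracket expression means a newline.
result=$(printf '%s\n' "$input" |
    sed '1h; 1!H; $!d; x; s/\(START[^\n]*\).*\(\n[^\n]*END\)/\1\2/')
printf '%s\n' "$result"
```

With GNU sed this should leave four lines: keep 1, START here, END last, keep 2. The greedy .* in the middle is what carries the match through to the last END rather than the first.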

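Note that this only works if the file is small enough to be read into memory completely. If that is not the case, my suggestion is to make two passes over the file with awk: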

awk 'NR == FNR && /START/ && !start { start = NR } NR == FNR && /END/ { end = NR } NR != FNR && (FNR <= start || FNR >= end)' filename filename

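This works as follows: since filename is passed to awk twice, awk processes the file twice. NR is the overall number of records (lines, by default) read so far, and FNR is the number of records read from the current file. During the first pass over the file NR and FNR are equal; afterwards they are not. So: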

# If this is the first pass over the file, the line matches the start pattern, 
# and the start marker hasn't been set yet, set the start marker 
NR == FNR && /START/ && !start { start = NR } 

# If this is the first pass over the file and the line matches the end line, 
# set the end marker to the current line (this means that the end marker will 
# always identify the last occurrence of the end pattern that was seen so far) 
NR == FNR && /END/    { end = NR } 

# In the second pass, print those lines whose number is less than or equal to 
# the start marker or greater than or equal to the end marker. 
NR != FNR && (FNR <= start || FNR >= end)
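
The two passes can be tried out on a small temporary file. This is only a sketch: the sample contents and the variable names sample, counters and result are made up for illustration. The first awk call just prints NR/FNR at the start of each pass, to show how the two counters diverge; the second is the command from above.

```shell
# Hypothetical sample file; the two-pass approach needs a real file, not a pipe,
# because awk has to open it twice.
sample=$(mktemp)
printf '%s\n' 'keep 1' 'START here' 'drop 1' 'END first' 'drop 2' 'END last' 'keep 2' > "$sample"

# NR keeps counting across both passes; FNR restarts at 1 for the second pass.
counters=$(awk 'FNR == 1 { printf "%s%d/%d", sep, NR, FNR; sep = " " }' "$sample" "$sample")
echo "$counters"

# The two-pass filter: pass 1 records the markers, pass 2 prints the kept lines.
result=$(awk 'NR == FNR && /START/ && !start { start = NR }
              NR == FNR && /END/ { end = NR }
              NR != FNR && (FNR <= start || FNR >= end)' "$sample" "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

For this seven-line file the counters line should read 1/1 8/1, and the filter should keep lines 1, 2, 6 and 7 (start is 2, end is 6).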

2015-06-27 22:28:54 Wintermute

There is no example input, so I'm guessing at an example file and the patterns /line3/ and /line6/.

line1 #keep - up to 1st pattern line3 - including 
line2 #keep 
line3 #keep 
line4 #delete up to last occurrence of line6 
line5 
line6a 
line7 
line6b 
line8 #delete 
line6c #keep - the last line6 
line9 #keep 
line10 #keep

An approach without any dark voodoo, though an inefficient one, could be:

(sed -n '1,/line3/p' file; tail -r file | sed -n '1,/line6/p' | tail -r) > file2

file2 will contain:

line1 
line2 
line3 
line6c 
line9 
line10

Explanation:

sed -n '1,/line3/p' file; # prints line 1 up to pattern (included) 

tail -r file | sed -n '1,/line6/p' | tail -r 
#reverse the file 
#print the lines up to pattern2 
#reverse the result
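
tail -r is a BSD/macOS option; on GNU systems tac (from coreutils) reverses lines in the same way. The following sketch wraps whichever one is available in a helper called reverse (a name made up here) and runs the same pipeline on a copy of the sample file with the #keep/#delete annotations left out:

```shell
# Line-reversing helper: tac on GNU systems, tail -r on BSD/macOS.
if command -v tac >/dev/null 2>&1; then
    reverse() { tac "$@"; }
else
    reverse() { tail -r "$@"; }
fi

# Hypothetical copy of the sample file, without the #keep/#delete annotations.
file=$(mktemp)
printf '%s\n' line1 line2 line3 line4 line5 line6a line7 line6b line8 line6c line9 line10 > "$file"

# Head part (up to the first line3), then tail part (from the last line6 on).
result=$(sed -n '1,/line3/p' "$file"; reverse "$file" | sed -n '1,/line6/p' | reverse)
printf '%s\n' "$result"
rm -f "$file"
```

The result should be the six lines line1, line2, line3, line6c, line9, line10.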

2015-06-27 23:38:53 kobame

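To follow up on Wintermute's answer: if you find a block that matches, you can delete it along the way, so you don't have to keep the whole file in memory: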

sed '/^START$/{:a;N;/.*\nEND$/d;ba}'
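
A small sketch of what this one-liner does, on a made-up input (the sample lines and the variable name result are illustrative). It assumes GNU sed, which accepts a ; after a label in a one-line script:

```shell
# Hypothetical input: one START ... END block between two lines to keep.
# Assumes GNU sed, which accepts `;` after a label inside a one-liner.
result=$(printf '%s\n' 'keep 1' START 'drop 1' 'drop 2' END 'keep 2' |
    sed '/^START$/{:a;N;/.*\nEND$/d;ba}')
printf '%s\n' "$result"
```

This should print only keep 1 and keep 2. Note that, as written, the command also removes the START and END lines themselves, and it stops at the first END that follows a START, so it only holds one block in memory at a time.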

(Sorry, I would have commented on Wintermute's answer, but apparently I still need 50 reputation points for that privilege.)

2015-06-28 01:34:03 Gumnos
