Remove lines from an XML file if they contain the same word (Perl)
I have a file "frequencies.xml" that contains lines of this form:
<?xml version="1.0"?>
<!DOCTYPE stationlist PUBLIC "-//xxxxx//DTD stationlist 1.0//EN" "http://xxxxxxxxx/DTD/xxxxxxxx.dtd">
<frequencies xmlns="http://xxxxxxxxxxxxxxxx/DTD/">
<list norm="PAL" frequencies="Custom" audio="bg">
..............................................................
<station name="A" active="1" channel="48.25MHz" norm="PAL"/>
<station name="B" active="1" channel="55.25MHz" norm="PAL"/>
<station name="C" active="1" channel="62.25MHz" norm="PAL"/>
<station name="D" active="1" channel="112.25MHz" norm="PAL"/>
..............................................................
<station name="E" active="1" channel="119.25MHz" norm="PAL"/>
<station name="F" active="0" channel="48.25MHz" norm="PAL"/>
..............................................................
<station name="G" active="1" channel="55.25MHz" norm="PAL"/>
<station name="H" active="0" channel="62.25MHz" norm="PAL"/>
..............................................................
</list>
</frequencies>
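I want to treat a line as a duplicate, and delete it, if an earlier line has the same frequency (the same channel value).
Desired output: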
<station name="A" active="1" channel="48.25MHz" norm="PAL"/>
<station name="B" active="1" channel="55.25MHz" norm="PAL"/>
<station name="C" active="1" channel="62.25MHz" norm="PAL"/>
<station name="D" active="1" channel="112.25MHz" norm="PAL"/>
<station name="E" active="1" channel="119.25MHz" norm="PAL"/>
I wrote this script to do it:
for i in `cat frequencies.xml | sed 's/.*channel="\([^"]*\)".*/\1/; /</ d' |grep MHz`; do
cat frequencies.xml | awk -v i="channel=\"$i" '
BEGIN { a=0 }
$0 ~ i { if (a == "1") { print i"\" - duplicate" > "/dev/stderr" ; next ;} ; a=1 }
{ print $0 }' > frequencies.xml.tmp && \
mv frequencies.xml.tmp frequencies.xml
done
How can I translate this into Perl?
Thanks.
Update: I want to keep the XML structure.
My code:
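use strict;
use warnings;

# Open for reading and writing so the file can be rewritten in place.
open(my $fh, '+<', 'frequencies.xml') or die "Opening: $!";
my $out  = '';
my %seen = ();
while (my $line = <$fh>) {
    if ($line =~ m/<station/) {
        my ($freq) = ($line =~ m/channel="([^"]+)"/);
        # Keep a station line only the first time its channel appears.
        $out .= $line unless $seen{$freq}++;
    } else {
        # Non-station lines (XML declaration, DOCTYPE, <list>, ...) are kept as-is.
        $out .= $line;
    }
}
# Rewind, write the filtered content, and cut off any leftover old bytes.
seek($fh, 0, 0) or die "Seeking: $!";
print {$fh} $out or die "Printing: $!";
truncate($fh, tell($fh)) or die "Truncating: $!";
close($fh) or die "Closing: $!";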
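The same first-occurrence rule can also be sketched as a function over a list of lines, which makes it easy to see that anything that is not a station line passes through untouched. This is only a minimal sketch: the name dedupe_stations and the sample data are hypothetical, and it assumes every station sits on a single line with a channel="..." attribute, as in the file above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the input lines,
# dropping every <station> line whose channel value was already seen.
sub dedupe_stations {
    my @lines = @_;
    my %seen;    # channel value => number of times seen so far
    my @kept;
    for my $line (@lines) {
        # Only <station ... channel="..."> lines are candidates for removal.
        if ($line =~ /<station\b[^>]*\bchannel="([^"]+)"/) {
            next if $seen{$1}++;    # channel seen before: skip this line
        }
        # Everything else (XML declaration, DOCTYPE, <list>, ...) is kept.
        push @kept, $line;
    }
    return @kept;
}

# Small in-memory sample, assumed for illustration only.
my @sample = (
    qq{<?xml version="1.0"?>\n},
    qq{<list norm="PAL" frequencies="Custom" audio="bg">\n},
    qq{<station name="A" active="1" channel="48.25MHz" norm="PAL"/>\n},
    qq{<station name="F" active="0" channel="48.25MHz" norm="PAL"/>\n},
    qq{</list>\n},
);
print dedupe_stations(@sample);
```

With this sample, station F is dropped because 48.25MHz already appeared on station A, while the XML declaration and the list tags are printed unchanged.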
It works fine. Thanks. But how do I keep the XML header? – user1148015 2012-01-13 16:52:25
'print $line unless $seen{$freq}++;' can also be used – Zaid 2012-01-13 16:54:14
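Following Zaid's suggestion, the lines can be printed as they are read instead of being collected in $out. The sketch below is one possible way to do that, not a definitive implementation: it writes to a temporary file and renames it over the original, much like the shell script's "tmp && mv" step. The name dedupe_file and the ".tmp" suffix are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: filters $path line by line into "$path.tmp",
# then renames the temporary file over the original.
sub dedupe_file {
    my ($path) = @_;
    my $tmp = "$path.tmp";
    open my $in,  '<', $path or die "Opening $path: $!";
    open my $out, '>', $tmp  or die "Opening $tmp: $!";
    my %seen;    # channel value => number of times seen so far
    while (my $line = <$in>) {
        # $freq stays undefined for lines that are not station lines.
        my ($freq) = $line =~ /<station\b[^>]*\bchannel="([^"]+)"/;
        # Always print non-station lines; print a station line only
        # the first time its channel appears.
        print {$out} $line unless defined $freq && $seen{$freq}++;
    }
    close $in;
    close $out or die "Closing $tmp: $!";
    rename $tmp, $path or die "Renaming $tmp: $!";
}

# Run only when a file name is given on the command line.
dedupe_file($ARGV[0]) if @ARGV;
```

Saved under a name of your choice, for example dedupe.pl, it would be run as "perl dedupe.pl frequencies.xml". Because every line without a station channel is printed as-is, the XML declaration and DOCTYPE survive.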