2012-10-02 120 views
1

Bash - extract URLs from XML file

I have this file (devel1.temp):

<?xml version="1.0" encoding="UTF-8"?> 
<krpano version="1.0.8.15" showerrors="false"> 

      <include url="include/sa/index.xml" /> <include url="content/sa.xml" /> 
      <include url="include/global/index.xml" /> 
      <include url="include/orientation/index.xml" /> 
      <include url="include/movecamera/index.xml" /> <include url="content/movecamera.xml" /> 
      <include url="include/fullscreen/index.xml" /> 
      <include url="include/instructions/index.xml" /> 
      <include url="include/coordfinder/index.xml" /> 
      <include url="include/editor_and_options/index.xml" /> 
</krpano> 

The goal is to extract the value of every url attribute and put them into a temporary file (devel.temp). The output should be:

include/sa/index.xml 
content/sa.xml 
include/global/index.xml 
include/orientation/index.xml 
include/movecamera/index.xml 
content/movecamera.xml 
include/fullscreen/index.xml 
include/instructions/index.xml 
include/coordfinder/index.xml 
include/editor_and_options/index.xml 

To do this, I have the following script:

# Make a temp file with all the url="..." attributes
grep -o 'url=['"'"'"][^"'"'"']*['"'"'"]' "$temp_folder/devel1.temp" > "$temp_folder/devel2.temp"
# Strip off everything to leave just the URLs
sed -e 's/^url=["'"'"']//' -e 's/["'"'"']$//' "$temp_folder/devel2.temp" > "$temp_folder/devel.temp"

Yesterday it worked perfectly. Today, the contents of devel2.temp and devel.temp look like this:

[01;31m[Kurl="include/sa/index.xml"[m[K 
[01;31m[Kurl="content/sa.xml"[m[K 
[01;31m[Kurl="include/global/index.xml"[m[K 
[01;31m[Kurl="include/orientation/index.xml"[m[K 
[01;31m[Kurl="include/movecamera/index.xml"[m[K 
[01;31m[Kurl="content/movecamera.xml"[m[K 
[01;31m[Kurl="include/fullscreen/index.xml"[m[K 
[01;31m[Kurl="include/instructions/index.xml"[m[K 
[01;31m[Kurl="include/coordfinder/index.xml"[m[K 
[01;31m[Kurl="include/editor_and_options/index.xml"[m[K 

Any ideas what is going on?

Answers

3

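It looks like grep is colouring its output with ANSI escape sequences even though the output is not a terminal. Change its --color setting from always to auto.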
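A minimal sketch of that fix, assuming GNU grep and a `--color=always` that is being picked up from an alias or the environment. The `temp_folder` value and the one-line sample input are stand-ins for the setup in the question, not the real files. Passing `--color=auto` explicitly on the command line should take precedence over an earlier `--color` option, so the redirected output contains no escape sequences.

```shell
#!/usr/bin/env bash
# Stand-in for the question's setup: a temp folder and a one-line sample input.
temp_folder=$(mktemp -d)
printf '%s\n' '<include url="include/sa/index.xml" /> <include url="content/sa.xml" />' \
  > "$temp_folder/devel1.temp"

# Same pattern as in the question, written inside double quotes for readability.
pattern="url=[\"'][^\"']*[\"']"

# --color=auto colours only when stdout is a terminal, so nothing
# extra ends up in the pipe or in the file.
grep --color=auto -o "$pattern" "$temp_folder/devel1.temp" \
  | sed -e "s/^url=[\"']//" -e "s/[\"']\$//" \
  > "$temp_folder/devel.temp"

cat "$temp_folder/devel.temp"
```

Piping grep straight into sed also removes the need for the intermediate devel2.temp file.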

+0

Spot on. Problem solved! Thanks!!! – RafaelGP

2

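In addition to choroba's comment about your ANSI sequences, I would avoid parsing XML with sed and the like wherever possible, and look at XML-aware scripting tools instead. I use the XMLStarlet toolkit. Using one means your script is aware of character encodings and entities, and is more robust when the XML changes.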
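As a rough illustration of the XMLStarlet approach, the sketch below assumes the tool is installed under the name `xmlstarlet` (some distributions install it as `xml`). The sample file is a shortened, made-up copy of the input from the question.

```shell
#!/usr/bin/env bash
# Shortened, hypothetical copy of the question's input file.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<krpano version="1.0.8.15" showerrors="false">
  <include url="include/sa/index.xml" /> <include url="content/sa.xml" />
  <include url="include/global/index.xml" />
</krpano>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # -t starts a template, -m matches each url attribute node,
  # -v . prints the matched value, -n appends a newline after each match.
  urls=$(xmlstarlet sel -t -m '/krpano/include/@url' -v . -n "$xml_file")
  printf '%s\n' "$urls"
else
  echo "xmlstarlet is not installed" >&2
fi
```

Because the attribute values come from a real XML parser, quoting style and entities such as `&amp;` are handled without any extra sed steps.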

2

Consider using an XML-aware tool such as xpath. I would suggest something like this:

xpath -e "/krpano/include/@url" -q yourFile.xml | cut -f 2 -d "=" | sed 's/"//g'

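This works provided the XML has a krpano root and only the include elements have url attributes. You can also use the following shorthand, but the version above will run faster.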

xpath -e "//@url" -q yourFile.xml | cut -f 2 -d "=" | sed 's/"//g'

1

A third XML-aware scripting tool is my Xidel:

xidel /tmp/your.xml -e //@url 

(Unlike most tools, it supports XPath 2.0, although that is overkill for this problem.)