2014-01-09 53 views

Filtering/excluding parts of an XPath extraction

This is what I am working with:

<div class="Pictures zoom"> 

<a title="Productname 1" class="zoomThumbActive" rel="{gallery: 'gallery1', smallimage: '/images/2.24198/little_one.jpeg', largeimage: '/images/76.24561/big-one-picture.jpeg'}" href="javascript:void(0)" style="border-width:inherit;"> 

<img title="Productname 1" src="/images/24.245/mini-doge-picture.jpeg" alt="" /></a> 

<a title="Productname 1" rel="{gallery: 'gallery1', smallimage: '/images/2.24203/small_one.jpeg', largeimage: '/images/9.5664/very-big-one-picture.jpeg'}" href="javascript:void(0)" style="border-width:inherit;"> 

<img title="Productname 1" src="/images/22.999/this-picture-is-very-small.jpeg" alt="" /></a> 

</div>

Using the following XPath:

/html//div[@class='Pictures zoom']/a/@rel 

the output is:

{gallery: 'gallery1', smallimage: '/images/2.24198/little_one.jpeg', largeimage: '/images/76.24561/big-one-picture.jpeg'} 
{gallery: 'gallery1', smallimage: '/images/2.24203/small_one.jpeg', largeimage: '/images/9.5664/very-big-one-picture.jpeg'} 

Is it possible to filter the extraction so that, instead of the above, I only get these:

/images/76.24561/big-one-picture.jpeg 
/images/9.5664/very-big-one-picture.jpeg 

I just want to cut away everything I do not need and keep only the text between largeimage: ' and '}, possibly using substring-after and substring-before.

Best regards,

Answer



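With XPath 1.0, this is only possible for a single result (so you cannot fetch all the URLs contained in one document with a single XPath call). This query returns the first URL: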

substring-before(substring-after((//@rel)[1], "largeimage: '"), "'") 

XPath 2.0 allows function calls in axis steps. This query returns all the URLs you are looking for, each as a separate token:

//@rel/substring-before(substring-after(., "largeimage: '"), "'") 
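
Where only XPath 1.0 is available, one common workaround is to select the nodes with XPath and do the substring step in the host language, once per node. The following is a minimal sketch of that idea, not part of the original answer. It assumes Python with the standard-library xml.etree.ElementTree (which supports only a small XPath subset) and a trimmed copy of the markup that parses as well-formed XML. The names DOC, large_image and large_image_urls are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question, wrapped in html/body.
# Assumption: the real page can be parsed as well-formed XML.
DOC = """<html><body><div class="Pictures zoom">
<a title="Productname 1" rel="{gallery: 'gallery1', smallimage: '/images/2.24198/little_one.jpeg', largeimage: '/images/76.24561/big-one-picture.jpeg'}" href="javascript:void(0)"><img src="/images/24.245/mini-doge-picture.jpeg" alt="" /></a>
<a title="Productname 1" rel="{gallery: 'gallery1', smallimage: '/images/2.24203/small_one.jpeg', largeimage: '/images/9.5664/very-big-one-picture.jpeg'}" href="javascript:void(0)"><img src="/images/22.999/this-picture-is-very-small.jpeg" alt="" /></a>
</div></body></html>"""


def large_image(rel):
    """Mimic substring-before(substring-after(rel, "largeimage: '"), "'")."""
    # Like substring-after: empty string if the marker is missing.
    _, _, after = rel.partition("largeimage: '")
    # Like substring-before: empty string if the closing quote is missing.
    before, quote, _ = after.partition("'")
    return before if quote else ""


def large_image_urls(xml_text):
    """Select the <a> elements with XPath, then extract one URL per node."""
    root = ET.fromstring(xml_text)
    anchors = root.findall(".//div[@class='Pictures zoom']/a")
    return [large_image(a.get("rel", "")) for a in anchors]


urls = large_image_urls(DOC)
print("\n".join(urls))
```

The XPath part only locates the a elements. The string handling happens outside XPath, which sidesteps the single-result limit of XPath 1.0 string functions.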

Sadly, I cannot use XPath 2.0, but this is the best fit for me. Thanks!