lxml XPath position() does not work

I am trying to scrape a page with XPath, but it does not work as expected.

The page looks like this:

<tag1> 
    <tag2> 
      .... 
       <div id=article> 
        <p> stuff1 </p> 
        <p> stuff2 </p> 
        <p> ...... </p> 
        <p> stuff30 </p>

I want to extract stuff1 through stuff30 as a single string. Here is my Python code snippet.

import lxml.html 
import urllib.request 

html = urllib.request.urlopen('http://www.something.com/news/blah/').read() 
root = lxml.html.fromstring(html) 

content = root.xpath('string(//div[@id="article"]/p[position()=>1 and position()<=last()]/.)')

This code returns nothing.

If I replace the position() predicate with an individual element index, it works.

content = root.xpath('string(//div[@id="article"]/p[25]/.)')

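That code correctly returns stuff25.

I do not want to run a loop for this. I believe there is a way to make my code work with position(), but I cannot tell what is wrong with it.

Source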

2016-08-31 K.K.

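Is the 'position()=>1' part correct? Shouldn't it be 'position()>=1'? – Wickramaranga

No, that does not seem to work either... As @Tomalak's comment below says, string() in XPath cannot be used on multiple nodes. –

@K.K. It should be '>='. '=>' will cause an error. – Tomalak

That is because you wrote position()=>1; it should be position()>=1.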

content = root.xpath('string(//div[@id="article"]/p[position()>=1 and position()<=last()]/.)')

This sets content to stuff1.
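The difference between the two operators can be checked on a small inline document. This is only a sketch: the `SAMPLE` markup is a hypothetical stand-in for the real page, trimmed to three paragraphs, and the names `broken`, `fixed`, and `first_only` are made up for illustration. It assumes lxml reports the malformed `=>` expression as an `XPathError`.

```python
import lxml.html
from lxml import etree

# Hypothetical stand-in for the real page, trimmed to three paragraphs.
SAMPLE = """<html><body>
<div id="article">
  <p> stuff1 </p>
  <p> stuff2 </p>
  <p> stuff3 </p>
</div>
</body></html>"""

root = lxml.html.fromstring(SAMPLE)

broken = 'string(//div[@id="article"]/p[position()=>1 and position()<=last()])'
fixed = 'string(//div[@id="article"]/p[position()>=1 and position()<=last()])'

# '=>' is not an XPath operator, so the expression should be rejected.
try:
    root.xpath(broken)
    broken_raised = False
except etree.XPathError as exc:
    broken_raised = True
    print("broken expression rejected:", exc)

# With '>=' the predicate matches every <p>, but string() only
# converts the first node of the node-set.
first_only = root.xpath(fixed)
print(repr(first_only))
```

Note that the predicate `position()>=1 and position()<=last()` is true for every `<p>`, so it selects the same nodes as a plain `p` step.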

Source

2016-08-31 06:54:58

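An additional note for the OP: 'string(//multiple/nodes)' gives you the string representation of the first node only (compare the [documentation](https://www.w3.org/TR/xpath/#function-string)). Do not try to convert to a string in XPath if you want to work with multiple nodes; do the conversion in the host language. – Tomalak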
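One way to follow that advice is to let XPath select the nodes and build the string in Python. The snippet below is a minimal sketch, not the commenter's own code: the `SAMPLE` markup is a hypothetical stand-in for the real page, and joining with a newline is an arbitrary choice.

```python
import lxml.html

# Hypothetical stand-in for the real page, trimmed to three paragraphs.
SAMPLE = """<html><body>
<div id="article">
  <p> stuff1 </p>
  <p> stuff2 </p>
  <p> stuff3 </p>
</div>
</body></html>"""

root = lxml.html.fromstring(SAMPLE)

# XPath only selects the nodes; no string() conversion here.
paragraphs = root.xpath('//div[@id="article"]/p')

# The host language turns each node into text and joins the pieces.
content = "\n".join(p.text_content().strip() for p in paragraphs)
print(content)
```

The generator expression still iterates over the paragraphs, but it keeps the conversion in one line and avoids relying on `string()` for a multi-node result.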
