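Finding the minimum value with XPath 1.0 does not work

I want to find the minimum value of an element in an XML document (it is actually an HTML table converted to XML). However, this does not work as expected.

The query is similar to the one used in "How can I use XPath to find the minimum value of an attribute in a set of elements?". It looks like this: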

/table[@id="search-result-0"]/tbody/tr[ 
    not(substring-before(td[1], " ") > substring-before(../tr/td[1], " ")) 
]

The sample XML it is run against:

<table class="tablesorter" id="search-result-0"> 
    <thead> 
     <tr> 
      <th class="header headerSortDown">Preis</th> 
      <th class="header headerSortDown">Zustand</th> 
     </tr> 
    </thead> 
    <tbody> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">20 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">25 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">35 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">14 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
    </tbody> 
</table>

The query returns the following result:

<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Ausgepack und doch nie gebraucht</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">14 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr>

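Why is more than one node returned? Since there is only one minimum value, only a single node should be returned. Does anyone see what is wrong with the query? It should return only the node containing 14 CHF. Tested with http://xpath.online-toolz.com/tools/xpath-editor.php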

2014-09-22 str

In the meantime, I decided to use XSLT instead. This is the stylesheet I came up with:

<?xml version="1.0" encoding="UTF-8"?> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml"> 

    <xsl:output method="text" omit-xml-declaration="yes" indent="no" encoding="UTF-8"/> 
    <xsl:strip-space elements="*"/> 

    <xsl:template match="//table[@id='search-result-0']/tbody"> 
     <ul> 
      <xsl:for-each select="tr/td[@width='45px']"> 
       <xsl:sort select="substring-before(., ' ')" data-type="number" order="ascending"/> 

       <xsl:if test="position() = 1"> 
        <xsl:value-of select="substring-before(., ' ')"/> 
       </xsl:if> 
      </xsl:for-each> 
     </ul> 
    </xsl:template> 

    <xsl:template match="text()"/> <!-- ignore the plain text --> 

</xsl:stylesheet>

2014-09-27 12:30:56 str

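The XPath query you are using here only finds the "minimum" when there are no duplicate values and the values were sorted before being written to the nodes. This is because it merely compares the current value, substring-before(td[1], " "), with the first value found, substring-before(../tr/td[1], " "). Breaking down the comparisons: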

[1] not(15 > 15) 
[2] not(20 > 15) 
[3] not(25 > 15) 
[4] not(35 > 15) 
[5] not(14 > 15) 
[6] not(15 > 15) 
[7] not(15 > 15)

Comparisons 1, 5, 6, and 7 evaluate to true (the left-hand side is not greater than the right-hand side).
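The breakdown above can be reproduced outside of XPath. The following is only a rough Python sketch that mimics the behaviour, not an XPath engine; the list `cells` and the helper `substring_before` are illustrative names that do not appear in the question.

```python
# Price cells from the sample table, in document order.
cells = ["15 CHF", "20 CHF", "25 CHF", "35 CHF", "14 CHF", "15 CHF", "15 CHF"]


def substring_before(text: str, sep: str) -> str:
    """Mimic XPath substring-before(): the text before the first sep, or "" if absent."""
    head, found, _ = text.partition(sep)
    return head if found else ""


# substring-before(../tr/td[1], " ") first converts the node-set to a string,
# which keeps only the first node in document order.
right_hand_side = float(substring_before(cells[0], " "))

# Rows (1-based) for which not(current > first) holds.
selected = [
    index
    for index, cell in enumerate(cells, start=1)
    if not float(substring_before(cell, " ")) > right_hand_side
]
print(selected)  # [1, 5, 6, 7]
```

The selected rows are exactly the four rows the online tool returned, which suggests the query is behaving as specified rather than hitting a bug in the tool.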

2014-09-22 21:15:19 TML

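You are right. Calling a function on a node-set only returns the result for the first node, not for the whole set. Any suggestions on how to solve this? – str 2014-09-23 10:00:59

@str I am tempted to say that this is not possible in XPath 1.0. Can you manipulate the elements beforehand? If 'substring-before' could be a separate step performed before the XPath expression is applied (so that only the number is left), then I have a solution. – 2014-09-23 10:18:57

I agree with Mathias. This *is* impossible in XPath 1.0 without changing the input XML. – Tomalak 2014-09-23 10:43:07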

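TML has already pointed out why your current path expression does not work, but did not suggest a viable alternative.

The reason is simple, as @Tomalak said:

I agree with Mathias. This is actually impossible in XPath 1.0 without changing the input XML.

I am adding this answer to elaborate on how you would have to preprocess your XML before looking for the lowest amount of Swiss francs. Bear in mind: this is only so complicated because you are asking for a solution in XPath 1.0. With XPath 2.0, your problem could be solved with a single path expression.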

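XML design

I think your question illustrates why XML design actually matters. Why? Because your problem boils down to this: your XML is designed in a way that makes its content hard to process. More precisely, in a td element like this: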

<td width="45px">15 CHF</td>

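there is both an amount (a number) and a currency, both inside the text node of the td element. If your XML input were designed in a smarter or more canonical way, it would look like: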

<td width="45px" currency="CHF">15</td>

See the difference? Now the different kinds of content are clearly separated from each other.
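If the preprocessing has to happen in code, one possible way to get from the first form to the second is sketched below in Python with the standard library's xml.etree.ElementTree. This is an assumption about how the step could be done, not something from the question; `SAMPLE` is a shortened copy of the table, and the sketch assumes every price cell has the form "amount currency".

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of the table from the question.
SAMPLE = """<table id="search-result-0"><tbody>
<tr><td width="45px">15 CHF</td><td width="175px">Ausgepack und doch nie gebraucht</td></tr>
<tr><td width="45px">14 CHF</td><td width="175px">Gebraucht, aber noch in Ordnung</td></tr>
</tbody></table>"""

table = ET.fromstring(SAMPLE)

for row in table.iter("tr"):
    price_cell = row.find("td")  # first td child of this row
    # Split "15 CHF" into the amount and the currency.
    amount, _, currency = price_cell.text.partition(" ")
    price_cell.text = amount
    price_cell.set("currency", currency)

# After the rewrite, the first cell of each row holds only the amount.
amounts = [row.find("td").text for row in table.iter("tr")]
print(amounts)  # ['15', '14']
```

Once the cells contain only numbers, the plain comparison between td[1] and ../tr/td[1] no longer needs substring-before().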

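Revised XPath

Assuming that in the redesigned XML the only content of a tr/td[1] element is the amount, the XPath expression by Pavel Minaev that you used can be made to work: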

/table[@id="search-result-0"]/tbody/tr[not(td[1] > ../tr/td[1])][1]

XML result (tested with the tool you use)

<tr> 
<td width="45px">14</td> 
<td width="175px">Ausgepack und doch nie gebraucht</td> 
</tr>

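Why does Pavel's expression stop working just because I add substring-before?

You have already found part of the answer. It has to do with how XPath 1.0 functions handle node-sets.

substring-before() is an XPath 1.0 function that takes two arguments, both of which are strings. Most importantly, if a node-set containing several nodes is passed as the first argument, it is converted to a string by taking the string value of the first node in document order; all other nodes are ignored.

Pavel's answer, adapted to this question: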

tr[not(td[1] > ../tr/td[1])][1]

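relies on the fact that the second part of the expression, ../tr/td[1], finds the first td child of every tr element in tbody. No function is involved, and a node-set as an operand of > causes no problem at all.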
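The reason a node-set works as an operand of > is that XPath 1.0 treats such a comparison as true if it holds for at least one node in the set. A rough Python sketch of that rule, assuming the amounts have already been separated from the currency (the names `amounts` and `greater_than_node_set` are illustrative only):

```python
# Amounts from the sample table, currency already stripped.
amounts = [15, 20, 25, 35, 14, 15, 15]


def greater_than_node_set(value, node_set):
    """Mimic `value > node-set`: true if the comparison holds for ANY node."""
    return any(value > other for other in node_set)


# tr[not(td[1] > ../tr/td[1])]: keep rows whose amount exceeds no other amount.
minimal_rows = [
    index
    for index, amount in enumerate(amounts, start=1)
    if not greater_than_node_set(amount, amounts)
]
print(minimal_rows)  # [5]

# The trailing [1] predicate keeps only the first match if the minimum is tied.
first_minimal_row = minimal_rows[:1]
```

Only the row with 14 survives, because every other amount is greater than at least one value in the set.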

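If we need substring-before() because the text content is actually both a number (which we want) and a currency (which we want to ignore), we have to wrap it around both parts of the expression: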

tr[not(substring-before(td[1],' ') > substring-before(../tr/td[1],' '))][1]

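There is no problem on the left-hand side of >, because the current tr has only one td[1]. But on the right-hand side there is a set of nodes, namely ../tr/td[1]. Sadly, substring-before() is only able to process the first of them.

See @TML's answer for the consequences of this.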

2014-09-23 16:29:42

Great expansion and detail, Mathias. – TML 2014-09-23 18:59:16

I see. Since I cannot change the source document, I came up with an XSLT solution (see my answer). – str 2014-09-27 12:32:03
