2010-04-19 116 views
7

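How do I prevent duplicates in XSL?

How can I prevent duplicate entries from getting into a list, and then ideally sort that list? What I'm doing is this: when information is missing at one level, I take it from the level below and build the missing list at the level above. Currently, I have XML similar to this: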

<c03 id="ref6488" level="file"> 
    <did> 
     <unittitle>Clinic Building</unittitle> 
     <unitdate era="ce" calendar="gregorian">1947</unitdate> 
    </did> 
    <c04 id="ref34582" level="file"> 
     <did> 
      <container label="Box" type="Box">156</container> 
      <container label="Folder" type="Folder">3</container> 
     </did> 
    </c04> 
    <c04 id="ref6540" level="file"> 
     <did> 
      <container label="Box" type="Box">156</container> 
      <unittitle>Contact prints</unittitle> 
     </did> 
    </c04> 
    <c04 id="ref6606" level="file"> 
     <did> 
      <container label="Box" type="Box">154</container> 
      <unittitle>Negatives</unittitle> 
     </did> 
    </c04> 
</c03> 

I then apply the following XSL:

<xsl:template match="c03/did"> 
    <xsl:choose> 
     <xsl:when test="not(container)"> 
      <did> 
       <!-- If no c03 container item is found, look in the c04 level for one --> 
       <xsl:if test="../c04/did/container"> 

        <!-- If a c04 container item is found, use the info to build a c03 version --> 
        <!-- Skip c03 container item, if still no c04 items found --> 
        <container label="Box" type="Box"> 

         <!-- Build container list --> 
         <!-- Test for more than one item, and if so, list them, --> 
         <!-- separated by commas and a space --> 
         <xsl:for-each select="../c04/did"> 
          <xsl:if test="position() &gt; 1">, </xsl:if> 
          <xsl:value-of select="container"/> 
         </xsl:for-each> 
        </container>
       </xsl:if>
      </did> 
     </xsl:when> 

     <!-- If there is a c03 container item(s), list it normally --> 
     <xsl:otherwise> 
      <xsl:copy-of select="."/> 
     </xsl:otherwise> 
    </xsl:choose> 
</xsl:template> 

But I'm getting a "container" result of

<container label="Box" type="Box">156, 156, 154</container> 

when what I want is

<container label="Box" type="Box">154, 156</container> 

Below is the full result that I'm trying to get:

<c03 id="ref6488" level="file"> 
    <did> 
     <container label="Box" type="Box">154, 156</container> 
     <unittitle>Clinic Building</unittitle> 
     <unitdate era="ce" calendar="gregorian">1947</unitdate> 
    </did> 
    <c04 id="ref34582" level="file"> 
     <did> 
      <container label="Box" type="Box">156</container> 
      <container label="Folder" type="Folder">3</container> 
     </did> 
    </c04> 
    <c04 id="ref6540" level="file"> 
     <did> 
      <container label="Box" type="Box">156</container> 
      <unittitle>Contact prints</unittitle> 
     </did> 
    </c04> 
    <c04 id="ref6606" level="file"> 
     <did> 
      <container label="Box" type="Box">154</container> 
      <unittitle>Negatives</unittitle> 
     </did> 
    </c04> 
</c03> 

Thanks in advance for any help!

+0

Good question (+1). See my answer for an XSLT 1.0 solution that is shorter than the currently selected XSLT 2.0 solution. :) – 2010-04-20 13:26:48

Answers

1

Try the following code:

<?xml version="1.0" encoding="UTF-8"?> 
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> 
    <xsl:output indent="yes"></xsl:output> 

<xsl:template match="node() | @*"> 
    <xsl:copy> 
    <xsl:apply-templates select="node() | @*"/> 
    </xsl:copy> 
</xsl:template> 

    <xsl:template match="c03/did"> 
    <xsl:choose> 
     <xsl:when test="not(container)"> 
     <did> 
      <!-- If no c03 container item is found, look in the c04 level for one --> 
      <xsl:if test="../c04/did/container"> 
      <xsl:variable name="foo" select="../c04/did/container[@type='Box']/text()"/> 
      <!-- If a c04 container item is found, use the info to build a c03 version --> 
      <!-- Skip c03 container item, if still no c04 items found --> 
      <container label="Box" type="Box"> 

       <!-- Build container list --> 
       <!-- Test for more than one item, and if so, list them, --> 
       <!-- separated by commas and a space --> 
       <xsl:for-each select="distinct-values($foo)"> 
       <xsl:sort /> 
       <xsl:if test="position() &gt; 1">, </xsl:if> 
       <xsl:value-of select="." /> 
       </xsl:for-each> 
      </container> 
      <xsl:apply-templates select="*" /> 
      </xsl:if> 
     </did> 
     </xsl:when> 

     <!-- If there is a c03 container item(s), list it normally --> 
     <xsl:otherwise> 
     <xsl:copy-of select="."/> 
     </xsl:otherwise> 
    </xsl:choose> 
    </xsl:template> 

</xsl:stylesheet> 

It looks very much like the output you want:

<?xml version="1.0" encoding="UTF-8"?> 
<c03 id="ref6488" level="file"> 
    <did> 
     <container label="Box" type="Box">154, 156</container> 
     <unittitle>Clinic Building</unittitle> 
     <unitdate era="ce" calendar="gregorian">1947</unitdate> 
    </did> 
    <c04 id="ref34582" level="file"> 
     <did> 
     <container label="Box" type="Box">156</container> 
     <container label="Folder" type="Folder">3</container> 
     </did> 
    </c04> 
    <c04 id="ref6540" level="file"> 
     <did> 
     <container label="Box" type="Box">156</container> 
     <unittitle>Contact prints</unittitle> 
     </did> 
    </c04> 
    <c04 id="ref6606" level="file"> 
     <did> 
     <container label="Box" type="Box">154</container> 
     <unittitle>Negatives</unittitle> 
     </did> 
    </c04> 
</c03> 

The trick is to use <xsl:sort> and distinct-values() together. See Michael Kay's (IMHO) great book "XSLT 2.0 and XPath 2.0".
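For readers who want to check the logic outside an XSLT processor, here is a minimal Python sketch of the same two steps: collect the distinct Box values, then sort them. It uses only the standard library. The function name `box_list` and the trimmed sample document are assumptions made for illustration and are not part of the answer above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's input (illustrative, not the full document).
SAMPLE = """
<c03 id="ref6488" level="file">
  <did><unittitle>Clinic Building</unittitle></did>
  <c04><did><container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container></did></c04>
  <c04><did><container label="Box" type="Box">156</container></did></c04>
  <c04><did><container label="Box" type="Box">154</container></did></c04>
</c03>
"""

def box_list(c03):
    # Same selection as ../c04/did/container[@type='Box'] in the stylesheet.
    values = [c.text for c in c03.findall("c04/did/container[@type='Box']")]
    # set() plays the role of distinct-values(); sorted() the role of <xsl:sort/>.
    return ", ".join(sorted(set(values)))

result = box_list(ET.fromstring(SAMPLE))
print(result)
```

As with a bare <xsl:sort/>, the sort here compares strings, which is fine for values of equal length such as 154 and 156.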

+0

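I'm using XSLT 2, so I used this solution, and it worked great. The only problem was that I had to comment out \ because, for some reason, it was repeating the "unittitle" nodes. Thanks very much! – LOlliffe 2010-04-19 23:34:55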

+0

I share your high opinion of Michael Kay's book. Unfortunately, very few people/organizations have moved to XSLT 2.0. – 2010-05-01 16:57:04

0

The following XSLT 1.0 transformation does what you are looking for:

<xsl:stylesheet 
    version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
> 
    <xsl:output encoding="utf-8" /> 

    <!-- key to index containers by these three distinct qualities: 
     1: their ancestor <c??> node (represented as its unique ID) 
     2: their @type attribute value 
     3: their node value (i.e. their text) --> 
    <xsl:key 
    name = "kContainer" 
    match = "container" 
    use = "concat(generate-id(../../..), '|', @type, '|', .)" 
    /> 

    <!-- identity template to copy everything as is by default --> 
    <xsl:template match="node()|@*"> 
    <xsl:copy> 
     <xsl:apply-templates select="node()|@*" /> 
    </xsl:copy> 
    </xsl:template> 

    <!-- special template for <did>s without a <container> child --> 
    <xsl:template match="did[not(container)]"> 
    <xsl:copy> 
     <xsl:copy-of select="@*" /> 
     <container label="Box" type="Box"> 
     <!-- from subordinate <container>s of type Box, use the ones 
      that are *the first* to have that certain combination 
      of the three distinct qualities mentioned above --> 
     <xsl:apply-templates mode="list-values" select=" 
      ../*/did/container[@type='Box'][ 
      generate-id() 
      = 
      generate-id(
       key(
       'kContainer', 
       concat(generate-id(../../..), '|', @type, '|', .) 
      )[1] 
      ) 
      ] 
     "> 
      <!-- sort them by their node value --> 
      <xsl:sort select="." data-type="number" /> 
     </xsl:apply-templates> 
     </container> 
     <xsl:apply-templates select="node()" /> 
    </xsl:copy> 
    </xsl:template> 

    <!-- generic template to make list of values from any node-set --> 
    <xsl:template match="*" mode="list-values"> 
    <xsl:value-of select="." /> 
    <xsl:if test="position() &lt; last()"> 
     <xsl:text>, </xsl:text> 
    </xsl:if> 
    </xsl:template> 

</xsl:stylesheet> 

It returns:

<c03 id="ref6488" level="file"> 
    <did> 
    <container label="Box" type="Box">154, 156</container> 
    <unittitle>Clinic Building</unittitle> 
    <unitdate era="ce" calendar="gregorian">1947</unitdate> 
    </did> 
    <c04 id="ref34582" level="file"> 
    <did> 
     <container label="Box" type="Box">156</container> 
     <container label="Folder" type="Folder">3</container> 
    </did> 
    </c04> 
    <c04 id="ref6540" level="file"> 
    <did> 
     <container label="Box" type="Box">156</container> 
     <unittitle>Contact prints</unittitle> 
    </did> 
    </c04> 
    <c04 id="ref6606" level="file"> 
    <did> 
     <container label="Box" type="Box">154</container> 
     <unittitle>Negatives</unittitle> 
    </did> 
    </c04> 
</c03> 

The generate-id() = generate-id(key(...)[1]) part is what's known as Muenchian grouping. Unless you can use XSLT 2.0, this is the way to go.
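The idea behind that test (keep a node only if it is the first one indexed under its key) can be sketched in Python. This is an illustration of the principle, not a translation of the stylesheet: the helper `first_per_key` and the tuples standing in for <container> nodes are hypothetical.

```python
def first_per_key(items, key):
    """Keep only the first item seen for each key value, in input order."""
    first_seen = {}
    for item in items:
        # setdefault stores the item only if this key has not been seen before.
        first_seen.setdefault(key(item), item)
    return list(first_seen.values())

# (ancestor id, @type, text) triples standing in for the sample's containers.
containers = [
    ("ref6488", "Box", "156"),
    ("ref6488", "Folder", "3"),
    ("ref6488", "Box", "156"),
    ("ref6488", "Box", "154"),
]

boxes = [c for c in containers if c[1] == "Box"]
# The key mirrors concat(generate-id(../../..), '|', @type, '|', .).
unique = first_per_key(boxes, key=lambda c: "|".join(c))
# Numeric sort, like <xsl:sort select="." data-type="number"/>.
listing = ", ".join(sorted((c[2] for c in unique), key=float))
print(listing)
```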

2

For this problem, there is no need to use an XSLT 2.0 solution.

Here is an XSLT 1.0 solution that is more compact than the currently selected XSLT 2.0 solution (35 lines vs. 43 lines):

<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 
    <xsl:strip-space elements="*"/> 

    <xsl:key name="kBoxContainerByVal" 
    match="container[@type='Box']" use="."/> 

<xsl:template match="node()|@*"> 
    <xsl:copy> 
     <xsl:apply-templates select="node()|@*"/> 
    </xsl:copy> 
</xsl:template> 

<xsl:template match="c03/did[not(container)]"> 
    <xsl:copy> 

    <xsl:variable name="vContDistinctValues" select= 
    "/*/*/*/container[@type='Box'] 
      [generate-id() 
      = 
      generate-id(key('kBoxContainerByVal', .)[1]) 
      ] 
      "/> 

    <container label="Box" type="Box"> 
     <xsl:for-each select="$vContDistinctValues"> 
     <xsl:sort data-type="number"/> 

     <xsl:value-of select= 
     "concat(., substring(', ', 1 + 2*(position() = last())))"/> 
     </xsl:for-each> 
    </container> 
    <xsl:apply-templates/> 
    </xsl:copy> 
</xsl:template> 
</xsl:stylesheet> 

When this transformation is applied to the originally provided XML document, the correct, wanted result is produced:

<c03 id="ref6488" level="file"> 
    <did> 
     <container label="Box" type="Box">154, 156</container> 
     <unittitle>Clinic Building</unittitle> 
     <unitdate era="ce" calendar="gregorian">1947</unitdate> 
    </did> 
    <c04 id="ref34582" level="file"> 
     <did> 
     <container label="Box" type="Box">156</container> 
     <container label="Folder" type="Folder">3</container> 
     </did> 
    </c04> 
    <c04 id="ref6540" level="file"> 
     <did> 
     <container label="Box" type="Box">156</container> 
     <unittitle>Contact prints</unittitle> 
     </did> 
    </c04> 
    <c04 id="ref6606" level="file"> 
     <did> 
     <container label="Box" type="Box">154</container> 
     <unittitle>Negatives</unittitle> 
     </did> 
    </c04> 
</c03> 

Update:

I hadn't noticed the requirement that the container numbers must appear sorted. The solution now reflects this.

+0

Your solution does not sort the list as requested in the question. Easily fixed by adding '' inside the 'xsl:for-each' loop. – markusk 2010-04-20 15:37:24

+0

@markusk: Thanks, I'm usually very sleepy in the morning. In this case, '' also needs 'data-type="number"'. – 2010-04-20 16:05:47

1

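A slightly shortened XSLT 2.0 version, combining approaches from the other answers. Note that the sorting is alphabetical, so if the labels "54" and "156" are found, the output will be "156, 54". If numeric sorting is needed, use <xsl:sort select="number(.)"/> instead of <xsl:sort/>.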

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 
    <xsl:strip-space elements="*"/> 

    <xsl:template match="node()|@*"> 
     <xsl:copy> 
      <xsl:apply-templates select="node()|@*"/> 
     </xsl:copy> 
    </xsl:template> 

    <xsl:template match="c03/did[not(container)]"> 
     <xsl:variable name="containers" 
         select="../c04/did/container[@label='Box'][text()]"/> 
     <xsl:copy> 
      <xsl:copy-of select="@*"/> 
      <xsl:if test="$containers"> 
       <container label="Box" type="Box"> 
        <xsl:for-each select="distinct-values($containers)"> 
         <xsl:sort/> 
         <xsl:if test="position() != 1">, </xsl:if> 
         <xsl:value-of select="."/> 
        </xsl:for-each> 
       </container> 
      </xsl:if> 
      <xsl:apply-templates select="node()"/> 
     </xsl:copy> 
    </xsl:template> 
</xsl:stylesheet> 
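The note about alphabetical versus numeric ordering can be checked with a short Python sketch. It assumes Python's default string comparison behaves like a bare <xsl:sort/> for these plain digit strings, and the variable names are illustrative only.

```python
labels = ["54", "156"]

# String comparison, like a bare <xsl:sort/>: "1" sorts before "5".
alphabetical = ", ".join(sorted(labels))
# Numeric comparison, like <xsl:sort select="number(.)"/>.
numeric = ", ".join(sorted(labels, key=float))

print(alphabetical)
print(numeric)
```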
+0

+1. This is still the shortest XSLT 2.0 solution! :) – 2010-04-20 19:42:48

1

A true XSLT 2.0 solution, also quite short:

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    exclude-result-prefixes="xs" 
> 
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 

    <xsl:template match="node()|@*"> 
    <xsl:copy> 
     <xsl:apply-templates select="node()|@*"/> 
    </xsl:copy> 
    </xsl:template> 

    <xsl:template match="c03/did[not(container)]"> 
    <xsl:copy> 
     <xsl:copy-of select="@*"/> 

     <xsl:variable name="vContDistinctValues" as="xs:integer*"> 
     <xsl:perform-sort select= 
      "distinct-values(/*/*/*/container[@type='Box']/text()/xs:integer(.))"> 
      <xsl:sort/> 
     </xsl:perform-sort> 
     </xsl:variable> 

     <xsl:if test="exists($vContDistinctValues)"> 
     <container label="Box" type="Box"> 
      <xsl:value-of select="$vContDistinctValues" separator=", "/> 
     </container> 
     </xsl:if> 
     <xsl:apply-templates/> 
    </xsl:copy> 
    </xsl:template> 
</xsl:stylesheet> 

Do note:

  1. The use of types avoids the need to specify data-type in <xsl:sort/>.

  2. The use of the separator attribute of <xsl:value-of/>.

+1

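+1 Nicely done. However, do we know that the 'c03' element is the root? The poster only said the input is "similar", so I feel slightly more comfortable with a relative XPath (i.e. '../c04/container', or '../*/container') than with an absolute one ('/*/*/*/container'). That way, the stylesheet works even if the 'c03' element appears further down in the document structure. – markusk 2010-04-20 19:07:39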

+0

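@markusk Good comment again, and thanks for the upvote! Yes, we have witnessed how OPs keep changing the definition of their problem. Sometimes I'd like to be a fortune teller, but that carries its own risks. Also, the XSLT code becomes less and less directly related to the actual XML, and so harder to understand. That's why in cases like this I usually prefer to stay as close as possible to the posted XML. I have written a lot of the most generic XSLT code, such as the XPath Visualizer/FXSL, but the goal here is to be as specific/useful as possible. :) – 2010-04-20 19:41:03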