How can I exclude content from a rich snippet element?

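I want to apply rich snippet data to my web pages, following the http://schema.org/Article standard. One of its properties is articleBody, which I expect should contain the entire text that makes up the article.

Unfortunately, the HTML representation of the article occasionally contains buttons, ads, and other hints whose text should not end up in articleBody.

For example: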

<div itemscope itemtype="http://schema.org/Article"> 
    <div itemtype="articleBody"> 
    <p>1st Paragraph</p> 
    <p>2nd paragraph</p> 
    <a>A few useful links for my users</a> 
    <p>3rd paragraph</p> 
    <div>A few text ads</div> 
    <p>4th paragraph</p> 
    </div> 
</div>

Is there a way to exclude the ad/link text from the article itself?

Source

2013-10-17 Camelhive

Note that you have an error in your code: 'itemtype="articleBody"' should be 'itemprop="articleBody"'. – unor

No, Microdata does not provide a way to exclude content.

The value of articleBody will be the textContent of the element, so the text of every descendant (links and ads included) becomes part of it.
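To make the textContent behaviour concrete, here is a minimal Python sketch that approximates what a Microdata consumer might see. It is only an illustration under stated assumptions: the class `ItempropText` and the helper `itemprop_text` are hypothetical names, the parser is the standard library's `html.parser`, and it only handles elements whose value comes from their text (not `meta`, `a`, `img`, and similar elements, whose values come from attributes).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ItempropText(HTMLParser):
    """Collect the concatenated text of every element with a given itemprop."""

    def __init__(self, prop):
        super().__init__()
        self.prop = prop
        self.depth = 0    # nesting depth inside the current matching element
        self.values = []  # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("itemprop") == self.prop:
            self.depth = 1
            self.values.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Every text node below the matching element is included, ads and all.
        if self.depth:
            self.values[-1] += data


def itemprop_text(html, prop="articleBody"):
    """Hypothetical helper: return the text value of each `prop` element."""
    parser = ItempropText(prop)
    parser.feed(html)
    parser.close()
    return parser.values


sample = """
<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <a>A few useful links for my users</a>
    <p>2nd paragraph</p>
  </div>
</div>
"""

values = itemprop_text(sample)
body = " ".join(values[0].split())  # collapse whitespace for readability
print(body)
```

The link text shows up in the extracted value alongside the paragraphs, which is exactly the problem described in the question.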

An ugly "hack" would be to specify several articleBody properties for this item:

<div itemscope itemtype="http://schema.org/Article">
    <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    </div>
    <a>A few useful links for my users</a>
    <p itemprop="articleBody">3rd paragraph</p>
    <div>A few text ads</div>
    <p itemprop="articleBody">4th paragraph</p>
</div>

But note that Microdata does not define how multiple values for the same property should be interpreted, so that is up to each consumer.

Another ugly way: duplicate the information and include it in a meta element:

<div itemscope itemtype="http://schema.org/Article"> 
    <div> 
    <p>1st Paragraph</p> 
    <p>2nd paragraph</p> 
    <a>A few useful links for my users</a> 
    <p>3rd paragraph</p> 
    <div>A few text ads</div> 
    <p>4th paragraph</p> 
    </div> 
    <meta itemprop="articleBody" content="1st Paragraph. 2nd paragraph. 3rd paragraph. 4th paragraph." />
</div>
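If the page is generated server-side, the duplicated text does not have to be maintained by hand. The following is a minimal sketch, assuming the article paragraphs are already available as a list of plain strings; the function name `article_body_meta` is hypothetical and not part of any library.

```python
from html import escape


def article_body_meta(paragraphs):
    """Build a meta element that carries the article text as articleBody."""
    # Join the non-empty paragraphs with single spaces.
    text = " ".join(p.strip() for p in paragraphs if p.strip())
    # Escape &, <, > and quotes so the text is safe inside an attribute value.
    return '<meta itemprop="articleBody" content="%s" />' % escape(text, quote=True)


print(article_body_meta(["1st Paragraph.", "2nd paragraph.",
                         "3rd paragraph.", "4th paragraph."]))
```

Generating the element from the same source as the visible paragraphs keeps the two copies from drifting apart, which is the main risk of this approach.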

Source

2014-02-01 22:09:16 unor
