xpath
  • hpple
  • 2013-03-27 15 views 0 likes 

    Problem with an Hpple XPath query: I have the following HTML fragment, which is Tesseract hOCR output

     <span class='ocr_line' id='line_11' title="bbox 0 482 377 539"> 
    <span class='ocrx_word' id='word_34' title="bbox 0 484 51 539"><em>WORD1</em></span> 
    <span class='ocrx_word' id='word_35' title="bbox 56 482 119 528">WORD2</span> 
    <span class='ocrx_word' id='word_35' title="bbox 56 482 119 528"><em></em></span> 
    <span class='ocrx_word' id='word_36' title="bbox 137 483 171 528"><strong><em>WORD3</em></strong></span> 
    <span class='ocrx_word' id='word_37' title="bbox 176 482 244 528"><h1>WORD4</h1></span> 
    </span> 
    

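    I'd like an XPath query string that pulls out the bbox strings and the text content of the word 1-4 nodes. I'm having trouble because the words are nested inside <em> and <strong> elements, and some of them may be empty! Thanks.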

    Answers


    Maybe: //@title | //text()
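
    One caveat with that union: it returns the titles and the text nodes as a single flat list, including the title of the outer ocr_line span and the whitespace between elements, so pairing each bbox with its word takes extra work. An alternative is to select each word span with //span[@class='ocrx_word'], then read its title attribute and concatenate whatever text sits below it, however deeply it is nested. Below is a minimal sketch of that idea in Python using only the standard library, not Hpple; the function name extract_words and the choice to skip empty words are my own assumptions. The same per-word XPath should carry over to Hpple, but I have not tested that.

```python
import xml.etree.ElementTree as ET

# The hOCR fragment from the question (it happens to be well-formed XML).
HOCR = """<span class='ocr_line' id='line_11' title="bbox 0 482 377 539">
<span class='ocrx_word' id='word_34' title="bbox 0 484 51 539"><em>WORD1</em></span>
<span class='ocrx_word' id='word_35' title="bbox 56 482 119 528">WORD2</span>
<span class='ocrx_word' id='word_35' title="bbox 56 482 119 528"><em></em></span>
<span class='ocrx_word' id='word_36' title="bbox 137 483 171 528"><strong><em>WORD3</em></strong></span>
<span class='ocrx_word' id='word_37' title="bbox 176 482 244 528"><h1>WORD4</h1></span>
</span>"""


def extract_words(fragment):
    """Return (bbox, text) pairs for every non-empty ocrx_word span.

    Hypothetical helper for illustration; not part of Hpple or Tesseract.
    """
    root = ET.fromstring(fragment)
    words = []
    # Select only the word spans, so the ocr_line title is not picked up.
    for span in root.findall(".//span[@class='ocrx_word']"):
        # itertext() walks all descendants, so <em>, <strong>, <h1>
        # nesting does not matter.
        text = "".join(span.itertext()).strip()
        if not text:
            # Assumption: empty words such as <em></em> should be dropped.
            continue
        words.append((span.get("title", ""), text))
    return words


words = extract_words(HOCR)
for bbox, text in words:
    print(bbox, text)
```

    If the empty spans should be kept (for example to preserve their bounding boxes), drop the "if not text" check and handle the empty string on the caller's side.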
