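From a string containing a TEI file, I generate an index for navigating to its blocks. I retrieve all the div tags, and for each one I also want to get the content of the <head> tag inside the current div, if it exists.
Parsing TEI with DOMXPath: getting the text of a child tag inside the evaluate loop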
Sample file:
<div type="lib" n="1"><head>LIBER I</head>...
<div type="pr">...</div>
<div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div>
<div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div>
</div>
I tried this, but it does not work:
//source file:
$fulltext = '<div type="lib" n="1"><head>LIBER I</head>...<div type="pr">...</div><div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div><div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div></div>';
$dom = new DOMDocument();
@$dom->loadHTML($fulltext);
$domx = new DOMXPath($dom);
$entries = $domx->evaluate("//div");
echo '<ul>';
foreach ($entries as $entry){
$title = '';
$type = $entry->getAttribute('type');
$n = $entry->getAttribute('n');
$head = $domx->evaluate("string(./head[1])",$entry);
if($head != '') $title = $head; else $title = $n;
echo '<li><a href="#'.$type.'-'.$n.'">'.$title.'</a></li>';
}
echo '</ul>';
This is the line that does not work:
$head = $domx->evaluate("string(./head[1])",$entry);
It returns the error:
DOMDocument::loadHTML(): htmlParseStartTag: misplaced <head> tag in Entity, line: 3
The purpose of this line is to get, inside the loop, the text of the child <head> tag (in this example, "LIBER I").
I added that to hide the warnings from being displayed. Why do you think I cannot get the content inside the head tag? – steplab
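If you remove the @ from the load call, you will get an error about the '<head>' tag.
It returns this: DOMDocument::loadHTML(): htmlParseStartTag: misplaced <head> tag in Entity, line: 3. Does anyone know why? – steplab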
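A likely explanation is that loadHTML() applies HTML rules, under which <head> is the reserved document-head element. The parser reports a <head> nested inside a <div> as misplaced and most likely discards it, so the XPath lookup finds no head child. A minimal sketch of one way around this is to parse the fragment as XML with loadXML() instead. Two assumptions are made here: the fragment is well-formed (the <p> in the first chapter is closed below, unlike in the sample), and it declares no default namespace. The $items and $html names and the htmlspecialchars() call are illustrative additions.

```php
<?php
// Assumption: well-formed XML. The </p> in the first chapter was added for
// this sketch; the sample in the question leaves that <p> open.
$fulltext = '<div type="lib" n="1"><head>LIBER I</head>...'
    . '<div type="pr">...</div>'
    . '<div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</p></div>'
    . '<div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div>'
    . '</div>';

// Collect parser messages instead of silencing them with @.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// loadXML() applies no HTML element rules, so <head> is an ordinary element.
if (!$dom->loadXML($fulltext)) {
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    exit(1);
}

$domx = new DOMXPath($dom);

$items = [];
foreach ($domx->query('//div') as $entry) {
    $type = $entry->getAttribute('type');
    $n = $entry->getAttribute('n');
    // Evaluated relative to $entry; yields '' when the div has no <head> child.
    $head = $domx->evaluate('string(head[1])', $entry);
    $title = ($head !== '') ? $head : $n;
    $items[] = '<li><a href="#' . $type . '-' . $n . '">'
        . htmlspecialchars($title) . '</a></li>';
}

$html = '<ul>' . implode('', $items) . '</ul>';
echo $html;
```

If the real file declares the TEI namespace, the element names in the XPath expressions would need a prefix registered with DOMXPath::registerNamespace(); that case is not covered by this sketch.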