I need to scrape this HTML page, getting an HTML element by its XPath:
http://www1.usl3.toscana.it/default.asp?page=ps&ospedale=3
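...using PHP and XPath, to get a value such as the one under the string "CODICE BIANCO".
(Note: if you browse the page you may see different values there. That doesn't matter; they change over time.)
I'm using this PHP code sample to print the value: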
<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);
include "./tmp/vendor/autoload.php";
$url = 'http://www1.usl3.toscana.it/default.asp?page=ps&ospedale=3';
//$xpath_for_parsing = '/html/body/div/div[2]/table[2]/tbody/tr[1]/td/table/tbody/tr[3]/td[1]/table/tbody/tr[11]/td[3]/b';
$xpath_for_parsing = '//*[@id="contentint"]/table[2]/tbody/tr[1]/td/table/tbody/tr[3]/td[1]/table/tbody/tr[11]/td[3]/b';
//#Set CURL parameters: pay attention to the PROXY config !!!!
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_PROXY, '');
$data = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
$colorWaitingNumber = $xpath->query($xpath_for_parsing);
$theValue = 'N.D.';
foreach ($colorWaitingNumber as $node) {
    $theValue = $node->nodeValue;
}
print $theValue;
?>
I extracted the XPath using both the Chrome and Firefox web consoles.
Any suggestions or examples?
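One thing that may be worth checking: XPaths copied from the Chrome or Firefox console usually contain `tbody` steps, because browsers add that element to the DOM even when the HTML source omits it, while PHP's `DOMDocument` parses the source as written. Below is a minimal sketch of the difference. The markup and the shortened XPath are made-up stand-ins for the real page, not its actual structure.

```php
<?php
// Hypothetical markup standing in for the real page: the source has no <tbody>.
$html = '<html><body><div id="contentint">'
      . '<table><tr><td>CODICE BIANCO</td><td><b>7</b></td></tr></table>'
      . '</div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Path as a browser console would report it (with tbody), shortened for illustration.
$browserStyle = '//*[@id="contentint"]/table/tbody/tr[1]/td[2]/b';
// Same path with the browser-inserted tbody steps removed.
$sourceStyle = str_replace('/tbody', '', $browserStyle);

$withTbody = $xpath->query($browserStyle);
$withoutTbody = $xpath->query($sourceStyle);

echo 'matches with tbody: ', $withTbody->length, "\n";
echo 'matches without tbody: ', $withoutTbody->length, "\n";
if ($withoutTbody->length > 0) {
    echo 'value: ', trim($withoutTbody->item(0)->nodeValue), "\n";
}
```

If the real page behaves the same way, removing `/tbody` from `$xpath_for_parsing` might be enough, provided the page source really contains no `tbody` tags.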
It's working now... are there any alternative tools for extracting the XPath of elements in my HTML page? – Cesare
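One possible alternative to copying paths from a browser is to ask PHP itself: locate the node with a content-based query, then call `DOMNode::getNodePath()`, which reports the path as `DOMDocument` parsed it. This is only a sketch, and it assumes a hypothetical layout in which the label and the value sit in the same table row; the real page may be structured differently.

```php
<?php
// Hypothetical markup: label in the first cell, value in a <b> in the second cell.
$html = '<html><body><div id="contentint"><table>'
      . '<tr><td>CODICE ROSSO</td><td><b>1</b></td></tr>'
      . '<tr><td>CODICE BIANCO</td><td><b>7</b></td></tr>'
      . '</table></div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Select the <b> inside the row whose cell text is exactly the label.
$nodes = $xpath->query('//tr[td[normalize-space(.)="CODICE BIANCO"]]/td/b');

$paths = [];
foreach ($nodes as $node) {
    $path = $node->getNodePath();   // absolute path as seen by the PHP parser
    $paths[] = $path;
    echo $path, ' => ', trim($node->nodeValue), "\n";
}
```

A path printed this way can be pasted back into `$xpath->query()` unchanged, since it comes from the same parser that will evaluate it.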