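Scraping a website whose links use javascript:subOpen hrefs
I want to follow every link on this page to the detail page behind it.
I can already get all the information on this overview page: PAGE
However, I also want the information from the detail pages, and the href of each link looks like this, for example: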
href="javascript:subOpen('9ca8ed0fae15d43dc1257e7300345b99')"
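As a rough illustration of what needs to come out of such a link, here is a minimal sketch that extracts the ID from a single href value. The helper name `subOpenIdFromHref` is hypothetical, and the sketch assumes the ID is always a quoted string literal inside the `subOpen(...)` call.

```javascript
// Hypothetical helper (not from the site or any library): pull the
// document ID out of a single "javascript:subOpen('...')" href value.
function subOpenIdFromHref(href) {
  // Match subOpen('<id>') with either quote style and capture only the ID,
  // so the surrounding quotes are not part of the result.
  var m = /subOpen\(\s*['"]([^'"]+)['"]\s*\)/.exec(href);
  return m ? m[1] : null; // null when the href is not a subOpen link
}

var id = subOpenIdFromHref("javascript:subOpen('9ca8ed0fae15d43dc1257e7300345b99')");
// id now holds the bare ID string, without quotes
```

Capturing only the text between the quotes matters here: a looser pattern such as `subOpen\((.*?)\)` would keep the quote characters in the captured group.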
Here is an example spreadsheet that uses the ImportHTML function to get the general overview.
Any suggestions on how to get the detail pages?
UPDATE
I tried the following approach:
function doGet(e){
var base = 'http://www.ediktsdatei.justiz.gv.at/edikte/ex/exedi3.nsf/'
var feed = UrlFetchApp.fetch(base + 'suche?OpenForm&subf=e&query=%28%5BVKat%5D%3DEH%20%7C%20%5BVKat%5D%3DZH%20%7C%20%5BVKat%5D%3DMH%20%7C%20%5BVKat%5D%3DMW%20%7C%20%5BVKat%5D%3DMSH%20%7C%20%5BVKat%5D%3DGGH%20%7C%20%5BVKat%5D%3DRH%20%7C%20%5BVKat%5D%3DHAN%20%7C%20%5BVKat%5D%3DWE%20%7C%20%5BVKat%5D%3DEW%20%7C%20%5BVKat%5D%3DMAI%20%7C%20%5BVKat%5D%3DDTW%20%7C%20%5BVKat%5D%3DDGW%20%7C%20%5BVKat%5D%3DGA%20%7C%20%5BVKat%5D%3DGW%20%7C%20%5BVKat%5D%3DUL%20%7C%20%5BVKat%5D%3DBBL%20%7C%20%5BVKat%5D%3DLF%20%7C%20%5BVKat%5D%3DGL%20%7C%20%5BVKat%5D%3DSE%20%7C%20%5BVKat%5D%3DSO%29%20AND%20%5BBL%5D%3D0').getContentText();
var d = document.createElement('div'); //assuming you can do this
d.innerHTML = feed;//make the text a dom structure
var arr = d.getElementsByTagName('a') //iterate over the page links
var response = "";
for(var i = 0;i<arr.length;i++){
var atr = arr[i].getAttribute('onclick');
if(atr) atr = atr.match(/subOpen\((.*?)\)/) //if onclick calls subOpen
if(atr && atr.length > 1){ //get the id
var detail = UrlFetchApp.fetch(base + '0/'+atr[1]).getContentText();
response += detail; // process the relevant part of the content and append it to the response text
}
}
return ContentService.createTextOutput(response);
}
However, when I run the function, I get an error:
ReferenceError: "document" is not defined. (line 6, file "")
What is the document object?
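For context: `document` is the DOM global that browsers provide, and Apps Script runs on the server, where no such object exists. One possible way around it, sketched here under assumptions, is to skip the DOM entirely and scan the fetched HTML string with a regular expression. The helper names `extractSubOpenIds` and `detailUrls` are hypothetical, the `base + '0/' + id` URL pattern is simply taken from the code above and is not verified against the site, and the sketch assumes the quotes around the ID are not HTML-entity-encoded in the page source.

```javascript
// Assumption: `html` is the string returned by
// UrlFetchApp.fetch(url).getContentText() in Apps Script.
// Both function names below are hypothetical helpers.

// Collect every distinct ID passed to subOpen('...') anywhere in the page,
// whether it appears in an href or an onclick attribute.
function extractSubOpenIds(html) {
  var re = /subOpen\(\s*['"]([^'"]+)['"]\s*\)/g;
  var ids = [];
  var m;
  while ((m = re.exec(html)) !== null) {
    if (ids.indexOf(m[1]) === -1) { // skip IDs that were already seen
      ids.push(m[1]);
    }
  }
  return ids;
}

// Build one detail-page URL per ID, using the pattern from the code above.
function detailUrls(base, html) {
  return extractSubOpenIds(html).map(function (id) {
    return base + '0/' + id;
  });
}
```

Inside `doGet`, each URL returned by `detailUrls(base, feed)` could then be passed to `UrlFetchApp.fetch` in place of the loop over `arr`. Because the pattern runs over the raw text, it does not matter which attribute holds the `subOpen` call.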
I have updated the Google Spreadsheet with the web app.
I appreciate your replies!
Duplicate of [this question](http://stackoverflow.com/questions/31452272/google-app-script-urlfetch-not-giving-html-output-but-javascript-ouput). – Mogsdad