2012-09-16

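I am trying to build a Google Keyword Tool scraper with AutoIt, and I need the page source including the content produced by JavaScript. I am using the code below: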

#include <IE.au3> 
$oIE = _IECreate ("https://adwords.google.com/o/KeywordTool") 
sleep(20000) 
$source = _IEDocReadHTML ($oIE) 

MsgBox(0,'',$source) 

(The Sleep is there to give me time to type a query and click Search in the IE window. Later on I will automate this step.)

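The HTML source it outputs does not contain the results table, even though I can see the table in Firebug. Below is a single row that I extracted with Firebug.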

<tr __gwt_row="19" __gwt_subrow="0" class="sCT"><td class="sBS sDT sES" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1059"><div id="gwt-debug-column-SELECTION-row-19-0"><input type="checkbox" class="sML"></div></div></td><td class="sBS sDT" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1060"><div id="gwt-debug-column-KEYWORD-row-19-1"><span style="white-space:nowrap"><span></span><span><a class="sOL" gwtuirendered="gwt-uid-1089"><b>windows</b> live</a></span><span></span></span></div></div></td><td class="sBS sDT" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1062"><div id="gwt-debug-column-COMPETITION-row-19-2"><div title="0,04">Bassa</div></div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1063"><div id="gwt-debug-column-GLOBAL_MONTHLY_SEARCHES-row-19-3">20.400.000</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1064"><div id="gwt-debug-column-AVERAGE_TARGETED_MONTHLY_SEARCHES-row-19-4">20.400.000</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1065"><div id="gwt-debug-column-SUGGESTED_BID-row-19-5">€&nbsp;0,40</div></div></td><td class="sBS sDT aw-ti-advertiser-specific-cell" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1066"><div id="gwt-debug-column-AD_SHARE-row-19-6">-</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1067"><div id="gwt-debug-column-AVERAGE_MONTHLY_SEARCHES_WITH_AFS-row-19-7">-</div></div></td><td class="sBS sDT aw-ti-advertiser-specific-cell" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1068"><div id="gwt-debug-column-SEARCH_SHARE-row-19-8">-</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1069"><div 
id="gwt-debug-column-TARGETED_MONTHLY_SEARCHES-row-19-9"><div style="width: 108px; white-space: nowrap" dir="ltr"><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 
8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div></div></div></div></td><td class="sBS sDT sOS" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1070"><div id="gwt-debug-column-EXTRACTED_FROM_WEBPAGE-row-19-10">-</div></div></td></tr> 

Is there a way to get the complete source with AutoIt, including the content generated by JavaScript?

Answer


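I would use an HTTP request, because it is the most direct way to do this. By the way, it initially returned status 404. Edit: the URL was missing its last letter, which caused the 404 status.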

MsgBox(0, Default, get_url("https://adwords.google.com/o/KeywordTool"))

; Fetch $url with a synchronous GET request.
; Returns the response body on HTTP 200, otherwise a message with the status code.
Func get_url($url)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $url, False) ; False = synchronous request
    $oHTTP.Send()
    If $oHTTP.Status = 200 Then ; "=" compares numerically, "==" is a string comparison
        Return $oHTTP.ResponseText
    Else
        Return "ooops... status: " & $oHTTP.Status
    EndIf
EndFunc
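
If the results table is only added to the page by script after loading, another option is to ask the live DOM in the IE window for a specific cell rather than dumping the whole source after a fixed Sleep. The following is a minimal sketch, not a tested solution. `KeywordCellId` and `WaitForKeyword` are hypothetical helper names, the id pattern is inferred from the Firebug row in the question, and it assumes the table sits in the top-level document rather than inside a frame.

```autoit
#include <IE.au3>

; Build the element id of the keyword cell for a given row.
; Assumption: the pattern matches the Firebug row (row 19 -> "...-row-19-1").
Func KeywordCellId($iRow)
    Return "gwt-debug-column-KEYWORD-row-" & $iRow & "-1"
EndFunc

; Poll the live DOM until the keyword cell of $iRow exists or $iTimeoutMs elapses.
; Returns the cell's visible text, or "" on timeout.
; Assumption: the cell is in the top-level document, not inside a frame.
Func WaitForKeyword($oIE, $iRow, $iTimeoutMs = 60000)
    Local $hTimer = TimerInit()
    While TimerDiff($hTimer) < $iTimeoutMs
        Local $oCell = _IEGetObjById($oIE, KeywordCellId($iRow))
        If IsObj($oCell) Then Return _IEPropertyGet($oCell, "innertext")
        Sleep(500)
    WEnd
    Return ""
EndFunc
```

It could be called as `WaitForKeyword($oIE, 0)` on the object returned by `_IECreate`, once the query has been submitted. Row 0 is an assumption here, since the question only shows row 19.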