2017-09-14 42 views

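HTML parsing in VBA

I am trying to update fund sizes from Morningstar's website. My earlier attempt at IE automation did not work, so I switched to an XML httpRequest (which is also much faster). Now I am having trouble printing the right lines from the document I get back from the site. I want the code to give me the third "td" tag inside the "tr" tag whose first "td" tag is called "Fund Size (Mil)". So the code loops through the headings of all "td" tags and acts when it finds "{line heading}" = "Fund Size (Mil)". Here is the problem: I don't know how to refer to those headings. I tried assigning each "td" tag to a variable (there are 3 "td" tags inside the "tr" tag, so I had variables row1, row2, row3, one per "td" tag), but when I do that I get run-time error 438, "Object doesn't support this property or method", on this line: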

Debug.Print TDElements.getElementsByTagName("tr")(0).Cells(0).innerHTML 

Also, when I Debug.Print TDElement.innerHTML, I don't see the "td" tags I need. When I use .innerText instead, I see all the "td" tags.

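So the questions are: 1) How can I refer to the headings directly? (See the commented-out lines inside the For Each loop below.) 2) Why don't I see all the td tags with .innerHTML when I do see them with .innerText?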

URL: http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW (Excel 2010, IE version 11)

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

Dim myurl As String 
Dim TDElement As Object 
Dim TDElements As IHTMLElementCollection 
Dim IE As MSXML2.XMLHTTP60 

Dim HTMLDoc As MSHTML.HTMLDocument 
Dim HTMLBody As MSHTML.HTMLBody 

Set IE = New MSXML2.XMLHTTP60 
Set HTMLDoc = New MSHTML.HTMLDocument 
Set HTMLBody = HTMLDoc.body 


myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
IE.Open "GET", myurl, False 
IE.send 

HTMLBody.innerHTML = IE.responseText 

Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     Debug.Print TDElement.innerText '.innerText/.innerHTML. Can't see the fund size with .innerHTML?? 
'  If "{line heading}" = "Fund Size (Mil)" Then 'How can I refer to headings in the html document? 
'   Worksheets("Sheet3").Range("B3") = Split("{line text}", ";")(1) 'reference to line text? 
    Next 


End Sub 

h2so4's answer fixed the problem above. What follows is an extension to the original question.

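TDElement now gets assigned several lines of text (I changed h2so4's value of 10 to 3, so the code shows the next 3 lines once it finds the fund string). How can I parse this further? The current line Worksheets("helper").Cells(x, 6).Value = Split(TDElement.innerText, " ")(1) returns the value I need (769.28), but it would help me in future if I really understood what is going on here.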

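So, just as an exercise, how would I get all 3 lines printed in their own cells? The output would be: Fund Size (Mil), 31/08/2017, 769.28 in cells (x,6), (x,7), (x,8). When I apply the "Split" or "Left" functions to TDElement, they only act on the last line, not on the lines above it. Yet when I Debug.Print TDElement.innerText/.innerHTML, I see the other lines too. So how can I "access" the lines above the last one?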

Output of Debug.Print TDElement.innerText:

Fund Size (Mil)

31/08/2017

EUR 769.28

Answers

The code below will get you the "Fund Size" lines.

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

    Dim myurl As String 
    Dim TDElement As Object 
    Dim TDElements As IHTMLElementCollection 
    Dim IE As MSXML2.XMLHTTP60 
    Dim Flag As Boolean 
    Dim HTMLDoc As MSHTML.HTMLDocument 
    Dim HTMLBody As MSHTML.HTMLBody 
    Dim k As Long 
    Set IE = New MSXML2.XMLHTTP60 
    Set HTMLDoc = New MSHTML.HTMLDocument 
    Set HTMLBody = HTMLDoc.body 


    myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
    IE.Open "GET", myurl, False 
    IE.send 

    HTMLDoc.body.innerHTML = IE.responseText 
    Flag = False 
    k = 0 
    Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     If InStr(TDElement.innerText, "Fund Size") <> 0 Or Flag Then 
      'if fundsize string is found, display the next 10 lines 
      Debug.Print ":" & TDElement.innerText '.innerText/.innerHTML. Can't see the fund size with .innerHTML?? 
      '  If "{line heading}" = "Fund Size (Mil)" Then 'How can I refer to headings in the html document? 
      '   Worksheets("Sheet3").Range("B3") = Split("{line text}", ";")(1) 'reference to line text? 
      k = k + 1 
      If k < 10 Then Flag = True Else Flag = False 
     End If 
    Next 


End Sub 
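
On question 2 (.innerHTML versus .innerText), one possible explanation, offered as an assumption rather than a confirmed diagnosis for this page: .innerHTML returns only the markup inside an element, so a cell's own td tag never appears in it, while .outerHTML includes the tag and .innerText returns just the visible text. The sketch below prints all three for one cell. The one-row table is a hypothetical stand-in for the real page, and CompareHtmlProperties is an illustrative name.

    Sub CompareHtmlProperties()
        ' Late binding ("htmlfile"), so no library references are required.
        Dim doc As Object
        Dim cell As Object

        Set doc = CreateObject("htmlfile")

        ' Hypothetical markup standing in for the downloaded page (an assumption).
        doc.body.innerHTML = "<table><tr><td>Fund Size (Mil)</td>" & _
                             "<td>31/08/2017</td><td>EUR 769.28</td></tr></table>"

        ' Take the first cell and print the three views of it.
        Set cell = doc.getElementsByTagName("td").Item(0)
        Debug.Print "innerText: " & cell.innerText   ' visible text only
        Debug.Print "innerHTML: " & cell.innerHTML   ' markup inside the td, without the td tag
        Debug.Print "outerHTML: " & cell.outerHTML   ' the td tag plus its contents
    End Sub

If the td tags show up in .outerHTML but not in .innerHTML, nothing is missing from the document; the loop is simply printing each cell's contents.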

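Thanks a lot! This works. I modified your code by declaring a new integer variable n and setting it to 3 instead of 10 (just for the lines I need). However, I wonder why I can only parse the last line the code returns. Please see the **extension** to the original question above. – Samppa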


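To answer your extension: how you split the result depends on how the page is laid out. Here is one possible solution that puts your data in 3 separate cells.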

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

    Dim myurl As String 
    Dim TDElement As Object 
    Dim TDElements As IHTMLElementCollection 
    Dim IE As MSXML2.XMLHTTP60 
    Dim Flag As Boolean 
    Dim HTMLDoc As MSHTML.HTMLDocument 
    Dim HTMLBody As MSHTML.HTMLBody 
    Dim k As Long, text 
    Set IE = New MSXML2.XMLHTTP60 
    Set HTMLDoc = New MSHTML.HTMLDocument 
    Set HTMLBody = HTMLDoc.body 


    myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
    IE.Open "GET", myurl, False 
    IE.send 

    HTMLDoc.body.innerHTML = IE.responseText 
    Flag = False 
    k = 0 
    Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     If InStr(TDElement.innerText, "Fund Size") <> 0 Or Flag Then 
      'if the "Fund Size" string is found, process this cell and the following ones (3 cells in total)
      text = Split(TDElement.innerText, vbLf) 
      If text(0) <> "" Then 
       Worksheets("Sheet3").Cells(3, k + 2).Resize(, UBound(text) + 1) = text '.innerText/.innerHTML. Can't see the fund size with .innerHTML?? 
       '  If "{line heading}" = "Fund Size (Mil)" Then 'How can I refer to headings in the html document? 
       '   Worksheets("Sheet3").Range("B3") = Split("{line text}", ";")(1) 'reference to line text? 
      End If 
      k = k + 1 
      If k < 3 Then Flag = True Else Flag = False 
     End If 
    Next 
End Sub
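
On question 1 (referring to the heading directly), an alternative to the flag-based loop is to walk the "tr" rows and compare the text of each row's first cell. This is a minimal sketch, assuming the three "td" tags really do sit in one "tr" as the question describes. The inline table is hypothetical test markup, FindRowByHeading and WriteFundSizeRow are illustrative names, and the target cells (row 3, columns 6 to 8 of "Sheet3") are assumptions borrowed from the posts above.

    Function FindRowByHeading(doc As Object, heading As String) As Object
        ' Returns the first <tr> whose first cell text equals heading, or Nothing.
        Dim row As Object
        For Each row In doc.getElementsByTagName("tr")
            If row.Cells.Length > 0 Then
                If Trim$(row.Cells.Item(0).innerText) = heading Then
                    Set FindRowByHeading = row
                    Exit Function
                End If
            End If
        Next row
    End Function

    Sub WriteFundSizeRow()
        Dim doc As Object
        Dim row As Object
        Dim i As Long

        Set doc = CreateObject("htmlfile")

        ' Hypothetical markup; in the real macro this would be the responseText.
        doc.body.innerHTML = "<table><tr><td>Fund Size (Mil)</td>" & _
                             "<td>31/08/2017</td><td>EUR 769.28</td></tr></table>"

        Set row = FindRowByHeading(doc, "Fund Size (Mil)")
        If row Is Nothing Then
            Debug.Print "Heading not found"
            Exit Sub
        End If

        ' One worksheet cell per <td>: heading, date, value.
        For i = 0 To row.Cells.Length - 1
            Worksheets("Sheet3").Cells(3, 6 + i).Value = row.Cells.Item(i).innerText
        Next i
    End Sub

This may also explain why "Split" and "Left" seemed to see only the last line: if each line is its own "td", the loop visits them on separate iterations, and writing to the same cell each time leaves only the last value behind. Error 438 in the question probably has a related cause, since getElementsByTagName was called on the collection TDElements rather than on the document or a single element.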