2014-02-25 22 views
2

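How to get the ALT value from an IMG using VBA

I want to get the price table from this page. For this, I have the following code:

Everything works fine except the last column, which holds the IMG alt text: it does not show up. The rest of the code runs well; only the values for that last column are not being captured.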

Sub TableExample() 

Dim IE As Object 
Dim doc As Object 
Dim strURL As String 

    If Range("B2").Value <> "NA" Then 
     strURL = "http://www.idealo.co.uk/compare/351072/canon-500d-77mm-close-up-lens.html" 
     Set IE = CreateObject("InternetExplorer.Application") 
     With IE 
      '.Visible = True 
      .navigate strURL 
      Do Until .readyState = 4: DoEvents: Loop 
      Do While .Busy: DoEvents: Loop 
      Set doc = IE.document 
      GetAllTables doc 
      .Quit 
     End With 
    End If 

End Sub 

Sub GetAllTables(doc As Object) 

    Dim ws As Worksheet 
    Dim rng As Range 
    Dim tbl As Object 
    Dim rw As Object 
    Dim cl As Object 
    Dim tabno As Long 
    Dim nextrow As Long 
    Dim i As Long 

    Set ws = Sheets("Sheet1") 

    For Each tbl In doc.getElementsByTagName("TABLE") 
     tabno = tabno + 1 
     nextrow = nextrow + 1 
     Set rng = ws.Range("B" & nextrow) 
     rng.Offset(, -1) = "Table " & tabno 
     On Error GoTo Err1: 
     If tabno = 10 Then 
      For Each rw In tbl.Rows 
       colno = 6 
       For Each cl In rw.Cells 
        If colno = 6 And nextrow > 10 Then 
         Set classColl = doc.getElementsByClassName("cellborder") 
         Set imgTgt = classColl(nextrow - 11).getElementsByTagName("img") 
         rng.Value = imgTgt(0).getAttribute("alt") 
        Else 
         rng.Value = cl.innerText 
        End If 
        Set rng = rng.Offset(, 1) 
        i = i + 1 
        colno = colno + 1 
       Next cl 
       nextrow = nextrow + 1 
       Set rng = rng.Offset(1, -i) 
       '  Call trim1 
       i = 0 
      Next rw 
      Exit Sub 
     End If 
    Next tbl 

Err1: 
'Call comp 
' ws.Cells.ClearFormats 
End Sub 
+1

It's 'getElementsByClassName' (note the plural "s"), so your 'classColl' should be a collection or something similar. You might want to try 'classColl(0).getElementsByTagName("img")'. – Passerby

+0

I did that, but the result is the same... no change. – user3305327

+0

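When you post here, please make your code readable. If it isn't properly indented and written, it's hard to figure out what your code is doing. :) I edited it for you this time. :) – Manhattan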

Answers

0

Try this (very dirty) change to your GetAllTables subroutine:

Sub GetAllTables(doc As Object) 

    Dim ws As Worksheet 
    Dim rng As Range 
    Dim tbl As Object 
    Dim rw As Object 
    Dim cl As Object 
    Dim tabno As Long 
    Dim nextrow As Long 
    Dim i As Long 

    Set ws = Sheets("Sheet1") 

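    'Improvised way of getting images: collect the alt texts up front. 
    Dim imagesColl As New Collection 
    Dim imgColl As Object, imgElem As Object 
    Dim colno As Long, imgIter As Long 
    Set imgColl = doc.getElementsByClassName("noborder") 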
    For Each imgElem In imgColl 
     If imgElem.getAttribute("height") = 30 And imgElem.getAttribute("width") = 80 Then 
      imagesColl.Add imgElem.getAttribute("alt") 
     End If 
    Next imgElem 

    For Each tbl In doc.getElementsByTagName("table") 
     tabno = tabno + 1 
     If tabno = 10 Then 
      nextrow = 1 
      imgIter = 1 
      For Each rw In tbl.Rows 
       colno = 1 
       For Each cl In rw.Cells 
        Set rng = ws.Cells(nextrow, colno) 
        If colno = 5 Then 
         rng.Value = imagesColl.Item(imgIter) 
         imgIter = imgIter + 1 
        Else 
         rng.Value = cl.innerText 
        End If 
        colno = colno + 1 
       Next cl 
       nextrow = nextrow + 1 
      Next rw 
      Exit Sub 
     End If 
    Next tbl 

End Sub 

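The thing is, you don't really have to do this table-style. If you know which elements you want to target, it is much better to build a collection for the data outside the DOM (that is, using a normal VBA Collection).

Anyway, the above is tried and tested. Let us know if this helps.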
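
As a rough illustration of the "collection outside the DOM" idea, the alt texts could be gathered by a small helper before the table loop runs. This is only a sketch: CollectAltTexts is a hypothetical name, the tagName check is an assumption that stands in for the height/width filter used above, and getElementsByClassName needs the page to render in an IE9-or-later document mode.

```vba
' Hypothetical helper (not part of the original answer): returns a VBA
' Collection holding the alt text of every <img> that carries className.
Function CollectAltTexts(doc As Object, className As String) As Collection
    Dim result As New Collection
    Dim imgElem As Object

    For Each imgElem In doc.getElementsByClassName(className)
        ' Assumption: other tags may share the class, so keep only <img>.
        If UCase$(imgElem.tagName) = "IMG" Then
            ' & "" turns a missing (Null) alt attribute into an empty string.
            result.Add imgElem.getAttribute("alt") & ""
        End If
    Next imgElem

    Set CollectAltTexts = result
End Function
```

Inside the table loop, the image column could then read from the collection with a bounds check, for example `If imgIter <= alts.Count Then rng.Value = alts.Item(imgIter)` after `Set alts = CollectAltTexts(doc, "noborder")`, so that a row without an image does not raise an error.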

+0

You really are a genius, mate... thanks for enlightening me! – user3305327

0

All you need to do is specify which element of classColl to look in for the image.

Try this:

Set classColl = doc.getElementsByClassName("cellborder") 
Set imgTgt = classColl(0).getElementsByTagName("img") 
Rng.Value = imgTgt(0).getAttribute("alt") 
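
A slightly more defensive variant of the same lookup might check both collections before indexing into them, since indexing past the end raises a runtime error that the question's 'On Error GoTo Err1' silently swallows. This is a sketch under assumptions: AltOfNthClassElement is a hypothetical helper name, and idx is zero-based as in the snippet above.

```vba
' Hypothetical helper: returns the alt text of the first <img> inside the
' idx-th (zero-based) element with the given class, or "" if there is none.
Function AltOfNthClassElement(doc As Object, className As String, idx As Long) As String
    Dim classColl As Object
    Dim imgTgt As Object

    Set classColl = doc.getElementsByClassName(className)
    ' Leave the result empty when idx is outside the collection.
    If idx < 0 Or idx >= classColl.Length Then Exit Function

    Set imgTgt = classColl(idx).getElementsByTagName("img")
    ' Leave the result empty when the element contains no image.
    If imgTgt.Length = 0 Then Exit Function

    AltOfNthClassElement = imgTgt(0).getAttribute("alt") & ""
End Function
```

Called as `rng.Value = AltOfNthClassElement(doc, "cellborder", 0)`, it returns an empty string instead of failing when the element or its image is absent.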
+0

Tried it, but no difference... I think I should share the whole code with you. I have edited the question, please check it once. – user3305327