2017-08-07 39 views
1

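VBA Selenium FindElementByXPath cannot find element

I have written a VBA macro that uses the Selenium Chrome driver to open a web page and scrape data. I have run into a few problems and would appreciate your advice.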

Code sample and result 1: with On Error Resume Next enabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition?page=4" 
On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot: result of code sample 1]

Code sample and result 2: with On Error Resume Next disabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition?page=4" 
'On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot: result of code sample 2, including the error]

Code sample and result 3: with On Error Resume Next enabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition" 
On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot: result of code sample 3]

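The first example returns all 74 items from the site, but without the prices, and it takes a long time: about two minutes.

The second example only writes the title into the first cell of the worksheet and then pops up an error.

The third example returns only 21 items and misses the price for items that have no "now" tag. The script runs very fast, in under 10 seconds.

Please advise how to return all 74 items with title, size, price, and href.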

+0

What exact error are you getting? StaleElement? –

+0

I'm not sure what you mean, since the error screenshot is attached to the second example. The first and third examples don't raise any errors. – Martin

+1

OK, thanks. I haven't worked with VB, but this is the approach I used to get around stale element references in Java. https://stackoverflow.com/questions/45434381/stale-object-reference-while-navigation-using-selenium/45435158#45435158 –

Answer

1

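The page you are dealing with uses lazy loading. That is, the items are not all loaded at once; instead, the rest are loaded as you scroll down. I used a small JavaScript snippet in the code below to scroll to the bottom until the page stops growing, which solves that problem. I hope this is the result you were looking for.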

Sub test_supplements_store()
    Dim driver As New ChromeDriver
    Dim post As Object
    Dim i As Long
    Dim PrevPageHeight As Variant, CurrentPageHeight As Variant
    Dim EndofPage As Boolean

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition"
    On Error Resume Next

    ' Scroll to the bottom repeatedly until the page height stops growing,
    ' i.e. until no further items are being lazy-loaded.
    Do While EndofPage = False
        PrevPageHeight = CurrentPageHeight
        CurrentPageHeight = driver.ExecuteScript("window.scrollTo(0, document.body.scrollHeight);var CurrentPageHeight=document.body.scrollHeight;return CurrentPageHeight;")
        driver.Wait 3000 ' give the newly requested items time to load
        If PrevPageHeight = CurrentPageHeight Then
            EndofPage = True
        End If
    Loop

    ' One row per product: title, size, price, link.
    For Each post In driver.FindElementsByXPath("//li[contains(@class,'prod')]")
        i = i + 1
        Cells(i, 1) = post.FindElementByXPath(".//a").Attribute("title")
        Cells(i, 2) = Split(post.FindElementByXPath(".//p[@class='size']").Text, ": ")(1)
        Cells(i, 3) = post.FindElementByXPath(".//p[@class='price']//span[@class='now']//span|.//p[@class='price']//span[@class='dynamictype-single']").Text
        Cells(i, 4) = post.FindElementByXPath(".//a").Attribute("href")
    Next post
End Sub
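
A side note on the `Split(...)(1)` call used for the size column: if the text contains no colon, indexing element 1 raises a "Subscript out of range" error, which `On Error Resume Next` hides, leaving the cell empty. Here is a minimal sketch of a more defensive helper in plain VBA. `TextAfterColon` is a hypothetical name introduced for illustration and is not part of the code above.

```vba
Option Explicit

' Hypothetical helper: returns the text after the first ":" in a label,
' trimmed of surrounding spaces. If there is no colon, the whole label
' is returned (trimmed) instead of raising an error.
Function TextAfterColon(ByVal label As String) As String
    Dim pos As Long
    pos = InStr(label, ":")
    If pos = 0 Then
        TextAfterColon = Trim$(label)
    Else
        TextAfterColon = Trim$(Mid$(label, pos + 1))
    End If
End Function
```

Inside the loop it could be used as `Cells(i, 2) = TextAfterColon(post.FindElementByXPath(".//p[@class='size']").Text)`, so a label in an unexpected format still ends up in the sheet.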
+0

You had another requirement that I hadn't noticed. Using XPath will solve the price issue. – SIM

+0

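Unfortunately, your code only returns the price for 21 of the items on the page. Also, I'm not sure how to return both the normal price and the new price of an item. – Martin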

+0

I didn't adjust the price part of your code. I was only trying to get all 74 items. – SIM
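
For the missing prices discussed above, one option is to search with the plural `FindElementsByXPath` and read the first match only if there is one, rather than relying on `On Error Resume Next` around a singular lookup. This is a rough sketch under assumptions: `FirstTextOrEmpty` is a hypothetical helper name, and the idea that the text of the whole `p[@class='price']` element contains both the normal and the reduced price has not been checked against the site. As far as I know, the singular `FindElementBy...` calls wait for the driver's implicit timeout before raising an error, which may be why the first example takes about two minutes; treat that as an assumption as well.

```vba
Option Explicit

' Hypothetical helper: returns the text of the first element in a
' collection returned by FindElementsBy..., or "" if it is empty.
' For Each is used to avoid depending on the collection's index base.
Function FirstTextOrEmpty(ByVal found As Object) As String
    Dim el As Object
    FirstTextOrEmpty = ""
    For Each el In found
        FirstTextOrEmpty = el.Text
        Exit Function
    Next el
End Function
```

In the loop, the price line could then read `Cells(i, 3) = FirstTextOrEmpty(post.FindElementsByXPath(".//p[@class='price']"))`. Items without a "now" span would still get whatever price text the element holds, and items with no price element at all would get an empty cell instead of an error.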