No output from Beautiful Soup find_all

import requests
from bs4 import BeautifulSoup

urla = 'https://www.tumblr.com/search/hello'
r = requests.get(urla)
soupa = BeautifulSoup(r.content, 'html.parser')

links = soupa.find_all("div", {"class": "header_mage_wrapper has_avatar"})
for link in links:
    print(link)

After I run this code, nothing breaks and I get exit code 0 (using PyCharm), but there is no output at all.

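If I take out {"class": "header_mage_wrapper has_avatar"} so that the call is just find_all("div"), it works fine and pulls out all the divs. I tried this code on a different website and had no problems. I'm sure it's something small that I don't know yet, since I've only been using BeautifulSoup for a day or two, but I can't track it down because no error is raised.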

Thanks!

2016-05-28 Sweetcheeks12354

I can't find 'header_mage_wrapper has_avatar' in the source of the page you linked to. Can you check? – minocha

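How did you determine that there are divs with those classes? I'm not saying there are no such classes, but 'header_mage_wrapper' looks like a misspelling of 'header_image_wrapper'. –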

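Most likely you are trying to parse content that is generated by JavaScript, which needs a different approach. Also, as Martijn said, you may have misspelled the class. –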

This will give you no output:

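import requests
from bs4 import BeautifulSoup

urla = 'https://www.tumblr.com/search/hello'
r = requests.get(urla)
# Name the parser explicitly so bs4 does not warn and pick one for you
soup = BeautifulSoup(r.text, 'html.parser')

# Correctly spelled class, yet the loop body still never runs
for link in soup.find_all('div', class_="header_image_wrapper has_avatar"):
    print(link.get('class'))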

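This is because get() does not fetch anything with the header_image_wrapper class. The lowest descendant it fetches is search_blog_row.

The header_image_wrapper you are looking for is loaded dynamically, based on your search.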
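One way to confirm this for yourself is to list every class that actually appears in the HTML you downloaded, and check whether the one you want is among them. Below is a minimal sketch of that check. SAMPLE_HTML is a made-up stand-in for r.text (it is not Tumblr's real markup, apart from the search_blog_row name mentioned above), and classes_in and count_matches are hypothetical helper names, not part of bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for r.text; NOT Tumblr's real markup.
SAMPLE_HTML = """
<div class="search_blog_row">
  <div class="blog_info">placeholder</div>
</div>
"""


def classes_in(html):
    """Return the set of all CSS class names present in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    found = set()
    for tag in soup.find_all(True):  # True matches every tag
        found.update(tag.get("class", []))  # class is a list of names
    return found


def count_matches(html, css_selector):
    """Count the elements matching a CSS selector in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.select(css_selector))


# If the class is missing here, find_all cannot return it either.
print("header_image_wrapper" in classes_in(SAMPLE_HTML))
print(count_matches(SAMPLE_HTML, "div.header_image_wrapper.has_avatar"))
```

With the real page you would pass r.text instead of SAMPLE_HTML. The CSS selector form matches both classes regardless of their order in the attribute, whereas a string such as "header_image_wrapper has_avatar" given to find_all is compared against the exact attribute value.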

So you could try a POST request, as shown here.

Instead, though, I would suggest using the Tumblr API to get the results.
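A rough sketch of what the API route might look like is below. It assumes the Tumblr API v2 /tagged endpoint, which looks up posts by tag rather than doing the full-text search of the web page, and it assumes you have registered an application to get an API key. The names tagged_url and fetch_tagged are hypothetical helpers, and the shape of the JSON reply (a list of posts under "response") is my understanding of the API, so check it against the official documentation.

```python
import requests
from urllib.parse import urlencode

API_BASE = "https://api.tumblr.com/v2"


def tagged_url(tag, api_key):
    """Build the URL for the (assumed) /tagged endpoint."""
    query = urlencode({"tag": tag, "api_key": api_key})
    return "{}/tagged?{}".format(API_BASE, query)


def fetch_tagged(tag, api_key):
    """Return the posts for a tag; assumes they sit under 'response'."""
    resp = requests.get(tagged_url(tag, api_key), timeout=10)
    resp.raise_for_status()  # fail loudly on a bad key or rate limit
    return resp.json()["response"]


# 'YOUR_API_KEY' is a placeholder, not a working key.
print(tagged_url("hello", "YOUR_API_KEY"))
```

Calling fetch_tagged("hello", your_key) would then give you structured data to loop over, with no HTML parsing involved.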

2016-05-28 06:39:17

I think you're right about the API. I'll learn how to use it. – Sweetcheeks12354

@Sweetcheeks12354 Great. –
