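Why can't I find this tag in Beautiful Soup?

I'm new to Python, or to any programming language for that matter, but I'm trying to scrape an item title from a website using the code below. It keeps printing "None", as if the title (or any other tag I substitute for it) doesn't exist. Why can't I find this tag in Beautiful Soup?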

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "https://www.roblox.com/catalog/?CatalogContext=1&Keyword=the%20item&SortAggregation=5&LegendExpanded=true&Category=2"

# Download the raw HTML that the server returns
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

# Look up the item's name container and print the link text inside it
ttt = page_soup.find("div", {"class":"CatalogItemName notranslate"})
item = ttt.a.text
print(item)

2017-06-21 drew p

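The content you are looking for is not in the HTTP response received from the server. It is generated by JavaScript after the page has loaded.

When doing scraping tasks, you should always load the website in your browser with JavaScript disabled, so that you get a better idea of what the raw HTML content looks like.

Finally, you can solve this problem by using a scraping tool that supports JavaScript, such as Selenium.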
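The same check can be roughed out in Python without opening a browser: download the page and see whether the class name occurs anywhere in the raw response. This is only a sketch. The helper names `fetch_raw_html` and `marker_in_raw_html` are made up for this example, and the class name is the one from the question, which the site may no longer use.

```python
from urllib.request import urlopen


def fetch_raw_html(url: str) -> bytes:
    """Download the page exactly as the server sends it (no JavaScript is run)."""
    with urlopen(url) as response:
        return response.read()


def marker_in_raw_html(raw_html: bytes, marker: str) -> bool:
    """Return True if `marker` occurs anywhere in the undecoded response body."""
    return marker.encode("utf-8") in raw_html
```

Calling `marker_in_raw_html(fetch_raw_html(my_url), "CatalogItemName")` with the URL from the question should tell you which case you are in. If it returns `False`, the element is most likely added later by JavaScript and Beautiful Soup never gets to see it.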
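A minimal sketch of the Selenium route, assuming Chrome and a matching driver are available on the machine. The function names `fetch_rendered_html` and `first_item_name` are invented for illustration, and the selector is taken from the class name in the question, so it may need adjusting to the page's current markup.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url: str, css_selector: str, timeout: float = 10.0) -> str:
    """Load `url` in headless Chrome, wait for `css_selector`, return the rendered HTML."""
    # Imported here so the parsing helper below works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until JavaScript has inserted at least one matching element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()


def first_item_name(rendered_html: str):
    """Return the link text of the first item name, or None if nothing matches."""
    page_soup = BeautifulSoup(rendered_html, "html.parser")
    link = page_soup.select_one("div.CatalogItemName a")
    return link.get_text(strip=True) if link else None
```

Something like `first_item_name(fetch_rendered_html(my_url, "div.CatalogItemName"))` would then replace the `urlopen` call in the question. Returning `None` when nothing matches also avoids calling `.a.text` on a missing tag.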

2017-06-21 07:08:47 VMRuiz

When you want to find an element that uses multiple classes, I think the following is the convention.

soup.find("div", {'class':['CatalogItemName', 'notranslate']})
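For comparison, here is a small sketch on made-up HTML (the three `div` elements are not from the real page) showing how the different ways of matching classes behave in Beautiful Soup, as far as I understand it: a single string matches the exact attribute value, a list matches a tag carrying any one of the listed classes, and a CSS selector requires all of them in any order.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only to show the matching rules.
HTML = """
<div class="notranslate"><a>only one class</a></div>
<div class="notranslate CatalogItemName"><a>both, reversed order</a></div>
<div class="CatalogItemName notranslate"><a>both, original order</a></div>
"""
page_soup = BeautifulSoup(HTML, "html.parser")

# Exact string: the class attribute must read exactly like this, order included.
exact = page_soup.find("div", {"class": "CatalogItemName notranslate"})

# List: matches the first tag that has ANY of the listed classes.
any_of = page_soup.find("div", {"class": ["CatalogItemName", "notranslate"]})

# CSS selector: the tag must have ALL of the classes, in any order.
all_of = page_soup.select_one("div.CatalogItemName.notranslate")

print(exact.a.text)
print(any_of.a.text)
print(all_of.a.text)
```

None of these variants helps if the element is missing from the HTML that was downloaded, so it is worth checking that first.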

2017-06-21 07:07:16

If you want the title of the HTML page, try this:

import urllib.request
from bs4 import BeautifulSoup

url = "https://www.roblox.com/catalog/?CatalogContext=1&Keyword=the%20item&SortAggregation=5&LegendExpanded=true&Category=2"
page = urllib.request.urlopen(url)

soup = BeautifulSoup(page, 'html.parser')

# Prints the page's <title> tag
print(soup.title)

2017-06-21 07:57:31

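This does not answer the question that was asked. He is already loading the page content into BeautifulSoup... – Baldrickk

@Baldrickk He said he was trying to scrape a title from the website but couldn't do it –

He is trying to get "a title", not the page title. – VMRuiz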
