Python HTML parsing via CSS selectors

I am trying to collect the plain text (the business title) from the following: <div class = "business-detail-text> <h1 class = "business-title" style="position:relative;" itemprop="name">H&H Construction Co.</h1>

What is the best way to do this? The itemprop and style attributes are where I am stuck. I know I can use soup.select, but so far I have had no luck.

Here is my code so far:

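import requests
from bs4 import BeautifulSoup

def bbb_profiles(profile_urls):
    # Fetch the profile page and parse it
    sauce_code = requests.get(profile_urls)
    plain_text = sauce_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    # Print the text of every <h1 class="business-title">
    for profile_info in soup.findAll("h1", {"class": "business-title"}):
        print(profile_info.string)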


2016-01-03 n0de

Is this what you need?

>>> from bs4 import BeautifulSoup 
>>> txt='''<div class = "business-detail-text"> 
      <h1 class = "business-title" style="position:relative;" itemprop="name">H&H Construction Co.</h1></div>''' 
>>> soup = BeautifulSoup(txt, "html.parser") 
>>> soup.find_all('h1', 'business-title') 
[<h1 class="business-title" itemprop="name" style="position:relative;">H&amp;H; Construction Co.</h1>] 
>>> soup.find_all('h1', 'business-title')[0].text 
u'H&H; Construction Co.'

I noticed that your HTML is missing the closing " after business-detail-text and the </div> near the end.
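
Since the question mentions soup.select, here is a minimal sketch of the same lookup written as a CSS selector. It assumes the corrected markup (closing quote and </div> restored) and writes the ampersand as &amp; so the result does not depend on how the parser treats a bare "&". The variable names are only illustrative. The style attribute does not need to appear in the selector at all; the class and the itemprop attribute are enough to match the element.

```python
from bs4 import BeautifulSoup

# Sample markup, assumed to be the corrected form of the snippet in the question
html = '''<div class="business-detail-text">
<h1 class="business-title" style="position:relative;" itemprop="name">H&amp;H Construction Co.</h1>
</div>'''

soup = BeautifulSoup(html, "html.parser")

# Class selector combined with an attribute selector on itemprop
titles = [h1.get_text(strip=True)
          for h1 in soup.select('h1.business-title[itemprop="name"]')]

# select_one returns the first match, or None if nothing matches
first = soup.select_one('div.business-detail-text h1.business-title')

print(titles)
if first is not None:
    print(first.get_text(strip=True))
```

If select returns an empty list on the real page, it may be worth checking that the title is present in the downloaded HTML and not filled in later by JavaScript.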


2016-01-03 18:00:10

I'll try it. Thanks! – n0de
