I'm using Python 3.5 and trying to scrape a list of URLs (all on the same website). The code is as follows:
import urllib.request
from bs4 import BeautifulSoup

url_list = ['URL1',
            'URL2', 'URL3']

def soup():
    for url in url_list:
        sauce = urllib.request.urlopen(url)
        for things in sauce:
            soup_maker = BeautifulSoup(things, 'html.parser')
            return soup_maker

# Scraping
def getPropNames():
    for propName in soup.findAll('div', class_="property-cta"):
        for h1 in propName.findAll('h1'):
            print(h1.text)

def getPrice():
    for price in soup.findAll('p', class_="room-price"):
        print(price.text)

def getRoom():
    for theRoom in soup.findAll('div', class_="featured-item-inner"):
        for h5 in theRoom.findAll('h5'):
            print(h5.text)

for soups in soup():
    getPropNames()
    getPrice()
    getRoom()
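So far, if I print soup, getPropNames, getPrice or getRoom on their own, they seem to work. But I can't get the script to go through each of the URLs and print getPropNames, getPrice and getRoom for every one.
I've only been learning Python for a few months, so I'd greatly appreciate any help with this!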
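A few things in the code above look like the likely cause. Inside the getters, `soup` refers to the function `soup` itself rather than to a parsed page, so `soup.findAll(...)` has nothing to search. The `return` inside `soup()` also hands back at most one parsed fragment instead of one page per URL, and `for things in sauce` feeds the parser one line of the response at a time. Below is a minimal sketch of one possible restructuring, assuming the class names from the question match the pages. The function names (`make_soup`, `get_prop_names`, `get_prices`, `get_rooms`, `scrape`, `scrape_all`) are hypothetical and chosen only for illustration. Each getter takes the parsed page as an argument and returns a list instead of printing.

```python
import urllib.request
from bs4 import BeautifulSoup


def make_soup(url):
    """Download one page and parse the whole response body at once."""
    with urllib.request.urlopen(url) as response:
        return BeautifulSoup(response.read(), 'html.parser')


def get_prop_names(soup):
    """Text of every <h1> inside a div with class 'property-cta'."""
    return [h1.text
            for block in soup.find_all('div', class_='property-cta')
            for h1 in block.find_all('h1')]


def get_prices(soup):
    """Text of every <p> with class 'room-price'."""
    return [p.text for p in soup.find_all('p', class_='room-price')]


def get_rooms(soup):
    """Text of every <h5> inside a div with class 'featured-item-inner'."""
    return [h5.text
            for item in soup.find_all('div', class_='featured-item-inner')
            for h5 in item.find_all('h5')]


def scrape(soup):
    """Collect all three lists from one parsed page."""
    return {
        'names': get_prop_names(soup),
        'prices': get_prices(soup),
        'rooms': get_rooms(soup),
    }


def scrape_all(url_list):
    """Fetch each URL in turn and print what was found on it."""
    for url in url_list:
        print(url, scrape(make_soup(url)))
```

With real addresses in place of the placeholders, calling `scrape_all(url_list)` would visit each URL once and print its results. Keeping the download in `make_soup` separate from the getters also means the getters can be tried on a saved HTML string without touching the network.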
Thanks Sebastian Opałczyński, I'll take that on board, try to get my head around it, and let you know the results! – Maverick