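I need some help. My output looks wrong. How can I append the dept, job_title, and job_location values correctly? The dept values also contain HTML tags. How do I remove those tags? Python append() and removing HTML tags.
My code: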
import requests
from bs4 import BeautifulSoup
from pprint import pprint

response = requests.get("http://hortonworks.com/careers/open-positions/")
soup = BeautifulSoup(response.text, "html.parser")

jobs = []

div_main = soup.select("div#careers_list")

for div in div_main:
    # find_all returns a list of Tag objects, not plain text
    dept = div.find_all("h4", class_="department_title")
    div_career = div.find_all("div", class_="career")
    title = []
    location = []
    for dv in div_career:
        job_title = dv.find("div", class_="title").get_text().strip()
        title.append(job_title)
        job_location = dv.find("div", class_="location").get_text().strip()
        location.append(job_location)

    # Built once per outer div, after the inner loop has finished
    job = {
        "job_location": location,
        "job_title": title,
        "job_dept": dept
    }
    jobs.append(job)
pprint(jobs)
It should look like this:
{'job_dept': 'Consulting',
 'job_location': 'Chicago, IL',
 'job_title': 'Sr. Consultant - Central'}
One value per variable.
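One possible way to get that shape is sketched here, under assumptions: the markup in SAMPLE_HTML is hypothetical (the real page may be structured differently), extract_jobs is an illustrative helper name, and each department heading is assumed to precede its job entries in document order. It builds one dict per div.career and calls get_text() on the heading so that only the text is stored, not the tag.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<div id="careers_list">
  <h4 class="department_title">Consulting</h4>
  <div class="career">
    <div class="title">Sr. Consultant - Central</div>
    <div class="location">Chicago, IL</div>
  </div>
  <div class="career">
    <div class="title">Sr. Consultant - East</div>
    <div class="location">New York, NY</div>
  </div>
  <h4 class="department_title">Engineering</h4>
  <div class="career">
    <div class="title">Software Engineer</div>
    <div class="location">Santa Clara, CA</div>
  </div>
</div>
"""


def extract_jobs(html):
    """Return one dict per job posting, each with plain-text values."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for career in soup.select("div#careers_list div.career"):
        # Nearest department heading that appears before this job entry.
        dept_tag = career.find_previous("h4", class_="department_title")
        jobs.append({
            # get_text() returns only the text, so no HTML tags are kept.
            "job_dept": dept_tag.get_text(strip=True) if dept_tag else None,
            "job_title": career.find("div", class_="title").get_text(strip=True),
            "job_location": career.find("div", class_="location").get_text(strip=True),
        })
    return jobs


jobs = extract_jobs(SAMPLE_HTML)
```

If the departments on the real page are wrapped in their own container elements instead, looping over those containers and reading the heading once per container would be the more natural choice.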
Please show your output...
The output shows job_dept: all departments, job_location: all locations, job_title: all titles.
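That output is consistent with where the dict is created: the lists are filled completely first, and a single dict is then built from the whole lists. The following is a minimal pure-Python sketch of the difference, using hypothetical stand-in values rather than real scraped data.

```python
# Hypothetical values standing in for the scraped text.
dept = "Consulting"
titles = ["Sr. Consultant - Central", "Sr. Consultant - East"]
locations = ["Chicago, IL", "New York, NY"]

# Pattern from the question: one dict that holds the complete lists.
aggregated = [{"job_dept": dept, "job_title": titles, "job_location": locations}]

# One dict per job: pair the values up and append inside the loop.
per_job = []
for title, location in zip(titles, locations):
    per_job.append({"job_dept": dept, "job_title": title, "job_location": location})
```

Here aggregated contains a single record whose values are lists, while per_job contains one record per title and location pair.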