2013-07-31

I want to use BeautifulSoup to get the school name, "Perkins College...", from this link. How do I get the text from an <a> element?

The code I am using does not return anything.

school = soup.find('a','profiles-show-school-name-sm-link') 
print 'school: ', school 
print 'school.text: ', school.text 

Output:

school: <a class="profiles-show-school-name-sm-link" href="/profiles/show/online-degrees/stephen-f-austin-state-university/perkins-college-of-education-undergraduate/395/5401"> 
<img border="0" src="/images/profiles/243x60/4613/degrees/undergraduate-certificate-in-hospitality-administration.png"/> 
</a> 
school.text: 

Can anyone suggest a BeautifulSoup approach that extracts the school name (not the URL)? Thanks!

Are you looking for a BeautifulSoup implementation that extracts the school name? If you want the href, I believe school['href'] will work. – sihrc
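For what it's worth, the output in the question suggests why school.text is empty: the <a> wraps only an <img>, so there is no text node to return, and the name has to come from an attribute such as href. Below is a minimal sketch of that observation in Python 3. It uses the standard-library html.parser instead of BeautifulSoup so it runs without extra installs; AnchorInspector is a hypothetical helper written for this illustration, not part of any library.

```python
from html.parser import HTMLParser

# Markup copied from the output shown in the question.
HTML = (
    '<a class="profiles-show-school-name-sm-link" '
    'href="/profiles/show/online-degrees/stephen-f-austin-state-university/'
    'perkins-college-of-education-undergraduate/395/5401">\n'
    '<img border="0" src="/images/profiles/243x60/4613/degrees/'
    'undergraduate-certificate-in-hospitality-administration.png"/>\n'
    '</a>'
)


class AnchorInspector(HTMLParser):
    """Hypothetical helper: records the href and any text inside the first <a>."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.href = None
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Remember the href of the first anchor we encounter.
        if tag == "a" and self.href is None:
            self.in_anchor = True
            self.href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        # Collect text nodes that appear between <a> and </a>.
        if self.in_anchor:
            self.text_parts.append(data)


inspector = AnchorInspector()
inspector.feed(HTML)
inspector.close()

# Only whitespace sits between the tags, so nothing is left after stripping.
anchor_text = "".join(inspector.text_parts).strip()
print(repr(anchor_text))
print(inspector.href)
```

The same reasoning applies to BeautifulSoup: with no text inside the tag, the href (or the image's src) is the only place the name appears in this snippet.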

Answers

school = soup.find('a','profiles-show-school-name-sm-link') 
url = school['href'] 

Assuming the school is always at the same position in the URL:

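# Strip everything up to and including the fifth "/" so the school slug comes first.
for i in range(5):
    url = url[url.find("/")+1:]
schoolname = url[:url.find("/")]
# Turn the hyphenated slug into a title-cased name.
print " ".join(schoolname.split("-")).title()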

Output:

Perkins College Of Education Undergraduate 
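As an alternative to trimming the string in a loop, the same idea can be sketched by splitting the path once and picking a segment by position. This is only an illustration in Python 3, under the same assumption that the school slug always sits at the same position; path_segment_title is a hypothetical helper name, and the href is the one from the question.

```python
def path_segment_title(href, index):
    """Return the index-th path segment of href as a title-cased phrase.

    Hypothetical helper: assumes the segment of interest is always at the
    same position in the path.
    """
    # Strip the leading/trailing "/" so no empty segments appear at the ends.
    segments = href.strip("/").split("/")
    return segments[index].replace("-", " ").title()


# href taken from the question's output.
href = (
    "/profiles/show/online-degrees/stephen-f-austin-state-university/"
    "perkins-college-of-education-undergraduate/395/5401"
)

school_name = path_segment_title(href, 4)
print(school_name)
```

Unlike the loop, this leaves the original href untouched, so other segments can be read from it afterwards by passing a different index.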

To get the university:

url = school['href']  # start again from the full href; the loop above consumed it
for i in range(4):
    url = url[url.find("/")+1:]
university = url[:url.find("/")]
print " ".join(university.split("-")).title()

Output:

Stephen F Austin State University 

This actually gets the school within the university. How do you get the university, after the "in"? Thanks! – goldisfine

Where? – sihrc

If you go to this URL: http://www.geteducated.com/profiles/show/online-degrees/stephen-f-austin-state-university/perkins-college-of-education-undergraduate/395/5401/undergraduate-certificate-in-hospitality-administration – goldisfine
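For a full URL like the one in this comment, one possible sketch (Python 3) is to let urllib.parse isolate the path first, then read the university, school, and degree from fixed positions. The layout profiles/show/online-degrees/<university>/<school>/<id>/<id>/<degree> is an assumption inferred from this single URL, and parse_profile_url is a hypothetical helper name; other pages on the site may not follow the same pattern.

```python
from urllib.parse import urlsplit


def parse_profile_url(url):
    """Hypothetical helper: pull readable names out of a profile URL.

    Assumes the path layout seen in this thread:
    profiles/show/online-degrees/<university>/<school>/<id>/<id>/<degree>
    """
    path = urlsplit(url).path
    # Drop empty pieces produced by leading or trailing slashes.
    segments = [segment for segment in path.split("/") if segment]

    def pretty(slug):
        # "stephen-f-austin-state-university" -> "Stephen F Austin State University"
        return slug.replace("-", " ").title()

    return {
        "university": pretty(segments[3]),
        "school": pretty(segments[4]),
        "degree": pretty(segments[-1]),
    }


url = (
    "http://www.geteducated.com/profiles/show/online-degrees/"
    "stephen-f-austin-state-university/"
    "perkins-college-of-education-undergraduate/395/5401/"
    "undergraduate-certificate-in-hospitality-administration"
)

info = parse_profile_url(url)
print(info["university"])
print(info["school"])
print(info["degree"])
```

Parsing the URL first means the scheme and host do not shift the segment positions, so the same indices work for both the relative href and the absolute URL.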