2011-10-08

BeautifulSoup: printing the string from findAll results

I am trying to get the thread titles from /r/AskReddit. The code below prints None instead of the thread title.

from BeautifulSoup import BeautifulSoup 
import urllib2, json 

site='http://www.reddit.com/r/AskReddit/' 

soup=BeautifulSoup(urllib2.urlopen(site)) 

questions=soup.findAll('p',{"class":"title"}) 


for i in questions: 
     print i.string 
     break 

Answer


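The title is in the string attribute of the a tag, not of the p tag. The p element has several children (the a tag, a text node, and the span), so its .string is None. Also, note the trailing space after title in the a tag's class attribute: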

<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p> 

questions=soup.findAll('a',{"class":"title "}) 

I worked this out by looking at the HTML fragment above.
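
To see where the title text sits in that fragment without any third-party dependency, here is a minimal sketch that uses Python 3's standard-library html.parser instead of BeautifulSoup. The class name TitleLinkParser and the constant FRAGMENT are made up for this illustration. The class attribute is split on whitespace, so the trailing space in "title " does not matter here. That is a property of this sketch, not a claim about how BeautifulSoup matches attributes.

```python
from html.parser import HTMLParser

# HTML fragment copied from the answer above.
FRAGMENT = (
    '<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/'
    'comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">'
    "What's the best face you can pull? Before and after please.</a> "
    '<span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">'
    "self.AskReddit</a>)</span></p>"
)


class TitleLinkParser(HTMLParser):
    """Hypothetical helper: collects the text of every <a> whose class list contains "title"."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # holds text chunks while inside a title link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Split on whitespace so "title " and "title" are treated alike.
            classes = (dict(attrs).get("class") or "").split()
            if "title" in classes:
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


parser = TitleLinkParser()
parser.feed(FRAGMENT)
parser.close()

for title in parser.titles:
    print(title)
```

Only the first a tag is collected; the second one, inside the span, has no class attribute and is skipped.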