2013-02-06 70 views

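I'm trying to scrape a website with Beautiful Soup. I can navigate to the class object, but I can't get to the next level down to retrieve the text I want.

So far I have: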

soup = BeautifulSoup(urllib2.urlopen('URL...').read()) 

comment = soup('div', {'class' : 'PanelDarkBackground'}) 
print comment 

This just outputs the whole class (below). I only want to extract 0-0, which is in the tr > td id="event" part of the markup.

Any suggestions...?

[<div class="PanelDarkBackground" id="Event-Basic-Info" style="margin-bottom: 10px"> 
<div style="height: 70px; width: 100%;"> 
<div style="height: 70px; width: 70px; float: left; background-color: white"> 
<img height="70" src="ss" width="70"/> 
</div> 
<div style="width: 450px; float: left; height: 70px; display: table"> 
<table border="0" cellpadding="0" cellspacing="0" style="font-family: tahoma; font-size:  18pt; font-weight: bold; color: white;" width="450px"> 

    <tr> 
     <td align="center" height="70" style="font-family: tahoma; font-size: 18pt; font-weight: bold; color: white;" valign="middle" width="197">seveal</td> 
     <td align="center" id="event" style="font-family: tahoma; font-size: 18pt; font- weight: bold; color: white;" valign="middle">0-0</td> 
     <td align="center" style="font-family: tahoma; font-size: 18pt; font-weight: bold; color: white;" valign="middle" width="197">seveal</td> 
    </tr> 
</table> 
</div> 
<div style="height: 70px; width: 70px; float: right; background-color: white"> 
<img height="70" src="" width="70"/> 
</div> 
</div> 
</div>] 

Why not just search by the ID? It should be unique on each page – Vor

Answers

Go straight to the td:

print soup('td',{'id':'event'}) 

To get just the contents of the td, you can do:

print soup('td',{'id':'event'})[0].contents[0] 

Thanks!!! That works, but it returns the whole line, and I only want 0-0 ...? – DavidJB
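
On the follow-up about getting only the text: one possible approach is to look the cell up by its id and ask for its text rather than printing the tag. The sketch below is not from the thread. It assumes Beautiful Soup 4 (the `bs4` package) on Python 3 instead of the Python 2 / `urllib2` setup in the question, and it parses a trimmed-down inline copy of the markup (the `html` string, with styles removed) instead of fetching the real page.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the markup from the question (inline styles omitted).
html = """
<div class="PanelDarkBackground" id="Event-Basic-Info">
  <table>
    <tr>
      <td align="center">seveal</td>
      <td align="center" id="event">0-0</td>
      <td align="center">seveal</td>
    </tr>
  </table>
</div>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# Look the cell up by its id, then take only its text, not the whole tag.
cell = soup.find("td", id="event")
score = cell.get_text(strip=True) if cell is not None else None
print(score)

# The same cell can also be reached by drilling down from the outer div.
panel = soup.find("div", class_="PanelDarkBackground")
nested = panel.find("td", id="event").get_text(strip=True)
```

`find` returns a single tag (or `None` when nothing matches), so there is no list to index into, and `get_text(strip=True)` returns the cell's text with surrounding whitespace removed.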