2017-10-09 49 views
0

Extracting a value from XML using Python, bs4 and lxml

How can I extract the number of listeners, <listeners>10</listeners>, from the XML file below? My code is not working.

import bs4 
import urllib2 
import lxml 
bs4.BeautifulSoup(urllib2.urlopen('http://admin:[email protected]:8382/admin/').read(), 'lxml')  
SERVER = 'http://192.168.0.31:8382/admin/' 
authinfo = urllib2.HTTPPasswordMgrWithDefaultRealm() 
authinfo.add_password(None, SERVER, 'admin', 'mypassword') 
page = 'http://192.168.0.31:8382/admin/' 
handler = urllib2.HTTPBasicAuthHandler(authinfo) 
myopener = urllib2.build_opener(handler) 
opened = urllib2.install_opener(myopener) 
output = urllib2.urlopen(page) 
print output.read() 
soup = bs4.BeautifulSoup(output.read(), 'lxml') 
print soup.find('listeners') 

The XML is as follows:

<icestats> 
<admin>[email protected]</admin> 
<banned_IPs>0</banned_IPs> 
<build>20140902200316</build> 
<client_connections>289</client_connections> 
<clients>2</clients> 
<connections>291</connections> 
<file_connections>13</file_connections> 
<host>localhost</host> 
<listener_connections>0</listener_connections> 
<listeners>10</listeners> 
<location>Earth</location> 
<outgoing_kbitrate>0</outgoing_kbitrate> 
<server_id>Icecast 2.3.3-kh11</server_id> 
<server_start>08/Oct/2017:08:43:08 +1100</server_start> 
<source_client_connections>1</source_client_connections> 
<source_relay_connections>0</source_relay_connections> 
<source_total_connections>1</source_total_connections> 
<sources>1</sources> 
<stats>0</stats> 
<stats_connections>0</stats_connections> 
<stream_kbytes_read>185119</stream_kbytes_read> 
<stream_kbytes_sent>0</stream_kbytes_sent> 
<source mount="/listen.mp3"> 
<audio_codecid>2</audio_codecid> 
<audio_info>bitrate=60</audio_info> 
<bitrate>60</bitrate> 
<connected>42056</connected> 
<genre>Islam</genre> 
<incoming_bitrate>35976</incoming_bitrate> 
<listener_connections>0</listener_connections> 
<listener_peak>0</listener_peak> 
<listeners>0</listeners> 
<listenurl>http://localhost:8382/listen.mp3</listenurl> 
<max_listeners>unlimited</max_listeners> 
<mpeg_channels>2</mpeg_channels> 
<mpeg_samplerate>22050</mpeg_samplerate> 
<outgoing_kbitrate>0</outgoing_kbitrate> 
<public>1</public> 
<queue_size>64523</queue_size> 
<server_description>Quran Kareem Radio</server_description> 
<server_name>Quran Kareem Radio</server_name> 
<server_type>audio/mpeg</server_type> 
<server_url>http://qkradio.com.au</server_url> 
<slow_listeners>0</slow_listeners> 
<source_ip>139.218.241.112</source_ip> 
<stream_start>08/Oct/2017:08:43:16 +1100</stream_start> 
<total_bytes_read>189563392</total_bytes_read> 
<total_bytes_sent>0</total_bytes_sent> 
<total_mbytes_sent>0</total_mbytes_sent> 
<user_agent>instreamer</user_agent> 
</source> 
</icestats> 

Answers

0

Use this:

soup = BeautifulSoup(output.read())
soup.select('listeners')
# [<listeners>10</listeners>, <listeners>0</listeners>]

What I get is []? – Ossama


Please check your soup object. Does it contain any data? This should work. –


Using print output.read() in the code above prints the XML file. – Ossama
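
One likely explanation for the empty result, offered as an assumption rather than a confirmed diagnosis: the question's code calls output.read() twice, and a response object is a file-like stream that can only be consumed once. The sketch below uses io.BytesIO as a stand-in for the object returned by urllib2.urlopen() (no network access is involved, and the short XML string is a trimmed illustration of the real document).

```python
import io

# Stand-in for the HTTP response; like a real response object,
# it is a stream that is consumed as it is read.
response = io.BytesIO(b"<icestats><listeners>10</listeners></icestats>")

first = response.read()   # returns the whole body
second = response.read()  # the stream is already at EOF, so this is empty

print(first)
print(second)

# Safer pattern: read once, keep the bytes, and reuse them for
# both printing and parsing.
response = io.BytesIO(b"<icestats><listeners>10</listeners></icestats>")
data = response.read()
print(data)
# data can now be passed to the parser as many times as needed.
```

If that is what happens here, storing the result of the first read() in a variable and passing that variable to BeautifulSoup should give the parser the full document.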

1

Try this:

soup = BeautifulSoup(output.read(), 'xml') 
for value in soup.find_all('listeners'): 
    print(value.get_text()) 

How would this work with multiple 'listeners' elements? Can you extend your code? – Risadinha


Sorry, of course you need to use find_all. Like this: 'soup = BeautifulSoup(page, 'xml')' followed by 'for value in soup.find_all('listeners'): print(value.get_text())' – komito


I corrected my answer. – komito
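
Since the document contains two <listeners> elements (a server-wide one directly under <icestats> and one inside each <source>), it may help to tell them apart instead of collecting every match. The following is a minimal sketch that swaps in the standard library's xml.etree.ElementTree rather than bs4; the xml_text string is a trimmed copy of the XML from the question, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question.
xml_text = """<icestats>
<listeners>10</listeners>
<sources>1</sources>
<source mount="/listen.mp3">
<listeners>0</listeners>
</source>
</icestats>"""

root = ET.fromstring(xml_text)

# A bare tag name only matches direct children of <icestats>,
# so this picks the server-wide total and not the per-mount value.
total = int(root.findtext("listeners"))

# One entry per <source>, keyed by its mount attribute.
per_mount = {
    source.get("mount"): int(source.findtext("listeners"))
    for source in root.findall("source")
}

print(total)
print(per_mount)
```

With bs4, a comparable effect should be achievable by searching from a specific parent element rather than from the whole soup.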