2017-08-26

Extracting data from XML with Python 3

I want to extract numbers from a series of XML data.

The XML data looks like this:

<commentinfo>
    <note>This file contains the sample data for testing</note>
    <comments>
        <comment>
            <name>Romina</name>
            <count>97</count>
        </comment>

and so on, each entry with a new name and count.

My code is:

import urllib.request, urllib.parse, urllib.error 
import xml.etree.ElementTree as ET 

url = 'http://py4e-data.dr-chuck.net/comments_42.xml' 

uh = urllib.request.urlopen(url) 
data = uh.read() 
# print(data) 

tree = ET.fromstring(data) 
# print('Name:',tree.find('count').text) 
lst = tree.findall('comments/comment/count') 
# print(len(lst)) 
# print(lst) 
# x1 = result[1].find('comment') 

# for item in lst: 
#  print('Count', item.find('count').text) 

counts = tree.findall('.//count') 
print(counts) 

When I print counts, I get a longer version of this:

<Element 'count' at 0x000000000A09FB88>, <Element 'count' at 0x000000000A09FC78>, <Element 'count' at 0x000000000A09FD68>, <Element 'count' at 0x000000000A09FE58>, <Element 'count' at 0x000000000A09FF48>, <Element 'count' at 0x000000000A0A3098>] 

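I'm very new to this, so I don't understand why I'm getting these hexadecimal numbers, and I don't know how to extract the actual numbers.

I hope someone can help.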

Does 'counts.text' do what you need?

Answers

Just iterate over the list and print each element's text.

import urllib.request
import xml.etree.ElementTree as ET

url = 'http://py4e-data.dr-chuck.net/comments_42.xml'

# Download the XML document as bytes
uh = urllib.request.urlopen(url)
data = uh.read()

# Parse the bytes; tree is the root element of the document
tree = ET.fromstring(data)

# Find every <count> element anywhere below the root
counts = tree.findall('.//count')

# Each item is an Element object; its .text attribute holds the number as a string
for each in counts:
    print(each.text)
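
For anyone who wants to try this without a network connection, here is a minimal offline sketch of the same idea. The inline XML string is an assumption that mirrors the structure shown in the question; the second comment (name "Example", count 3) is made up purely for illustration and is not from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the structure in the question; the second
# <comment> entry is invented for illustration only.
SAMPLE = """<commentinfo>
  <note>This file contains the sample data for testing</note>
  <comments>
    <comment>
      <name>Romina</name>
      <count>97</count>
    </comment>
    <comment>
      <name>Example</name>
      <count>3</count>
    </comment>
  </comments>
</commentinfo>"""

tree = ET.fromstring(SAMPLE)

# findall returns a list of Element objects. Printing the list shows each
# element's repr (tag name plus memory address), not the content in the tag.
counts = tree.findall('.//count')

# .text holds the string between the tags; int() converts it to a number.
numbers = [int(c.text) for c in counts]
print(numbers)  # [97, 3]
```

The hexadecimal values in the question's output are just the memory addresses in each Element's default repr. Note that 'counts.text' would raise an AttributeError, because counts is a plain list; .text exists on the individual elements inside it.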

Great! Thank you.