Read an XML file and get its attribute values in Python

I have an XML file like this:

<domain type='kmc' id='007'> 
    <name>virtual bug</name> 
    <uuid>66523dfdf555dfd</uuid> 
    <os> 
    <type arch='xintel' machine='ubuntu'>hvm</type> 
    <boot dev='hd'/> 
    <boot dev='cdrom'/> 
    </os> 
    <memory unit='KiB'>524288</memory> 
    <currentMemory unit='KiB'>270336</currentMemory> 
    <vcpu placement='static'>10</vcpu>

Now I want to parse this and get its attribute values. For example, I want to get the uuid field. What is the proper way to get it in Python?

2012-09-05 S.Ali

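What have you tried? A Google search for "python xml" turns up quite a few genuinely useful results that should point you in the right direction. – Blender

There are plenty of examples, but none of them point in the direction I want to go. I want to get attribute values. The examples I have seen either convert something to an XML file or convert an XML file to another form. –

Here is an lxml snippet that extracts an attribute as well as the element text (your question was a bit ambiguous about which one you need, so I have included both):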

from lxml import etree
doc = etree.parse(filename)  # filename: path to the XML file

memoryElem = doc.find('memory')
print(memoryElem.text)         # element text
print(memoryElem.get('unit'))  # attribute

You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good choice. Here is the equivalent code using minidom; judge for yourself which is nicer:

import xml.dom.minidom as minidom
doc = minidom.parse(filename)  # filename: path to the XML file

memoryElem = doc.getElementsByTagName('memory')[0]
print(''.join(node.data for node in memoryElem.childNodes))  # element text
print(memoryElem.getAttribute('unit'))                       # attribute

lxml looks like the winner to me.

2012-09-06 04:55:27

This approach is also compatible with [`xml.etree.ElementTree`](https://docs.python.org/library/xml.etree.elementtree.html), which is included with Python 2 and 3. –

With etree, possibly from lxml:
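To illustrate the comment above, here is a minimal sketch of the same lookup using only the standard library's `xml.etree.ElementTree`. The variable names and the shortened inline document are assumptions for illustration; the closing `</domain>` tag has been added so the string parses.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document, with </domain> added (assumption).
DOMAIN_XML = """<domain type='kmc' id='007'>
    <name>virtual bug</name>
    <uuid>66523dfdf555dfd</uuid>
    <memory unit='KiB'>524288</memory>
</domain>"""

# For a file on disk, ET.parse(path).getroot() returns the same kind of element.
root = ET.fromstring(DOMAIN_XML)

memory_elem = root.find('memory')
memory_text = memory_elem.text          # element text
memory_unit = memory_elem.get('unit')   # attribute value
domain_id = root.get('id')              # attribute on the root element

print(memory_text, memory_unit, domain_id)
```

`find` returns `None` when no matching child exists, so real code would probably want to check for that before reading `.text`.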

from lxml import etree

root = etree.XML(MY_XML)  # MY_XML: string holding the XML document
uuid = root.find('uuid')
print(uuid.text)

2012-09-05 21:34:50

Wouldn't minidom be a good choice? What do you think? –

I would use lxml with the XPath //uuid
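As a rough sketch of the XPath idea, the example below uses the limited XPath subset supported by the standard library's `xml.etree.ElementTree` rather than lxml, so that it runs without extra packages; that substitution, the variable names, and the trimmed document are assumptions. With lxml, the corresponding call would be `root.xpath('//uuid')`.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document, with </domain> added (assumption).
DOMAIN_XML = """<domain type='kmc' id='007'>
    <uuid>66523dfdf555dfd</uuid>
    <os>
        <type arch='xintel' machine='ubuntu'>hvm</type>
        <boot dev='hd'/>
        <boot dev='cdrom'/>
    </os>
</domain>"""

root = ET.fromstring(DOMAIN_XML)

# './/uuid' searches all descendants of the root for a <uuid> element.
uuid_text = root.findtext('.//uuid')

# findall returns every match, which suits the repeated <boot> elements.
boot_devices = [boot.get('dev') for boot in root.findall('.//boot')]

# An attribute predicate selects the element first, then .get reads another attribute.
os_type = root.find(".//type[@arch='xintel']")
machine = os_type.get('machine')

print(uuid_text, boot_devices, machine)
```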

2012-09-05 21:35:42

Others can tell you how to do the parsing with the Python standard library. I recommend my own small library, which makes this a very simple process.

>>> obj = xml2obj.xml2obj("""<domain type='kmc' id='007'> 
... <name>virtual bug</name> 
... <uuid>66523dfdf555dfd</uuid> 
... <os> 
... <type arch='xintel' machine='ubuntu'>hvm</type> 
... <boot dev='hd'/> 
... <boot dev='cdrom'/> 
... </os> 
... <memory unit='KiB'>524288</memory> 
... <currentMemory unit='KiB'>270336</currentMemory> 
... <vcpu placement='static'>10</vcpu> 
... </domain>""") 
>>> obj.uuid 
u'66523dfdf555dfd'

http://code.activestate.com/recipes/534109-xml-to-python-data-structure/

2012-09-05 21:38:09

XML

<data> 
    <items> 
     <item name="item1">item1</item> 
     <item name="item2">item2</item> 
     <item name="item3">item3</item> 
     <item name="item4">item4</item> 
    </items> 
</data>

Python:

from xml.dom import minidom
xmldoc = minidom.parse('items.xml')
itemlist = xmldoc.getElementsByTagName('item')
print("Len : ", len(itemlist))
# Read the attribute and text of the first <item>
print("Attribute Name : ", itemlist[0].attributes['name'].value)
print("Text : ", itemlist[0].firstChild.nodeValue)
# Then do the same for every <item>
for s in itemlist:
    print("Attribute Name : ", s.attributes['name'].value)
    print("Text : ", s.firstChild.nodeValue)
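For comparison, the same loop can be sketched with the standard library's `xml.etree.ElementTree`. This is an alternative to the minidom code above, not part of the original answer; the inline string (a shortened copy of the items document) and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the items document (assumption: two items are enough here).
ITEMS_XML = """<data>
    <items>
        <item name="item1">item1</item>
        <item name="item2">item2</item>
    </items>
</data>"""

root = ET.fromstring(ITEMS_XML)

# iter('item') walks the whole tree and yields every <item> element in document order.
pairs = [(item.get('name'), item.text) for item in root.iter('item')]

for name, text in pairs:
    print("Attribute Name :", name, "| Text :", text)
```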

2013-07-12 19:54:27

The XML in the question has no closing tag, so parsing it gives:

etree parse error: Premature end of data in tag

The correct XML is:

<domain type='kmc' id='007'> 
    <name>virtual bug</name> 
    <uuid>66523dfdf555dfd</uuid> 
    <os> 
    <type arch='xintel' machine='ubuntu'>hvm</type> 
    <boot dev='hd'/> 
    <boot dev='cdrom'/> 
    </os> 
    <memory unit='KiB'>524288</memory> 
    <currentMemory unit='KiB'>270336</currentMemory> 
    <vcpu placement='static'>10</vcpu> 
</domain>
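The difference between the truncated and corrected documents can be checked with a small sketch like the one below. It uses the standard library's `xml.etree.ElementTree`, whose error type and message differ from the lxml message quoted above; the helper name `try_parse` and the shortened strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two documents (assumption, for brevity).
TRUNCATED = "<domain type='kmc' id='007'><uuid>66523dfdf555dfd</uuid>"
FIXED = TRUNCATED + "</domain>"


def try_parse(xml_text):
    """Return the root element, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        print("parse failed:", exc)
        return None


broken_root = try_parse(TRUNCATED)   # missing </domain>, so this fails
fixed_root = try_parse(FIXED)        # well-formed, so this succeeds
fixed_uuid = fixed_root.findtext('uuid')
print(fixed_uuid)
```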

2017-12-21 06:40:37
