2013-01-04 48 views
0

Python XML parsing - multisections

I have an XML file like the following:

<item> 
    <global> 

    <option id="123b25-1323-2f"> 
     <name>Bla</name> 
     <number></number> 
    </option> 
    <option id="aeb12f-91b3-57"> 
     <name>Foo</name> 
     <number>92309</number> 
    </option> 

    <section id="aeee72-0965-66"> 
     <name>alb</name> 
     <number></number> 
    </section> 
    <section id="928374-11b3-51"> 
     <name>oof</name> 
     <number>92309</number> 
    </section> 

    </global> 
</item> 

What is the best way to build a dictionary of, for example, the options and the sections, using Python 2.7 and a suitable module?
Example code:

root = XMLTree(xml) # xml is a file or string 
global_ = root.getSubsection('global') # `global` is a reserved word in Python 
options = global_.getItems('option') 
sections = global_.getItems('section') 

print options 

I would like output like this:

=> {'id-123b25-1323-2f': {'name': 'Bla', 'number': ''}, 'id-aeb12f-91b3-57': {'name': 'Foo', 'number': '92309'}} 

Answers

1

ElementTree is a very suitable standard-library module. Here is a suggestion (Python 2.7):

from xml.etree import ElementTree as ET 

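def get_items(elements): 
    # child.text is None for an empty element such as <number></number>, 
    # so fall back to "" to match the output shown below. 
    return {elem.get("id"): dict((child.tag, child.text or "") for child in elem) 
            for elem in elements} 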

tree = ET.parse("item.xml") 
options = tree.findall(".//option") 
sections = tree.findall(".//section") 

print "options:" 
print get_items(options) 
print "sections:" 
print get_items(sections) 

Output:

options: 
{'aeb12f-91b3-57': {'name': 'Foo', 'number': '92309'}, '123b25-1323-2f': {'name': 'Bla', 'number': ''}} 
sections: 
{'928374-11b3-51': {'name': 'oof', 'number': '92309'}, 'aeee72-0965-66': {'name': 'alb', 'number': ''}} 
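
The output asked for in the question prefixes each key with `id-` and only looks inside `<global>`. A minimal sketch of that variant follows, assuming the XML arrives as a string; the `prefix` parameter and the name `global_section` are illustrative choices, not part of the original answer.

```python
from xml.etree import ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<item>
    <global>
        <option id="123b25-1323-2f"><name>Bla</name><number></number></option>
        <option id="aeb12f-91b3-57"><name>Foo</name><number>92309</number></option>
        <section id="aeee72-0965-66"><name>alb</name><number></number></section>
        <section id="928374-11b3-51"><name>oof</name><number>92309</number></section>
    </global>
</item>"""


def get_items(parent, tag, prefix="id-"):
    """Map prefix + id to {child tag: child text} for each direct <tag> child."""
    return dict(
        (prefix + elem.get("id"),
         # child.text is None for an empty element, so fall back to "".
         dict((child.tag, child.text or "") for child in elem))
        for elem in parent.findall(tag)
    )


root = ET.fromstring(XML)  # for a file, ET.parse(path).getroot() gives the root
global_section = root.find("global")
options = get_items(global_section, "option")
sections = get_items(global_section, "section")
print(options)
```

Passing `prefix=""` gives the same un-prefixed keys as the answer above.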

Looks exactly like what I'm looking for. I can't try it right now, but thanks in advance. – HappyHacking

1

You can use xml.dom.minidom to parse the XML string and extract the elements to build the dictionary. Here is an example using minidom:

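from xml.dom.minidom import parseString 
dom = parseString(data) # data is the XML document as a string 
def getItems(nodes): 
    """Map each node's id to a dict of its child element names and text.""" 
    # firstChild is None for an empty element such as <number></number>, 
    # so fall back to u'' instead of raising AttributeError. 
    return {node.getAttribute('id'): 
            dict((e.nodeName, e.firstChild.data if e.firstChild else u'') 
                 for e in node.childNodes if e.nodeType == dom.ELEMENT_NODE) 
            for node in nodes} 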

options = dom.getElementsByTagName('option') 
sections = dom.getElementsByTagName('section') 
getItems(options) 
{u'aeb12f-91b3-57': {u'name': u'Foo', u'number': u'92309'}, u'123b25-1323-2f': {u'name': u'Bla', u'number': u''}} 
getItems(sections) 
{u'928374-11b3-51': {u'name': u'oof', u'number': u'92309'}, u'aeee72-0965-66': {u'name': u'alb', u'number': u''}} 
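
Reading `firstChild.data` relies on the first child of each element being a text node. As a hedged alternative, the sketch below joins all direct text and CDATA nodes instead, which also yields an empty string for an element with no children. The helper name `text_of` and the shortened sample string are assumptions made for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def text_of(element):
    """Concatenate the direct text/CDATA children; '' when there are none."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


def get_items(nodes):
    """Map each node's id attribute to {child element name: text}."""
    return dict(
        (node.getAttribute("id"),
         dict((e.nodeName, text_of(e)) for e in node.childNodes
              if e.nodeType == Node.ELEMENT_NODE))
        for node in nodes
    )


# Shortened sample based on the first <option> in the question.
SAMPLE = ('<global><option id="123b25-1323-2f">'
          '<name>Bla</name><number></number></option></global>')
dom = parseString(SAMPLE)
result = get_items(dom.getElementsByTagName("option"))
print(result)
```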
1
import lxml.etree as et 

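doc = et.fromstring(xml)  # xml is the XML document as a string 

def getItems(doc, name): 
    d = {} 
    for elem in doc.xpath('.//{0}'.format(name)): 
        attr = elem.get('id') 
        # i.text is None for an empty element such as <number></number>, 
        # so fall back to '' to match the output shown below. 
        d[attr] = dict((i.tag, i.text or '') for i in elem.xpath('.//*')) 
    return d 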
print getItems(doc,'option') 
print getItems(doc,'section') 

Output:

{'aeb12f-91b3-57': {'name': 'Foo', 'number': '92309'}, '123b25-1323-2f': {'name': 'Bla', 'number': ''}} 
{'928374-11b3-51': {'name': 'oof', 'number': '92309'}, 'aeee72-0965-66': {'name': 'alb', 'number': ''}}