How do I parse this XML string in Python?

My XML string:

xmlData = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> 
      <Cancelled>false</Cancelled> 
      <MessageID>00000000-0000-0000-0000-000000000000</MessageID> 
      <Queued>false</Queued> 
      <SMSError>NoError</SMSError> 
      <SMSIncomingMessages i:nil="true"/> 
      <Sent>false</Sent> 
      <SentDateTime>0001-01-01T00:00:00</SentDateTime> 
      </SMSResponse>"""

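I am trying to parse it and get the values of the tags (Cancelled, MessageID, SMSError, etc.) using Python's ElementTree library. So far I have tried things like:

import xml.etree.ElementTree as ET

root = ET.fromstring(xmlData)
print(root.find('Sent'))  # gives None
for child in root:
    print(child.find('MessageId'))  # also gives None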

However, I am able to print the tags with:

for child in root:
    print(child.tag)
    # child.tag for the Cancelled tag is {http://example.com}Cancelled

and their respective values with:

for child in root:
    print(child.text)

How can I get something like this:

print(child.Queued)  # would print false

Like in PHP, where we can access them through the root:

$xml = simplexml_load_string($data); 
$status = $xml->SMSError;

2013-01-04 Hussain

Your document has a namespace on it, so you need to include the namespace when searching:

root = ET.fromstring(xmlData)
print(root.find('{http://example.com}Sent'))
print(root.find('{http://example.com}MessageID'))

Output:

<Element '{http://example.com}Sent' at 0x1043e0690> 
<Element '{http://example.com}MessageID' at 0x1043e0350>

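The find() and findall() methods also take a namespace map; you can search with any prefix you like, and the prefix is looked up in that map, which saves typing:

nsmap = {'n': 'http://example.com'}
print(root.find('n:Sent', namespaces=nsmap))
print(root.find('n:MessageID', namespaces=nsmap))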

2013-01-04 09:14:13

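So basically I have to specify "{http://example.com}" every time I want to access a tag's text? – Hussain

@HussainTamboli: 'find' and 'findall' also take a 'namespaces=mapping' argument, but that does not seem to help when there is a default namespace. 'lxml' handles all of this much better. –

See @eclaird's answer. I think you did the same thing. +1 – Hussain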
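
On the default-namespace point raised in the comments above: on Python 3.8 or newer, the standard xml.etree.ElementTree module appears to cover this case without writing the URI in every path. The sketch below is only an illustration under that version assumption; xml_data is a shortened, hypothetical copy of the document from the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (assumption, for illustration only).
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
</SMSResponse>"""

root = ET.fromstring(xml_data)

# Python 3.8+: an empty-string prefix in the map is applied to
# unprefixed tag names in the search path.
ns = {'': 'http://example.com'}
queued = root.find('Queued', namespaces=ns)

# Python 3.8+: '{*}' matches the tag name in any namespace.
sms_error = root.find('{*}SMSError')

print(queued.text)
print(sms_error.text)
```

On older Python versions neither form is available, so the explicit '{http://example.com}' prefix or a named prefix in the map is still needed there.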

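You can build a dictionary and read the values straight out of it:

tree = ET.fromstring(xmlData)

root = {}

for child in tree:
    # child.tag looks like '{http://example.com}Queued'; keep the part after '}'
    root[child.tag.split("}")[1]] = child.text

print(root["Queued"])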
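
A slightly more defensive variant of the same idea, as a rough sketch: it also copes with tags that have no namespace, and it wraps the dictionary in types.SimpleNamespace to get attribute access similar to the PHP example in the question. The helper name local_name and the shortened xml_data are assumptions made for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace

# Shortened copy of the question's document (assumption, for illustration only).
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
  <SMSIncomingMessages i:nil="true"/>
</SMSResponse>"""


def local_name(tag):
    """Drop a leading '{namespace}' from an ElementTree tag, if there is one."""
    return tag.rpartition('}')[2]


root = ET.fromstring(xml_data)

# Map each child's local tag name to its text (None for empty elements).
values = {local_name(child.tag): child.text for child in root}

# Attribute-style access, roughly like $xml->SMSError in PHP.
response = SimpleNamespace(**values)

print(values['Queued'])
print(response.SMSError)
```

This flattens only the direct children of the root, and repeated tag names would overwrite each other, so it suits small flat responses like the one in the question.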

2013-01-04 09:05:53 ATOzTOA

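Hi, see my edit. "//child.tag for the tag Cancelled is - {http://example.com}Cancelled", so it is hard to match it against "Cancelled". Is there a better way? – Hussain

Updated the answer, try it now... – ATOzTOA

Hey. It works, but it is just a workaround. How can I access a tag's text in such a way that the tag is the key and the text is the value? – Hussain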

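If you are set on the Python standard XML library, you could use something like this:

root = ET.fromstring(xmlData)
namespace = 'http://example.com'

def query(tree, nodename):
    # Builds '{http://example.com}nodename' and searches the direct children
    return tree.find('{{{ex}}}{nodename}'.format(ex=namespace, nodename=nodename))

queued = query(root, 'Queued')
print(queued.text)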

2013-01-04 09:22:38 tuomur

This looks good. – Hussain

With lxml.etree:

In [8]: import lxml.etree as et 

In [9]: doc=et.fromstring(xmlData) 

In [10]: ns={'n':'http://example.com'} 

In [11]: doc.xpath('n:Queued/text()',namespaces=ns) 
Out[11]: ['false']

With ElementTree you can do this:

import xml.etree.ElementTree as ET  
root=ET.fromstring(xmlData)  
ns={'n':'http://example.com'} 
root.find('n:Queued',namespaces=ns).text 
Out[13]: 'false'
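
If only the text is needed, ElementTree's findtext() takes the same namespaces map and can return a default when the tag is missing, instead of failing on .text of None. A minimal sketch, assuming a shortened, hypothetical copy of the question's document in xml_data:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (assumption, for illustration only).
xml_data = """<SMSResponse xmlns="http://example.com">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
</SMSResponse>"""

root = ET.fromstring(xml_data)
ns = {'n': 'http://example.com'}

# Text of the first matching child.
queued_text = root.findtext('n:Queued', namespaces=ns)

# A missing tag yields the supplied default rather than raising.
missing = root.findtext('n:NoSuchTag', default='', namespaces=ns)

# The XML stores booleans as the strings 'true' / 'false'.
is_queued = queued_text == 'true'

print(queued_text, repr(missing), is_queued)
```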

2013-01-04 09:35:52 root

Thanks. I was wondering how to do something like that in ElementTree. +1 – Hussain