2011-03-29 77 views

Python lxml (objectify): XPath trouble

I am trying to parse an XML document, using lxml objectify and XPath to extract data. Here is a snippet of the file:

<?xml version="1.0" encoding="UTF-8"?>
<Assets>
<asset name="Adham">
    <pos>
        <x>27913.769923</x>
        <y>5174.627773</y>
    </pos>
    <general>
        <description>Ba bla bla</description>
        <bar>(null)</bar>
    </general>
</asset>
<asset name="Adrian">
    <pos>
        <x>-179.477707</x>
        <y>5286.959359</y>
    </pos>
    <general>
        <commodities/>
        <description>test test test</description>
        <bar>more bla</bar>
    </general>
</asset>
</Assets>

I have the following method:

def getALLattributesX(self, _root):
    '''Uses getattributeX and walks through the attribute dict, assigning
    values as it goes. _root is the main document root.'''
    for k in self.attrib:
        self.getattributeX(_root, self.attribPaths[k], k)

...which calls this method:

def getattributeX(self, node, x_path, _attrib):
    '''Gets a value from an XML node indicated by an XPath
    and assigns it to the appropriate attribute. If the node does not
    exist, it assigns "error".
    '''

    print(node.xpath(x_path)[0].text)
    try:
        self.attrib[_attrib] = node.xpath(x_path)
    except KeyError:
        self.misload = True
    #except AttributeError:
        #self.attrib[attrib] = "error loading " + attrib
        #self.misload = True

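The print call is only there for testing. When I run the first method, it walks through the XML document and successfully stops at each asset element. I have a dictionary of the variables it should find and a complementary dictionary of the paths to use, defined as follows: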

from lxml import objectify


class tAssetList:

    alist = {}   # dict of assets, keyed by asset name
    tlist = []
    tree = None  # XML tree
    root = None  # root element

    def readXML(self, _filename):
        # Load file
        fileobject = open(_filename, "r")  # read-only
        self.tree = objectify.parse(fileobject)
        self.root = self.tree.getroot()

        for elem in self.root.asset:
            temp_asset = tAsset()
            a_name = elem.get("name")  # get name, which is the key for dict
            temp_asset.getALLattributesX(elem)
            self.alist[a_name] = temp_asset


class tAsset(obs.nxObject):
    def __init__(self):
        self.attrib = {"X_pos": None, "Y_pos": None}
        self.attribPaths = {"X_pos": '/pos/x', "Y_pos": '/pos/y'}

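However, the XPath does not seem to work when I call it on the node (which is an objectify XML node). It just gives [] if I assign the result directly, and if I try [0].text it raises an index-out-of-range error.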

What is going on here?

Answer


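/pos/x and /pos/y are absolute XPath expressions. An expression that starts with / is evaluated from the document root, not from the node on which xpath() is called, so these select no elements: the provided XML document has no top-level pos element.

Try the relative expressions instead: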

pos/x 

pos/y 
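
The difference can be seen in a minimal sketch, assuming lxml is installed and using a trimmed copy of the XML from the question (the variable names here are illustrative, not from the original code):

```python
from lxml import objectify

# Trimmed copy of the document from the question (only the <pos> data).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Assets>
  <asset name="Adham">
    <pos><x>27913.769923</x><y>5174.627773</y></pos>
  </asset>
  <asset name="Adrian">
    <pos><x>-179.477707</x><y>5286.959359</y></pos>
  </asset>
</Assets>"""

root = objectify.fromstring(XML)
first_asset = root.asset  # attribute access yields the first <asset> child

# Absolute path: evaluated from the document root, so nothing matches,
# even though xpath() is called on an <asset> element.
absolute = first_asset.xpath("/pos/x")

# Relative path: evaluated from the <asset> element itself.
relative = first_asset.xpath("pos/x")

# An absolute path that really starts at the root element matches
# every <x> in the document, not just the one under first_asset.
from_root = first_asset.xpath("/Assets/asset/pos/x")

print(absolute)
print(relative[0].text)
print(len(from_root))
```

Note that xpath() returns a list, so an expression that matches nothing yields an empty list rather than raising; indexing it with [0] is what triggers the IndexError described in the question.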

+1 for correctly distinguishing between absolute and relative expressions. – 2011-03-29 18:43:45


I thought it might have something to do with that, but I wasn't sure of the difference. It works great, thanks! – Biosci3c 2011-03-29 19:51:30


@Biosci3c: You're welcome. – 2011-03-29 21:07:22