2012-03-08 141 views

Parsing nested XML with lxml and Python

I am having trouble parsing XML when it is in this form:

<Cars> 
    <Car> 
     <Color>Blue</Color> 
     <Make>Ford</Make> 
     <Model>Mustang</Model> 
    </Car> 
    <Car> 
     <Color>Red</Color> 
     <Make>Chevy</Make> 
     <Model>Camaro</Model> 
    </Car> 
</Cars> 

I figured out how to parse first-level children like this:

<Car> 
    <Color>Blue</Color> 
    <Make>Chevy</Make> 
    <Model>Camaro</Model> 
</Car> 

With code like this:

import os
from lxml import etree

# localPath and file are defined elsewhere in the script
a = os.path.join(localPath, file)
element = etree.parse(a)
cars = element.xpath('//Root/Foo/Bar/Car/node()[text()]')
parsedCars = [{field.tag: field.text for field in cars} for action in cars]
print(parsedCars[0]['Make'])  # Chevy

How do I parse it when multiple <Car> tags are children of a <Cars> tag?

Answers

Try this:

import os
from lxml import etree

a = os.path.join(localPath, file)
element = etree.parse(a)
# Select every <Car> element, then query each one relative to itself
cars = element.xpath('//Root/Foo/Bar/Car')
for car in cars:
    colors = car.xpath('./Color')
    makes = car.xpath('./Make')
    models = car.xpath('./Model')
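
As a minimal sketch of the same per-car idea, the snippet below builds one dictionary per <Car>. It assumes the document root is <Cars>, as in the sample from the question, rather than the '//Root/Foo/Bar/Car' path used above; the inline XML string and the name parsed_cars are illustrative only.

```python
from lxml import etree

# Inline sample modelled on the XML in the question (an assumption;
# the real file is loaded with etree.parse in the original code).
xml = b"""<Cars>
  <Car><Color>Blue</Color><Make>Ford</Make><Model>Mustang</Model></Car>
  <Car><Color>Red</Color><Make>Chevy</Make><Model>Camaro</Model></Car>
</Cars>"""

root = etree.fromstring(xml)

# Iterating over an element yields its direct children, so each <Car>
# becomes a dict mapping child tag names to their text.
parsed_cars = [
    {field.tag: field.text for field in car}
    for car in root.xpath('//Cars/Car')
]

print(parsed_cars[1]['Make'])
```

Running the inner comprehension once per <Car> keeps the fields of different cars from overwriting each other, which is what happens when all fields are collected into a single flat list first.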

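When I run this code to find the color, I get an address rather than the actual value. For example, when looking up the color I get [<Element Color at 0x2a9f0f8>] – lodkkx 2012-03-08 13:39:06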

Those calls return element objects. To get the text, use the XPath './Color/text()' – Dikei 2012-03-08 13:45:33
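
A small sketch of the difference, using a hypothetical single <Car> built inline: selecting the element gives a list of Element objects, while appending text() gives a list of strings. The findtext call is one alternative from lxml's ElementTree-compatible API, not something the thread itself mentions.

```python
from lxml import etree

# Hypothetical one-car document, just for illustration.
car = etree.fromstring("<Car><Color>Blue</Color><Make>Ford</Make></Car>")

elements = car.xpath('./Color')       # list of Element objects
texts = car.xpath('./Color/text()')   # list of strings

print(elements[0].text)        # text read from the Element
print(texts[0])                # text returned directly by XPath
print(car.findtext('Color'))   # ElementTree-style shortcut
```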

Yes, I had actually figured it out already, but by using './Color/node()' instead. What is the difference between the two? They both give me the text. – lodkkx 2012-03-08 13:47:32
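
On the difference: in XPath, text() matches only text nodes, whereas node() matches any child node (text, elements, comments). For a <Color> that contains nothing but text, the two give the same result. A rough sketch with a contrived <Color> element (the comment and the <Shade> child are invented for illustration) shows where they diverge:

```python
from lxml import etree

# Plain case from the thread: only text inside <Color>.
plain = etree.fromstring("<Color>Blue</Color>")
print(plain.xpath('./text()') == plain.xpath('./node()'))

# Contrived case: text, a comment and a child element inside <Color>.
mixed = etree.fromstring("<Color>Blue<!-- paint --><Shade>Navy</Shade></Color>")

text_nodes = mixed.xpath('./text()')   # text nodes only
all_nodes = mixed.xpath('./node()')    # text, comment and element nodes

print(len(text_nodes), len(all_nodes))
```

If only the text is wanted, text() is the safer choice, since node() will start returning non-string objects as soon as the element gains children or comments.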