python
  • xml
  • lxml
  • 2013-04-12
    0

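    Removing an element with lxml does not work. I want to remove XML elements using lxml; the approach looks right to me, but nothing gets removed. This is my code: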

    import lxml.etree as le 
    f = open('Bird.rdf','r') 
    doc=le.parse(f) 
    for elem in doc.xpath("//*[local-name() = 'dc' and namespace-uri() = 'http://purl.org/dc/terms/']"): 
        parent=elem.getparent().remove(elem) 
    print(le.tostring(doc)) 
    

    Sample XML file:

    <rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"> 
    
         <wo:Class rdf:about="/nature/life/Bird#class"> 
            <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a 
             covering of feathers, and their front limbs are modified into wings. Some birds, such as 
             penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds 
             are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or 
             they will perish</dc:description> 
         </wo:Class> 
    </rdf:RDF>     
    

    Answer

    4

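    Your problem is that the local name is 'description', not 'dc' (which is only the namespace alias). You can pass your namespaces to the xpath function and write the XPath more directly as: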

    import lxml.etree as le 
    
    txt="""<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/" 
        xmlns:wo="http:/some/wo/namespace"> 
    
        <wo:Class rdf:about="/nature/life/Bird#class"> 
         <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a 
             covering of feathers, and their front limbs are modified into wings. Some birds, such as 
             penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds 
             are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or 
             they will perish</dc:description> 
        </wo:Class> 
    </rdf:RDF> 
    """ 
    
    namespaces = { 
        "rdf":"http://www.w3.org/2000/01/rdf-schema#", 
        "dc":"http://purl.org/dc/terms/", 
        "wo":"http:/some/wo/namespace" } 
    
    doc=le.fromstring(txt) 
    for elem in doc.xpath("//dc:description", namespaces=namespaces): 
        elem.getparent().remove(elem)  # remove() returns None, so there is nothing useful to assign
    print(le.tostring(doc)) 
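
    To see why the original expression matched nothing, it can help to run both versions of the local-name() test side by side. The following is only a minimal sketch: the document is a shortened stand-in for the sample file, the wo namespace URI is the same placeholder as above, and the variable names by_prefix and by_local_name are made up for illustration.

```python
import lxml.etree as le

# Shortened stand-in for the sample document; the wo namespace URI is a placeholder.
txt = """<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dc="http://purl.org/dc/terms/" xmlns:wo="http:/some/wo/namespace">
  <wo:Class rdf:about="/nature/life/Bird#class">
    <dc:description>Birds are a class of vertebrates.</dc:description>
  </wo:Class>
</rdf:RDF>"""

doc = le.fromstring(txt)

# The test from the question: 'dc' is the prefix, not the local name,
# and no element in this document has the local name 'dc'.
by_prefix = doc.xpath(
    "//*[local-name() = 'dc' and namespace-uri() = 'http://purl.org/dc/terms/']"
)

# The same test with the actual local name of the element.
by_local_name = doc.xpath(
    "//*[local-name() = 'description' and namespace-uri() = 'http://purl.org/dc/terms/']"
)

print(len(by_prefix), len(by_local_name))

# QName splits a qualified tag into its namespace and local name.
qname = le.QName(by_local_name[0])
print(qname.namespace, qname.localname)
```

    So the original local-name() approach also works once 'dc' is replaced with 'description'; passing a namespaces mapping is simply shorter to write and read.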
    
    +3

    Or 'xpath(..., namespaces=doc.getroot().nsmap)', to save typing – mata

    +0

    @mata I didn't know about nsmap, thanks for the tip! – tdelaney
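
    The nsmap suggestion from the comments could look roughly like the sketch below. It assumes the document is parsed into a tree (as in the question, where le.parse is used), so that getroot() is available. The shortened document and the names tree, ns and remaining are illustrative only. Filtering out a None key is a precaution: a default namespace shows up in nsmap under None, and depending on the lxml version XPath may reject an empty prefix.

```python
import io
import lxml.etree as le

# Shortened stand-in for Bird.rdf; the wo namespace URI is a placeholder.
txt = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dc="http://purl.org/dc/terms/" xmlns:wo="http:/some/wo/namespace">
  <wo:Class rdf:about="/nature/life/Bird#class">
    <dc:description>Birds are a class of vertebrates.</dc:description>
  </wo:Class>
</rdf:RDF>"""

tree = le.parse(io.BytesIO(txt))
root = tree.getroot()

# nsmap maps the prefixes visible on the root element to their namespace URIs.
# Drop a possible default namespace (key None) before handing the map to XPath.
ns = {prefix: uri for prefix, uri in root.nsmap.items() if prefix is not None}

for elem in tree.xpath("//dc:description", namespaces=ns):
    elem.getparent().remove(elem)

# Query again to confirm that nothing is left to match.
remaining = tree.xpath("//dc:description", namespaces=ns)
print(le.tostring(tree))
```

    Note that nsmap only contains the prefixes declared on (or inherited by) the element it is read from, so this shortcut relies on the namespaces being declared on the root element.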
