2015-06-12 58 views

How can I merge two different paths in an XML file? This is my XML file:

<File> 
     <Paths> 
       <Path> 
        <Node> 
         <NodeName>Initial_Node</NodeName> 
         <InnerNode> 
         <Signal>Test_sig</Signal> 
         <InnerNode> 
          <Signal>Test_sig_1</Signal> 
          <NodeRef>Ref0</NodeRef> 
         </InnerNode> 
         </InnerNode> 
        </Node> 
       </Path> 
       <Path> 
        <Node> 
         <NodeName>Name1</NodeName> 
         <InnerNode> 
         <Signal>Test_sig_0</Signal> 
         <InnerNode> 
          <Signal>Test_sig_2</Signal> 
          <NodeRef>Ref1</NodeRef> 
         </InnerNode> 
         </InnerNode> 
        </Node> 
       </Path> 
     </Paths> 
     <Paths> 
       <Path> 
        <Node> 
         <NodeRef>Ref0</NodeRef> 
         <InnerNode> 
         <Signal>Test_sig_3</Signal> 
         <InnerNode> 
          <Signal>Test_sig_4</Signal> 
          <NodeName>Final_Node</NodeName> 
         </InnerNode> 
         </InnerNode> 
        </Node> 
       </Path> 
     </Paths> 
    </File> 

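I am using lxml in Python. I want to match the <NodeRef> elements in the file above: where a nested <NodeRef> has the same text as the <NodeRef> at the top of another <Node>, the rest of that second path should be merged into the first in place of the reference, giving the following result: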

<File> 
     <Paths> 
       <Path> 
        <Node> 
         <NodeName>Initial_Node</NodeName> 
         <InnerNode> 
         <Signal>Test_sig</Signal> 
          <InnerNode> 
           <Signal>Test_sig_1</Signal> 
            <InnerNode> 
             <Signal>Test_sig_3</Signal> 
             <InnerNode> 
              <Signal>Test_sig_4</Signal> 
              <NodeName>Final_Node</NodeName> 
             </InnerNode> 
            </InnerNode> 
          </InnerNode> 
         </InnerNode> 
        </Node> 
       </Path> 
       <Path> 
        <Node> 
         <NodeName>Name1</NodeName> 
         <InnerNode> 
         <Signal>Test_sig_0</Signal> 
         <InnerNode> 
          <Signal>Test_sig_2</Signal> 
          <NodeRef>Ref1</NodeRef> 
         </InnerNode> 
         </InnerNode> 
        </Node> 
       </Path> 
     </Paths> 
    </File> 

Any help would be much appreciated.

Answer


There isn't much detail to go on here, but this at least gives the correct output:

from lxml import etree

# `xml` is assumed to hold the document from the question (str or bytes).
root = etree.fromstring(xml)

replace_set = {}
# Iterate over a list copy, because elements are removed inside the loop.
for node in list(root.iter("Node")):
    if node.find("NodeRef") is not None:
        # This <Node> has a direct <NodeRef> child, so it is the target of a
        # nested <NodeRef> elsewhere. Save its <InnerNode> under the
        # reference name, then remove the <Node> from the tree.
        ref = node.find("NodeRef").text
        replace_set[ref] = node.find("InnerNode")
        node.getparent().remove(node)

# Clean up: drop every <Paths> whose <Path> children are now all empty.
for paths in list(root.iter("Paths")):
    if all(len(path) == 0 for path in paths.findall("Path")):
        paths.getparent().remove(paths)

# Replace each remaining <NodeRef> with the <InnerNode> saved under its own
# text (not under the last `ref` seen in the first loop).
for node in list(root.iter("NodeRef")):
    if node.text in replace_set:
        node.getparent().replace(node, replace_set[node.text])

print(etree.tostring(root).decode())
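
If lxml is not available, the same idea can be sketched with the standard library's xml.etree.ElementTree. This is a variant, not the code above: ElementTree has no getparent(), so a parent map is built first. The names merge_refs and SAMPLE are hypothetical, and the sketch assumes every target <Node> sits inside a <Path> inside a <Paths>, as in the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's input (assumed layout: File/Paths/Path/Node).
SAMPLE = """<File>
 <Paths>
  <Path><Node>
   <NodeName>Initial_Node</NodeName>
   <InnerNode><Signal>Test_sig</Signal>
    <InnerNode><Signal>Test_sig_1</Signal><NodeRef>Ref0</NodeRef></InnerNode>
   </InnerNode>
  </Node></Path>
  <Path><Node>
   <NodeName>Name1</NodeName>
   <InnerNode><Signal>Test_sig_0</Signal>
    <InnerNode><Signal>Test_sig_2</Signal><NodeRef>Ref1</NodeRef></InnerNode>
   </InnerNode>
  </Node></Path>
 </Paths>
 <Paths>
  <Path><Node>
   <NodeRef>Ref0</NodeRef>
   <InnerNode><Signal>Test_sig_3</Signal>
    <InnerNode><Signal>Test_sig_4</Signal><NodeName>Final_Node</NodeName></InnerNode>
   </InnerNode>
  </Node></Path>
 </Paths>
</File>"""


def merge_refs(xml_text):
    """Splice each referenced <InnerNode> in place of the matching nested <NodeRef>."""
    root = ET.fromstring(xml_text)
    # ElementTree elements do not know their parent, so record it up front.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # Pass 1: collect <Node> elements whose direct child is a <NodeRef>.
    targets = {}
    for node in list(root.iter("Node")):
        ref = node.find("NodeRef")
        inner = node.find("InnerNode")
        if ref is None or inner is None:
            continue
        targets[ref.text] = inner
        # Detach the <Node>, then prune the <Path> and <Paths> it leaves empty.
        path = parent_of[node]
        path.remove(node)
        if len(path) == 0:
            paths = parent_of[path]
            paths.remove(path)
            if len(paths) == 0:
                parent_of[paths].remove(paths)

    # Pass 2: swap every remaining <NodeRef> that has a saved target.
    for ref in list(root.iter("NodeRef")):
        inner = targets.get(ref.text)
        if inner is None:
            continue  # unresolved reference (Ref1 in the sample) stays as-is
        parent = parent_of[ref]
        index = list(parent).index(ref)
        parent.remove(ref)
        parent.insert(index, inner)
    return root


if __name__ == "__main__":
    print(ET.tostring(merge_refs(SAMPLE), encoding="unicode"))
```

Calling merge_refs(SAMPLE) returns the root element of the merged tree. Unlike lxml, ElementTree does not move an element automatically when it is inserted elsewhere, which is why the sketch only reuses subtrees whose old parent has already been detached.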

Very clean solution! Thank you! – Alessandro
