2016-11-24 97 views
2

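I am new to XML parsing. I have read about the DOM and SAX parsers and tried a few sample implementations. However, I cannot parse the following XML data and read the "value" attribute of each tag.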

<?xml version="1.0" ?> 
<collection> 
<action value="submit"/> 
<protocol_version value="1"/> 
<reponse value="Success"/> 
<batch> 
    <sample> 
     <count value="1"/> 
     <count2 value="2"/> 
     <count3 value="3"/> 
    </sample> 
    <sample_2> 
     <date value="10/10/2010"/> 
     <page value="SampleData"/> 
     <track value="123123123"/> 
     <same value="1.00"/> 
     <data> 
      <first_name value="Jeffrey"/> 
      <SSID value="1231231231"/> 
      <last_name value="Chuckle"/> 
      <field1 value="123123123"/> 
      <field2 value="Sam E. Bonzella"/> 
      <field3 value="SOME VALUE"/> 
      <field4 value="SOME VALUE 2"/> 
      <field5 value="TEXT"/> 
      <field6 value="12312"/> 
     </data> 
    </sample_2> 
</batch> 
</collection> 

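Below is the sample code I tried. It works only with a lot of repeated code, and the resulting data is not organized. I also tried a JAXB parser, but could not get the "value" attributes.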

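import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class test { 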
public static void main(String[] args){ 

    try { 
     File inputFile = new File("staff.xml"); 
     DocumentBuilderFactory dbFactory 
       = DocumentBuilderFactory.newInstance(); 
     DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); 
     Document doc = dBuilder.parse(inputFile); 
     doc.getDocumentElement().normalize(); 
     System.out.println("Base :" 
       + doc.getDocumentElement().getNodeName()); 
     NodeList nList = doc.getElementsByTagName("action"); 
     for (int temp = 0; temp < nList.getLength(); temp++) { 
      Node nNode = nList.item(temp); 
      System.out.println("Element :" 
        + nNode.getNodeName()); 
      if (nNode.getNodeType() == Node.ELEMENT_NODE) { 
       Element eElement = (Element) nNode; 
       System.out.println("Action : " 
         + eElement.getAttribute("value")); 
      } 
     } 
     // Note: the XML above has no <transaction_count> element, so this loop prints nothing.
     nList = doc.getElementsByTagName("transaction_count"); 
     for (int temp = 0; temp < nList.getLength(); temp++) { 
      Node nNode = nList.item(temp); 
      System.out.println("Element :" 
        + nNode.getNodeName()); 
      if (nNode.getNodeType() == Node.ELEMENT_NODE) { 
       Element eElement = (Element) nNode; 
       System.out.println("transaction_count : " 
         + eElement.getAttribute("value")); 
      } 
     } 


    } catch (Exception e) { 
     e.printStackTrace(); 
    } 
} 
} 

Ideally, I would like to parse the data into an array or possibly a Map.

Answers

3

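getElementsByTagName(String name) is not useful in this case, because every tag name would have to be supplied. The XML above contains elements that fall into two categories: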

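  1. Elements with a value. If I understand the question correctly, the tag name and the value should be stored in a map.

  2. Elements without a value. They contain other elements, and their tag names should not be stored.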

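The elements can be parsed recursively. If an element has the attribute "value", it should be stored in the map. Otherwise, the element's child nodes should be checked.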

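import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Wrapper class added so the snippet compiles; the name is arbitrary.
public class test { 

public static void main(String[] args) { 

    // LinkedHashMap keeps the entries in insertion (document) order.
    Map<String, String> map = new LinkedHashMap<>(); 

    try { 
     File fXmlFile = new File("staff.xml"); 
     DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance(); 
     DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); 
     Document doc = dBuilder.parse(fXmlFile); 
     doc.getDocumentElement().normalize(); 

     // Start the recursive search at the <collection> root element.
     NodeList collectionNodeList = doc.getElementsByTagName("collection"); 
     Element collectionElement = (Element) collectionNodeList.item(0); 
     findElementsWithValues(map, collectionElement); 

    } catch (Exception e) { 
     e.printStackTrace(); 
    } 

    System.out.println("Found values: " + map.size()); 
    System.out.println(map); 
} 

// Stores tagName -> value for every element that has a "value" attribute;
// elements without one are treated as containers and searched recursively.
private static void findElementsWithValues(Map<String, String> map, Element rootElement) { 
    NodeList childNodes = rootElement.getChildNodes(); 
    for (int i = 0; i < childNodes.getLength(); i++) { 
     Node node = childNodes.item(i); 
     if (node.getNodeType() == Node.ELEMENT_NODE) { 
      Element element = (Element) node; 
      // hasAttribute also accepts value="", which an isEmpty() check would skip.
      if (element.hasAttribute("value")) { 
       // A repeated tag name overwrites the earlier entry.
       map.put(element.getTagName(), element.getAttribute("value")); 
      } else { 
       findElementsWithValues(map, element); 
      } 
     } 
    } 
} 
} 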

Output (after correcting the XML file above so that it is parsable)

Found values: 19 
{action=submit, protocol_version=1, reponse=Success, count=1, count2=2, count3=3, date=10/10/2010, page=SampleData, track=123123123, same=1.00, first_name=Jeffrey, SSID=1231231231, last_name=Chuckle, field1=123123123, field2=Sam E. Bonzella, field3=SOME VALUE, field4=SOME VALUE 2, field5=TEXT, field6=12312}
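As a possible alternative to the recursive walk, the same flat map can probably be built with the JDK's built-in XPath support, using the expression //*[@value] to select every element that carries a "value" attribute. This is only a sketch under assumptions: the class name XPathValueDemo, the method collectValues, and the inline XML string are hypothetical and not part of the original answer. Unlike the recursive version, it would also pick up elements nested inside an element that itself has a "value" attribute.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are illustrative only.
class XPathValueDemo {

    // Returns tagName -> value for every element that has a "value" attribute.
    static Map<String, String> collectValues(String xml) {
        Map<String, String> map = new LinkedHashMap<>();
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // "//*[@value]" matches any element, at any depth, with a "value" attribute.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//*[@value]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                // As with the flat map above, a repeated tag name overwrites the earlier entry.
                map.put(element.getTagName(), element.getAttribute("value"));
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return map;
    }

    public static void main(String[] args) {
        // Small inline sample (an assumption) shaped like the XML in the question.
        String xml = "<collection><action value=\"submit\"/>"
                + "<batch><sample><count value=\"1\"/></sample></batch></collection>";
        System.out.println(collectValues(xml));
    }
}
```

To read from a file instead of a string, the InputSource could be created from a file path or stream; the rest of the method would stay the same.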