2011-06-27 73 views
0

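Java SAX parsing error

I asked a few questions here earlier today, and with the help of the SO community I now understand the basics of SAX and how to traverse a directory structure correctly.

My program can now reach the XML files I'm looking for, but I'm not sure whether this error means there is a bug in the code of my SaxHandler class. Could someone take a look and give me some feedback?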

XML file

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"> 
     <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" /> 
     <Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" /> 
     <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" /> 
     <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml" /> 
     <Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" /> 
     <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" /> 
     <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" /> 
     <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" /> 
     <Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image3.png" /> 
     <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" /> 
     <Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.jpeg" /> 
     <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" /> 
     <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.jpeg" /> 
</Relationships> 

Java code

import java.io.*; 

import javax.xml.parsers.SAXParser; 
import javax.xml.parsers.SAXParserFactory; 

import org.xml.sax.*; 
import org.xml.sax.helpers.*; 

public class XMLParser 
{ 
    public static void main(String[] args) throws IOException 
    { 
     traverse(new File("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items")); 
    } 

    private static final class SaxHandler extends DefaultHandler 
    { 
     // invoked when document-parsing is started: 
     public void startDocument() throws SAXException 
     { 
      System.out.println("Document processing started"); 
     } 

     // notifies about finish of parsing: 
     public void endDocument() throws SAXException 
     { 
      System.out.println("Document processing finished"); 
     } 

     // called when the parser enters element 'qName':
     public void startElement(String uri, String localName, 
       String qName, Attributes attrs) throws SAXException 
     { 
      if(qName.equalsIgnoreCase("Relationship")) 
      { 
       String val = attrs.getValue("Target"); 
       if(val != null) 
       { 
        if (val.contains("image")) 
        { 
         String id = attrs.getValue("Id"); 
         System.out.println("Id: " + id + "& Target: " + val); 
        } 
       } 
      } 
      else if(qName.equalsIgnoreCase("Relationships")) 
      { 
       //do nothing 
      } 
      else 
      { 
       throw new IllegalArgumentException("Element '" + 
         qName + "' is not allowed here"); 
      } 
     } 

     // we leave element 'qName' without any actions: 
     public void endElement(String uri, String localName, String qName) 
     throws SAXException 
     { 
       // do nothing; 
     } 
    } 

    private static void traverse(File directory) 
    { 
     //Get all files in directory 
     File[] files = directory.listFiles(); 
     for (File file : files) 
     { 
      if (file.isDirectory()) 
      { 
       //It's a directory so (recursively) traverse it 
       traverse(file); 
      } 
      else if (file.getName().equals("document.xml.rels")) 
      { 
       try 
       { 
        System.out.println("5"); 
        // creates and returns new instance of SAX-implementation: 
        SAXParserFactory factory = SAXParserFactory.newInstance(); 

        // create SAX-parser... 
        SAXParser parser = factory.newSAXParser(); 

        // .. define our handler: 
        SaxHandler handler = new SaxHandler(); 

        // and parse: 
        parser.parse(file.getAbsolutePath(), handler);  
       } 
       catch (Exception ex) 
       { 
        ex.printStackTrace(System.out); 
       } 
      } 
     } 
    } 
} 

Error

5 
Document processing started 
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here 
    at XMLParser$SaxHandler.startElement(XMLParser.java:57) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at XMLParser.traverse(XMLParser.java:96) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.main(XMLParser.java:13) 
5 
Document processing started 
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here 
    at XMLParser$SaxHandler.startElement(XMLParser.java:57) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at XMLParser.traverse(XMLParser.java:96) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.main(XMLParser.java:13) 
5 
Document processing started 
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here 
    at XMLParser$SaxHandler.startElement(XMLParser.java:57) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at XMLParser.traverse(XMLParser.java:96) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.main(XMLParser.java:13) 
5 
Document processing started 
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here 
    at XMLParser$SaxHandler.startElement(XMLParser.java:57) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at javax.xml.parsers.SAXParser.parse(Unknown Source) 
    at XMLParser.traverse(XMLParser.java:96) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.traverse(XMLParser.java:79) 
    at XMLParser.main(XMLParser.java:13) 

Working output, thanks to Joerg M.

Document processing started 
Id: rId13& Target: media/image3.jpeg 
Id: rId18& Target: media/image8.jpeg 
Id: rId26& Target: media/image16.jpeg 
Id: rId39& Target: media/image29.jpeg 
Id: rId21& Target: media/image11.jpeg 
Id: rId34& Target: media/image24.jpeg 
Id: rId7& Target: media/image1.jpeg 
Id: rId12& Target: media/image2.jpeg 
Id: rId17& Target: media/image7.jpeg 
Id: rId25& Target: media/image15.jpeg 
Id: rId33& Target: media/image23.jpeg 
Id: rId38& Target: media/image28.jpeg 
Id: rId16& Target: media/image6.jpeg 
Id: rId20& Target: media/image10.jpeg 
Id: rId29& Target: media/image19.jpeg 
Id: rId24& Target: media/image14.jpeg 
Id: rId32& Target: media/image22.jpeg 
Id: rId37& Target: media/image27.jpeg 
Id: rId15& Target: media/image5.jpeg 
Id: rId23& Target: media/image13.jpeg 
Id: rId28& Target: media/image18.jpeg 
Id: rId36& Target: media/image26.jpeg 
Id: rId19& Target: media/image9.jpeg 
Id: rId31& Target: media/image21.jpeg 
Id: rId14& Target: media/image4.jpeg 
Id: rId22& Target: media/image12.jpeg 
Id: rId27& Target: media/image17.jpeg 
Id: rId30& Target: media/image20.jpeg 
Id: rId35& Target: media/image25.jpeg 
Document processing finished 
Document processing started 
Id: rId11& Target: media/image2.png 
Id: rId9& Target: media/image1.jpeg 
Document processing finished 
Document processing started 
Id: rId11& Target: media/image3.png 
Id: rId10& Target: media/image2.jpeg 
Id: rId9& Target: media/image1.jpeg 
Document processing finished 
Document processing started 
Id: rId8& Target: media/image2.jpeg 
Id: rId13& Target: media/image5.jpeg 
Id: rId7& Target: media/image1.jpeg 
Id: rId12& Target: media/image4.jpeg 
Id: rId17& Target: media/image8.png 
Id: rId15& Target: media/image7.jpeg 
Id: rId9& Target: media/image3.jpeg 
Id: rId14& Target: media/image6.jpeg 
Document processing finished 

Thanks in advance for your help!

+0

What does your XML look like? Offhand, I'd guess you have two top-level elements.

+1

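Your code is probably fine. The problem seems to be with the source file: the document does not appear to conform to the schema. Just remove the DTD definition and try reading the file with SAX again; it should work (which would confirm my assumption).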

+0

@Ted Hopp, @doc_180 Sorry, I forgot to include the XML document the first time. It is in the original post now.

Answers

1

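On every start element you check whether its name equals "Relationship", but the first element is "Relationships", so the check fails and the exception is thrown. That is exactly the behaviour you have implemented so far ;)

This is the piece of code I'm referring to: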

public void startElement(String uri, String localName, 
       String qName, Attributes attrs) throws SAXException 
     { 
      if(localName.equalsIgnoreCase("Relationship")) 
      { 
       ..... 
      } 
      else 
      { 
       throw new IllegalArgumentException("Element '" + 
         qName + "' is not allowed here"); 
      } 
     } 

One possible solution (definitely not good style, but it solves the problem you are facing):

public void startElement(String uri, String localName, 
       String qName, Attributes attrs) throws SAXException 
     { 
      if(qName.equalsIgnoreCase("Relationship")) 
      { 
       ..... 
      } 
      else if (qName.equalsIgnoreCase("Relationships")) { 
        // do nothing 
      } 
      else 
      { 
       throw new IllegalArgumentException("Element '" + 
         qName + "' is not allowed here"); 
      } 
     } 
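
As a rough sketch of a less brittle variant of the handler above, the callback can simply skip every element it is not interested in instead of listing each allowed name and throwing on the rest. The class and method names here (ImageRelationships, ImageHandler, imageTargets) are made up for illustration, and the XML is a shortened inline sample rather than a real document.xml.rels file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ImageRelationships
{
    // Hypothetical handler: collects "Id -> Target" for every
    // Relationship element whose Target contains "image".
    static final class ImageHandler extends DefaultHandler
    {
        final List<String> images = new ArrayList<String>();

        @Override
        public void startElement(String uri, String localName,
                String qName, Attributes attrs)
        {
            // Any other element, including the root, is skipped.
            if (!"Relationship".equals(qName))
            {
                return;
            }
            String target = attrs.getValue("Target");
            if (target != null && target.contains("image"))
            {
                images.add(attrs.getValue("Id") + " -> " + target);
            }
        }
    }

    // Parses the given XML text with a default (non-namespace-aware) parser.
    static List<String> imageTargets(String xml) throws Exception
    {
        ImageHandler handler = new ImageHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.images;
    }

    public static void main(String[] args) throws Exception
    {
        String xml = "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
                + "<Relationship Id=\"rId8\" Target=\"footer1.xml\"/>"
                + "<Relationship Id=\"rId9\" Target=\"media/image1.jpeg\"/>"
                + "</Relationships>";
        System.out.println(imageTargets(xml));
    }
}
```

Whether silently skipping unknown elements is acceptable depends on how strictly the input should be validated; if unexpected elements really are errors, the throwing version is the stricter choice.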
+0

@Joerg M. Ah, yes, I can see where the error comes from now. But does that mean localName is not 'Relationship'? If so, how should I go about accessing 'Relationship'?

+0

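Look, this method is a callback handler. It is fired at every start tag in the XML, and the root element ("Relationships" in this example) is a start tag too. You can easily fix this by adding an else if that handles the case where localName equals "Relationships". – fyr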

+0

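@Joerg M. Well, I followed your advice and added a new else-if statement to handle the 'Relationships' tag. However, I'm still getting the same error. I've included the changes I made in the original code.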
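
One thing that may explain why a check on localName keeps failing: a SAXParserFactory is not namespace-aware by default, and in that mode SAX allows the parser to pass an empty string as localName, so only qName reliably carries the element name. The sketch below (NameDemo and names are illustrative names, not part of the original code) records both values for each start tag with namespace awareness switched off and on, so the difference can be inspected directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NameDemo
{
    // Returns "localName|qName" for every start tag in the XML text.
    static List<String> names(String xml, boolean namespaceAware) throws Exception
    {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        SAXParser parser = factory.newSAXParser();

        final List<String> seen = new ArrayList<String>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler()
        {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attrs)
            {
                seen.add(localName + "|" + qName);
            }
        });
        return seen;
    }

    public static void main(String[] args) throws Exception
    {
        String xml = "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
                + "<Relationship Id=\"rId9\" Target=\"media/image1.jpeg\"/>"
                + "</Relationships>";
        // Default factory behaviour: namespace processing off.
        System.out.println(names(xml, false));
        // Namespace processing on: localName is filled in.
        System.out.println(names(xml, true));
    }
}
```

If the first line shows empty local names, then either comparing against qName (as the updated question code does) or calling setNamespaceAware(true) on the factory should make the comparison behave as expected.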