2009-07-15 49 views
0

Parsing an XML file with Java when the file path contains spaces

I have files on my file system, on Windows XP, and I want to parse them with Java (JRE 1.6).

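The problem is that I don't understand how Java and Xerces work together when the file path contains spaces.

If the file has no spaces in its path, everything works fine.

If there are spaces, I can get this kind of trouble, even when I call the parser with a FileInputStream instance: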

java.net.UnknownHostException: . 
    at java.net.PlainSocketImpl.connect(Unknown Source) 
    at java.net.Socket.connect(Unknown Source) 
    at java.net.Socket.connect(Unknown Source) 
    at sun.net.NetworkClient.doConnect(Unknown Source) 
    at sun.net.NetworkClient.openServer(Unknown Source) 
    at sun.net.ftp.FtpClient.openServer(Unknown Source) 
    at sun.net.ftp.FtpClient.openServer(Unknown Source) 
    at sun.net.www.protocol.ftp.FtpURLConnection.connect(Unknown Source) 
    at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) 
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source) 

(Why is sun.net.ftp.FtpClient.openServer involved at all?)

Or else this kind of trouble:

java.net.MalformedURLException: unknown protocol: d 
    at java.net.URL.<init>(Unknown Source) 
    at java.net.URL.<init>(Unknown Source) 
    at java.net.URL.<init>(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source) 
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) 
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) 

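(It says "unknown protocol: d" because, I guess, the file is on the D drive.)

Does anyone know why this happens and how to work around it? I tried supplying my own EntityResolver, but my logs tell me it is not even called before the crash.


Edit:

Here is the code that invokes the parser.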

public Document fileToDom(File file) throws ProcessException {
    Document doc = null;
    try {
        DocumentBuilderFactory db = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = db.newDocumentBuilder();
        if (this.errorHandler != null) {
            builder.setErrorHandler(this.errorHandler);
        } else {
            builder.setErrorHandler(new DefaultHandler());
        }
        FileInputStream test = new FileInputStream(file);
        doc = builder.parse(test);
        ...
    } catch (Exception e) {...}
    ...
}

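For now, I have found myself forced to strip the DOCTYPE before parsing, which eliminates all the problems but also the DTD validation. Not such a great solution.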
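A less invasive alternative to stripping the DOCTYPE by hand might be to ask the parser not to fetch the external DTD at all. The following is a minimal sketch, assuming the JRE's bundled Xerces honours the `load-external-dtd` feature. The class name `SkipExternalDtdSketch`, the method `rootNameWithoutDtd`, and the file name `missing.dtd` are hypothetical, and DTD validation is still given up with this approach.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SkipExternalDtdSketch {
    // Xerces feature: whether a non-validating parser loads the external DTD.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the XML text without trying to fetch the DTD named in its DOCTYPE
    // and returns the name of the root element.
    static String rootNameWithoutDtd(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // The DTD referenced here does not exist; the parser should not look for it.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE root SYSTEM \"missing.dtd\">\n"
                + "<root/>";
        System.out.println(rootNameWithoutDtd(xml));
    }
}
```

The DOCTYPE stays in the document, so nothing has to be edited before parsing, but entities or default attributes declared in the external DTD would not be available.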

+0

Can you show the code you are using to invoke the XML parser? You should consider using URI paths. – notnoop 2009-07-15 15:24:28

Answers

1

Try this URI style:

file:///d:/folder/folder%20with%20space/file.xml
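Rather than escaping the spaces by hand, the JDK can produce such a URI. This is a small sketch using `java.net.URI`; the class name `FileUriSketch` and the helper `toFileUri` are made up for illustration. The result uses the `file:/` form with a single slash, which denotes the same location as the `file:///` form.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class FileUriSketch {
    // Builds a file: URI from an absolute, forward-slash path. The
    // multi-argument URI constructor percent-encodes illegal characters
    // such as spaces.
    static String toFileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid path: " + absolutePath, e);
        }
    }

    public static void main(String[] args) {
        // prints file:/d:/folder/folder%20with%20space/file.xml
        System.out.println(toFileUri("/d:/folder/folder with space/file.xml"));

        // For a java.io.File, toURI() applies the same escaping; the exact
        // output depends on the platform and the current directory.
        System.out.println(new File("folder with space/file.xml").toURI());
    }
}
```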

2

Are you just using DocumentBuilder.parse(filename)?

If so, that fails because it expects a URI. Open a FileInputStream on the file and pass it to DocumentBuilder.parse(InputStream) instead.

+0

I am using DocumentBuilder.parse(InputStream). – glmxndr 2009-07-15 15:34:00

1

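It looks as if the parser is trying to connect to a URL given in the DOCTYPE header, so that it can download the DTD and validate the document against it.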
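One plausible reading, not confirmed in this thread, is that a bare InputStream gives the parser no base location, so a relative DTD reference in the DOCTYPE cannot be resolved against the file's directory. The sketch below passes the file's URI as the system ID through `DocumentBuilder.parse(InputStream, String)`. The class `SystemIdSketch`, its methods, and the temporary files `doc.xml` and `doc.dtd` are hypothetical and exist only to make the example self-contained.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SystemIdSketch {
    // Parses the file from a stream, but tells the parser where the stream
    // came from so relative references (such as a DTD) have a base URI.
    static Document parseWithSystemId(File file) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            InputStream in = new FileInputStream(file);
            try {
                return builder.parse(in, file.toURI().toString());
            } finally {
                in.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed: " + file, e);
        }
    }

    // Writes a small text file and marks it for deletion on JVM exit.
    static File write(File dir, String name, String content) throws Exception {
        File target = new File(dir, name);
        FileWriter writer = new FileWriter(target);
        try {
            writer.write(content);
        } finally {
            writer.close();
        }
        target.deleteOnExit();
        return target;
    }

    // Creates a directory whose name contains spaces, with an XML file and
    // the DTD it refers to by relative path, then parses the XML file.
    static String demo() {
        try {
            File dir = new File(System.getProperty("java.io.tmpdir"),
                    "folder with space " + System.nanoTime());
            if (!dir.mkdirs()) {
                throw new IllegalStateException("Could not create " + dir);
            }
            dir.deleteOnExit();
            write(dir, "doc.dtd", "<!ELEMENT root EMPTY>");
            File xml = write(dir, "doc.xml", "<!DOCTYPE root SYSTEM \"doc.dtd\"><root/>");
            return parseWithSystemId(xml).getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

`DocumentBuilder.parse(File)` is another option worth trying, since it takes the location from the file itself rather than from a separately supplied string.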

0

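Try this:

InputSource is = new InputSource();
// Note: StringReader takes a String, so this assumes test holds the XML text itself,
// not the FileInputStream from the question.
is.setCharacterStream(new StringReader(test));
doc = builder.parse(is);

instead of just parsing test directly.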