2013-10-08 39 views
1

Strange characters() behavior with SAX + Java

In my XML I have a multi-line element:

<tag id="sometag" ...> 
    | first line 
    |  second line 
    |   third line 
    |  fourth line 
</tag>
.... 
<tag id="someothertag" ...> 
    | ANOTHER FIRST LINE 
    |  ANOTHER SECOND LINE 
    |   ANOTHER THIRD LINE 
    |  ANOTHER FORTH LINE 
</tag>

Then in Java I have the necessary startElement, endElement, and characters methods, but I'm seeing some strange behavior with characters:

public void characters(char[] ch, int start, int length){ 
    Log.d(TAG, "characters(\"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + ")"); 
} 

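Other than that, I do nothing with the characters. I basically create two instances of the parser. In one instance I'm looking for sometag. If I find what I'm looking for, I throw an exception and return that element.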

D/MyProgram(1565): STARTING document parsing... 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | first line", 0, 20) 
D/MyProgram(1565): characters("n  | first line", 0, 1) 
D/MyProgram(1565): characters("  | second line", 0, 23) 
D/MyProgram(1565): characters("n  | second line", 0, 1) 
D/MyProgram(1565): characters("  |  third line", 0, 26) 
D/MyProgram(1565): characters("n  |  third line", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 22) 
D/MyProgram(1565): characters("n  | fourth lineline", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 4) 
D/MyProgram(1565): Successfully found "sometag"! 

...and another brand-new instance in which I'm looking for someothertag. I do the same thing as before.

D/MyProgram(1565): STARTING document parsing... 
D/MyProgram(1565): characters("n", 0, 1) 
D/MyProgram(1565): characters(" ", 0, 4) 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | first line", 0, 20) 
D/MyProgram(1565): characters("n  | first line", 0, 1) 
D/MyProgram(1565): characters("  | second line", 0, 23) 
D/MyProgram(1565): characters("n  | second line", 0, 1) 
D/MyProgram(1565): characters("  |  third line", 0, 26) 
D/MyProgram(1565): characters("n  |  third line", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 22) 
D/MyProgram(1565): characters("n  | fourth lineline", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 4) 
D/MyProgram(1565): Successfully found "someothertag"! 

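I understand that XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are a few things I noticed that are really baffling: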

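  • With each call to characters(), the parser neither starts where it left off nor cleans up the characters it has, in fact, finished parsing: I'm even still getting the very first character array from earlier calls ('n', which is the line feed).
  • ch has extra characters that weren't there originally: "line" is appended to "forth line".
  • When I create a completely new parser instance, these characters are "re-read". The second run should read something like: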

...this:

D/MyProgram(1565): characters("n", 0, 1) 
D/MyProgram(1565): characters(" ", 0, 4) 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | ANOTHER FIRST LINE", 0, 20) 
D/MyProgram(1565): characters("n  |  ANOTHER SECOND LINE", 0, 1) 

...and so on.

Any idea what I'm doing wrong? Thanks in advance.

+3

Looks like you're not respecting start and length. – bmargulies

Answers

3

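As bmargulies said, you are not using start and length with the character array you are passed. The parser may reuse that array between calls, so only the segment from start to start + length belongs to the current call: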

public void characters(char[] ch, int start, int length) { 
    // Use only the indicated segment; the rest of ch may hold stale data.
    String str = new String(ch, start, length); 
    Log.d(TAG, "characters(\"" + str.replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + ")"); 
} 
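
If you want the element's text rather than a log line, a minimal sketch along the same lines might look like the following. The class name TagTextCollector and the helper textOf are made up for illustration and are not from the question; the sketch also assumes the tag elements are not nested. Since characters() can be called several times for one element, each segment is appended to a StringBuilder instead of being treated as the whole text.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of the <tag> element with a given id.
class TagTextCollector extends DefaultHandler {
    private final String wantedId;
    private final StringBuilder text = new StringBuilder(); // per instance, not static
    private boolean inside = false;

    TagTextCollector(String wantedId) {
        this.wantedId = wantedId;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("tag".equals(qName) && wantedId.equals(attrs.getValue("id"))) {
            inside = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inside) {
            // Append only the segment that belongs to this call.
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("tag".equals(qName)) {
            inside = false;
        }
    }

    // Parses the XML string and returns the collected text for the given id.
    static String textOf(String xml, String id) {
        try {
            TagTextCollector handler = new TagTextCollector(id);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><tag id=\"sometag\">first\nsecond</tag>"
                + "<tag id=\"someothertag\">ANOTHER</tag></root>";
        System.out.println(textOf(xml, "someothertag"));
    }
}
```

This version parses the whole document rather than stopping early with an exception as the question does; that choice just keeps the sketch short.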
+0

Another problem I had was that the parser's string builder was static. I needed to reset it with builder.setLength(). – i41
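
For anyone hitting the same thing, here is a rough sketch of that reset. The handler is not the one from the question: ResettingHandler and its parse helper are hypothetical names, and the reset is placed in startDocument() as one reasonable spot. Note that StringBuilder.setLength takes the new length as an argument, so clearing the buffer is setLength(0).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler with a static buffer shared by every instance.
class ResettingHandler extends DefaultHandler {
    private static final StringBuilder builder = new StringBuilder();

    @Override
    public void startDocument() {
        // Drop whatever an earlier parse left behind in the shared buffer.
        builder.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        builder.append(ch, start, length);
    }

    // Parses the XML string with a fresh handler and returns the buffered text.
    static String parse(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new ResettingHandler());
            return builder.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<a>first</a>"));
        System.out.println(parse("<a>second</a>"));
    }
}
```

Making the builder an instance field would avoid the need for a reset between parser instances altogether.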