2013-04-25 41 views
3

I am using StringEscapeUtils to escape and unescape HTML. In the code below, two strings appear to have the same content, but the equals method returns false.

import org.apache.commons.lang.StringEscapeUtils; 

public class EscapeUtils { 

    public static void main(String args[]) { 

     String string = "    4-Spaces    ,\"Double Quote\", 'Single Quote', \\Back-Slash\\, /Forward Slash/ "; 

     String escaped = StringEscapeUtils.escapeHtml(string); 
     String myEscaped = escapeHtml(string); 

     String unescaped = StringEscapeUtils.unescapeHtml(escaped); 
     String myUnescaped = StringEscapeUtils.unescapeHtml(myEscaped); 

     System.out.println("Real String: " + string); 
     System.out.println(); 
     System.out.println("Escaped String: " + escaped); 
     System.out.println("My Escaped String: " + myEscaped); 
     System.out.println(); 
     System.out.println("Unescaped String: " + unescaped); 
     System.out.println("My Unescaped String: " + myUnescaped); 
     System.out.println(); 
     System.out.println("Comparison:"); 
     System.out.println("Real String == Unescaped String: " + string.equals(unescaped)); 
     System.out.println("Real String == My Unescaped String: " + string.equals(myUnescaped)); 
     System.out.println("Unescaped String == My Unescaped String: " + unescaped.equals(myUnescaped)); 

    } 

    public static String escapeHtml(String s) { 
     String escaped = ""; 
     if(null != s) { 
      escaped = StringEscapeUtils.escapeHtml(s); 
      escaped = escaped.replaceAll(" ","&nbsp;"); 
      escaped = escaped.replaceAll("'","&#39;"); 
      escaped = escaped.replaceAll("\\\\","&#92;"); 
      escaped = escaped.replaceAll("/","&#47;"); 
     } 
     return escaped; 
    } 

} 

Output:

Real String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

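Escaped String:     4-Spaces    ,&quot;Double Quote&quot;, 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Escaped String: &nbsp;&nbsp;&nbsp;&nbsp;4-Spaces&nbsp;&nbsp;&nbsp;&nbsp;,&quot;Double&nbsp;Quote&quot;,&nbsp;&#39;Single&nbsp;Quote&#39;,&nbsp;&#92;Back-Slash&#92;,&nbsp;&#47;Forward&nbsp;Slash&#47;&nbsp;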

Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/  

Comparison: 
Real String == Unescaped String: true 
Real String == My Unescaped String: false 
Unescaped String == My Unescaped String: false 

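escaped is the escaped form of the real string, and unescaped is the result of unescaping it. myEscaped is first escaped by the same process, and then some additional characters are replaced with their HTML codes. myUnescaped is the unescaped content of myEscaped, and it looks identical to the real string.

The output shows that the real string, unescaped and myUnescaped have the same content. However, as the comparison section shows, myUnescaped is equal to neither string nor unescaped.

I don't understand what is actually happening here. Can anyone explain?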

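Oh, my head is spinning – muneebShabbir 2013-04-25 06:44:50

Could you debug, inspect the character arrays of the strings, and share what you find? – muneebShabbir 2013-04-25 06:52:47

I don't see the line 'Escaped String == My Escaped String:' in your code. Can you add that comparison to your program? – Patashu 2013-04-25 07:04:04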

Answers

3

This happens because, when escaping the HTML, you replace ' ' with '&nbsp;':

public static String escapeHtml(String s) { 
     String escaped = ""; 
     if(null != s) { 
      escaped = StringEscapeUtils.escapeHtml(s); 
      escaped = escaped.replaceAll(" ","&nbsp;"); // HERE 
      escaped = escaped.replaceAll("'","&#39;"); 
      escaped = escaped.replaceAll("\\\\","&#92;"); 
      escaped = escaped.replaceAll("/","&#47;"); 
     } 
     return escaped; 
    } 

StringEscapeUtils.escapeHtml, on the other hand, does not escape ' '. Here is the example from their website:

"bread" & "butter" 

becomes

&quot;bread&quot; &amp; &quot;butter&quot; 

This means that StringEscapeUtils.escapeHtml leaves spaces untouched.

If you remove escaped = escaped.replaceAll(" ","&nbsp;"); from escapeHtml, unescaped and myUnescaped match!
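
To make the mismatch visible, here is a minimal sketch in plain Java, so it runs without Commons Lang. It assumes that &nbsp; unescapes to the non-breaking space U+00A0, which prints like an ordinary space but is a different character. The class name NbspVsSpace and the helper firstDifference are hypothetical and only for illustration.

```java
public class NbspVsSpace {

    // Returns the index of the first position where a and b differ,
    // or -1 if the two strings are equal.
    public static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : n;
    }

    public static void main(String[] args) {
        String plain = "Double Quote";       // ordinary space, U+0020
        String nbsp = "Double\u00A0Quote";   // non-breaking space, U+00A0

        // The two strings look the same on a console but are not equal.
        System.out.println(plain.equals(nbsp));

        int i = firstDifference(plain, nbsp);
        System.out.println("first difference at index " + i);
        System.out.println((int) plain.charAt(i) + " vs " + (int) nbsp.charAt(i));
    }
}
```

Printing the numeric value of the character at the first differing index is usually enough to spot an invisible difference like this one.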

1

After Apurv's answer, I examined the byte arrays of the strings.

String:  32, 32, 32, 32, 52, 45, 83, 112, 97, 99, 101, 115, 32, 32, 32, 32, 44, 34, 68, 111, 117, 98, 108, 101, 32, 81, 117, 111, 116, 101, 34, 44, 32, 39, 83, 105, 110, 103, 108, 101, 32, 81, 117, 111, 116, 101, 39, 44, 32, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, 32, 47, 70, 111, 114, 119, 97, 114, 100, 32, 83, 108, 97, 115, 104, 47, 32 
unescaped : 32, 32, 32, 32, 52, 45, 83, 112, 97, 99, 101, 115, 32, 32, 32, 32, 44, 34, 68, 111, 117, 98, 108, 101, 32, 81, 117, 111, 116, 101, 34, 44, 32, 39, 83, 105, 110, 103, 108, 101, 32, 81, 117, 111, 116, 101, 39, 44, 32, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, 32, 47, 70, 111, 114, 119, 97, 114, 100, 32, 83, 108, 97, 115, 104, 47, 32 
myUnescaped: -96, -96, -96, -96, 52, 45, 83, 112, 97, 99, 101, 115, -96, -96, -96, -96, 44, 34, 68, 111, 117, 98, 108, 101, -96, 81, 117, 111, 116, 101, 34, 44, -96, 39, 83, 105, 110, 103, 108, 101, -96, 81, 117, 111, 116, 101, 39, 44, -96, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, -96, 47, 70, 111, 114, 119, 97, 114, 100, -96, 83, 108, 97, 115, 104, 47, -96 

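It seems that in myUnescaped the spaces have been converted to -96 instead of 32. That byte is 0xA0, the non-breaking space that &nbsp; unescapes to, so it is not an ASCII space at all.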
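
The -96 can be reproduced without any library. The sketch below assumes the bytes were obtained with a single-byte encoding such as ISO-8859-1 (the post does not say which charset getBytes used). In that encoding U+00A0 is the single byte 0xA0, which Java shows as the signed value -96. The class name NbspBytes and the helper latin1 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class NbspBytes {

    // Encodes the string as ISO-8859-1, one byte per character.
    public static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte space = latin1(" ")[0];
        byte nbsp = latin1("\u00A0")[0];

        System.out.println(space);        // 32
        System.out.println(nbsp);         // -96, because Java bytes are signed
        System.out.println(nbsp & 0xFF);  // 160, the unsigned value of 0xA0
    }
}
```

With UTF-8 the same character would be encoded as two bytes (-62, -96), so the exact numbers depend on the charset in use.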

So I wrote the unescapeHtml method below. It first replaces &nbsp; with a space and then uses StringEscapeUtils to unescape the HTML.

public static String unescapeHtml(String s) { 
    String unescaped = ""; 
    if(null != s) { 
     unescaped = s.replaceAll("&nbsp;", " "); 
     unescaped = StringEscapeUtils.unescapeHtml(unescaped); 
    } 
    return unescaped; 
} 

Then I obtained myUnescaped with the following code.

String myUnescaped = unescapeHtml(myEscaped); 

This gave me a myUnescaped string that is equal to both string and unescaped.
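
A variation on the same idea, sketched under the assumption that every non-breaking space in the result should be treated as an ordinary space: normalize U+00A0 after unescaping instead of rewriting &nbsp; beforehand. The class name SpaceNormalizer and the helper normalizeSpaces are hypothetical, and the unescaping step itself (StringEscapeUtils.unescapeHtml) is left out so the example runs without Commons Lang.

```java
public class SpaceNormalizer {

    // Replaces every non-breaking space (U+00A0) with an ordinary space (U+0020).
    // Returns an empty string for null, mirroring the null handling in the post.
    public static String normalizeSpaces(String s) {
        if (s == null) {
            return "";
        }
        // String.replace(char, char) is a literal replacement, no regex involved.
        return s.replace('\u00A0', ' ');
    }

    public static void main(String[] args) {
        // Stand-in for a string that was unescaped from "Double&nbsp;Quote".
        String unescapedWithNbsp = "Double\u00A0Quote";

        String normalized = normalizeSpaces(unescapedWithNbsp);
        System.out.println(normalized.equals("Double Quote"));
    }
}
```

One caveat with this approach: a non-breaking space that was already present in the original input would also be turned into an ordinary space, so the round trip is only exact for inputs that contain none.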

Alternatively, I replaced &nbsp; with &#32;. This does not require me to write an unescapeHtml method. The updated escapeHtml method is as follows.

public static String escapeHtml(String s) { 
    String escaped = ""; 
    if(null != s) { 
     escaped = StringEscapeUtils.escapeHtml(s); 
     escaped = escaped.replaceAll(" ","&#32;"); //updated line 
     escaped = escaped.replaceAll("'","&#39;"); 
     escaped = escaped.replaceAll("\\\\","&#92;"); 
     escaped = escaped.replaceAll("/","&#47;"); 
    } 
    return escaped; 
} 
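
As a side note, replaceAll treats its first argument as a regular expression, which is why the backslash has to be written as "\\\\". A possible alternative is String.replace, which matches literally. The sketch below covers only the extra replacements and would run after StringEscapeUtils.escapeHtml, as in the method above. The numeric entities used here (&#32;, &#39;, &#92;, &#47;) are assumptions for illustration, and the names LiteralReplace and escapeExtras are hypothetical.

```java
public class LiteralReplace {

    // Replaces space, single quote, backslash and forward slash with
    // numeric character references, using literal (non-regex) matching.
    public static String escapeExtras(String s) {
        if (s == null) {
            return "";
        }
        return s.replace(" ", "&#32;")
                .replace("'", "&#39;")
                .replace("\\", "&#92;")   // one literal backslash, no regex escaping
                .replace("/", "&#47;");
    }

    public static void main(String[] args) {
        // The Java literal below is the text: a 'b' \c\ /d/
        System.out.println(escapeExtras("a 'b' \\c\\ /d/"));
    }
}
```

None of the replacement strings contains a space, quote, backslash or slash, so the order of the four calls does not change the result.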