Java: removing a substring that includes quotation marks

String strLine = "";

try
{
    BufferedReader b = new BufferedReader(new FileReader("html.txt"));
    strLine = b.readLine();
} catch (Exception e)
{
    e.printStackTrace();
}

String[] temp = strLine.split("<");
temp = temp[1].split(">");
String temp1 = ("<" + temp[0] + ">");

strLine = strLine.replaceFirst(temp1, "");
System.out.println(strLine);

Basically, I have a file that contains

<span title="Representation in the International Phonetic Alphabet (IPA)" class="IPA">no'b?l</span>

and I want to remove this string from it:

<span title="Representation in the International Phonetic Alphabet (IPA)" class="IPA">

So far my code only works if the string does not contain quotation marks. How can I fix this? I have tried using

.replaceAll("\\\"","\\\\\"");

but it still fails.

Any help or information would be greatly appreciated.

Source

2011-07-11 Jake

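You should use an HTML parser. – SLaks

Yes, that is what I want to end up with, and this code already works fine when there are no quotation marks. – Jake

See http://stackoverflow.com/questions/240546/removing-html-from-a-java-string. It's quite simple. – itsadok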

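Your problem is that replaceFirst takes a regular expression, but you are giving it an arbitrary string that may contain all sorts of characters with a special meaning in regex. I don't think the quotation marks are your problem, but rather the parentheses around IPA, which the regex engine reads as a group instead of as literal characters.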
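To make the point above concrete, here is a minimal sketch of the failure. The class, constant, and method names are illustrative and not from the original post; the tag and line are the ones quoted in the question.

```java
// Illustrative only: RegexMetacharDemo, TAG, LINE and removeAsRegex are made-up names.
class RegexMetacharDemo {
    // The tag the asker builds in temp1; its parentheses are regex metacharacters.
    static final String TAG =
        "<span title=\"Representation in the International Phonetic Alphabet (IPA)\" class=\"IPA\">";
    static final String LINE = TAG + "no'b?l</span>";

    // Passes TAG to replaceFirst, which interprets it as a regular expression.
    static String removeAsRegex() {
        return LINE.replaceFirst(TAG, "");
    }

    public static void main(String[] args) {
        // As a regex, "(IPA)" is a group that matches the three characters IPA,
        // so the pattern looks for "Alphabet IPA" and never matches "Alphabet (IPA)".
        System.out.println(removeAsRegex().equals(LINE)); // prints "true": nothing was removed
    }
}
```

The quotation marks play no part here: a double quote is an ordinary character in a regular expression, which is why escaping it makes no difference.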

One way to solve this is to use the String#replace method, which takes a literal string rather than a regular expression. That is, use the following line:

strLine = strLine.replace(temp1,"");

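This differs from your code in that it replaces every occurrence of temp1 in the line, not just the first one, but I think you should be fine with that.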
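As a rough sketch of the two behaviours, the helper below contrasts String#replace with a first-occurrence-only variant. The Pattern.quote approach is not something proposed in this thread; it is shown only as one possible way to keep the replaceFirst semantics, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: LiteralRemoveDemo and its method names are made up for this sketch.
class LiteralRemoveDemo {
    // Removes every occurrence of the literal text; no regex interpretation takes place.
    static String removeAll(String line, String literal) {
        return line.replace(literal, "");
    }

    // Removes only the first occurrence; Pattern.quote turns the text into a literal pattern.
    static String removeFirst(String line, String literal) {
        return line.replaceFirst(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        String tag =
            "<span title=\"Representation in the International Phonetic Alphabet (IPA)\" class=\"IPA\">";
        String line = tag + "no'b?l</span>";
        System.out.println(removeAll(line, tag));   // prints no'b?l</span>
        System.out.println(removeFirst(line, tag)); // prints no'b?l</span>
    }
}
```

For a line that contains the tag only once, as in the question, both calls give the same result; they differ only when the tag repeats.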

Source

2011-07-11 12:55:36 itsadok

Yes, thank you, changing strLine.replaceAll to strLine.replace solved my problem. – Jake

AFAIK, replaceAll("///"","/////""); would work if you escaped it correctly: the escape character is \, not /. Try it with that.

Source

2011-07-11 12:36:37 tsm

temp1 = temp1.replaceAll("\\\"","\\\\\""); I tried this, but it still doesn't work. Maybe the quotation marks are not the cause. – Jake
