Java regex replaceAll

My text looks like this:

| birth_date   = {{birth date|1925|09|2|df=y}} 
| birth_place   = [[Bristol]], [[England]], UK 
| death_date   = {{death date and age|2000|11|16|1925|09|02|df=y}} 
| death_place   = [[Eastbourne]], [[Sussex]], England, UK 
| origin    = 
| instrument   = [[Piano]] 
| genre    = 
| occupation   = [[Musician]]

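I want to get everything that is inside [[ ]]. I tried using replaceAll to remove everything that is not inside [[ ]], and then splitting on newlines to get a list of the [[ ]] texts.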

input = input.replaceAll("^[\\[\\[(.+)\\]\\]]", "");

Required output:

[[Bristol]] 
[[England]] 
[[Eastbourne]] 
[[Sussex]] 
[[Piano]] 
[[Musician]]

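But this does not give the required output. What am I missing here? There are thousands of documents; is this the fastest way to do it? If not, please tell me the best way to get the required output.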

2013-10-04 NEO

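Among other problems, note that `(.+)` uses a "greedy" quantifier: it will grab as many characters as it can between `[[` and `]]`, which means that for `birth_place` you would get `[[Bristol]], [[England]]` as a single match. Putting a `?` after `.+`, as in falsetru's answer, prevents this. – ajb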
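
To make the comment above concrete, here is a minimal sketch comparing the greedy and reluctant forms on the `birth_place` line from the question. The class name `GreedyVsReluctant` and the helper `findAll` are made up for this illustration; only `Pattern`, `Matcher.find` and `Matcher.group` come from `java.util.regex`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String line = "| birth_place   = [[Bristol]], [[England]], UK";
        // Greedy: .+ runs to the last "]]" on the line, so both links merge into one match.
        System.out.println(findAll("\\[\\[.+\\]\\]", line));
        // Reluctant: .+? stops at the first "]]", giving one match per link.
        System.out.println(findAll("\\[\\[.+?\\]\\]", line));
    }
}
```

The greedy pattern should yield the single match `[[Bristol]], [[England]]`, while the reluctant one yields `[[Bristol]]` and `[[England]]` separately.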

You need to match it, not replace it:

Matcher m=Pattern.compile("\\[\\[\\w+\\]\\]").matcher(input); 
while(m.find()) 
{ 
    m.group();//result 
}
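
One thing worth checking with the `\\w+` pattern above: `\w` only covers letters, digits and underscore, so a link containing a space or a pipe would be skipped. The sketch below illustrates this on a made-up input (the `[[New York]]` and `[[Piano|piano]]` links are not in the question's data, and the class and helper names are hypothetical). A negated character class such as `[^\\]]+` is one possible alternative, assuming link text never contains `]`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordClassLimit {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample: the second and third links are assumptions for illustration.
        String sample = "[[Bristol]], [[New York]], [[Piano|piano]]";
        // \w+ rejects the space and the pipe, so only the first link matches.
        System.out.println(findAll("\\[\\[\\w+\\]\\]", sample));
        // [^\]]+ accepts any character except ']', so all three links match.
        System.out.println(findAll("\\[\\[[^\\]]+\\]\\]", sample));
    }
}
```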

2013-10-04 16:22:10 Anirudha

@ppeterka66 Yes, it will.. – Anirudha

Sorry, I was dumb enough to ask before trying it myself :) – ppeterka

Use Matcher.find. For example:

import java.util.regex.*; 

... 

String text = 
    "| birth_date   = {{birth date|1925|09|2|df=y}}\n" + 
    "| birth_place   = [[Bristol]], [[England]], UK\n" + 
    "| death_date   = {{death date and age|2000|11|16|1925|09|02|df=y}}\n" + 
    "| death_place   = [[Eastbourne]], [[Sussex]], England, UK\n" + 
    "| origin    = \n" + 
    "| instrument   = [[Piano]]\n" + 
    "| genre    = \n" + 
    "| occupation   = [[Musician]]\n"; 
Pattern pattern = Pattern.compile("\\[\\[.+?\\]\\]"); 
Matcher matcher = pattern.matcher(text); 
while (matcher.find()) { 
    System.out.println(matcher.group()); 
}
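
Since the question mentions thousands of documents, one way to package the loop above is to compile the pattern once and reuse it for every input, because `Pattern` objects are immutable and can be shared. This is only a sketch: the class name `LinkExtractor` and the method `extractLinks` are made up here, and reading the files themselves is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Compiled once and reused for every document instead of recompiling per call.
    private static final Pattern LINK = Pattern.compile("\\[\\[.+?\\]\\]");

    // Hypothetical helper: returns every [[...]] occurrence in the text, in order.
    static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(text);
        while (matcher.find()) {
            links.add(matcher.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String text =
            "| birth_place   = [[Bristol]], [[England]], UK\n" +
            "| instrument   = [[Piano]]\n";
        for (String link : extractLinks(text)) {
            System.out.println(link);
        }
    }
}
```

Each document's text would be passed to `extractLinks` in turn; whether this is the fastest option for a given data set would need measuring.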

2013-10-04 16:22:23 falsetru

Just for fun, using replaceAll:

String output = input.replaceAll("(?s)(\\]\\]|^).*?(\\[\\[|$)", "$1\n$2");
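
For completeness, the replaceAll line above can be combined with the newline split that the question describes. The sketch below assumes the output may start or end with an empty line (the replacement inserts a newline around the first and last link), so blank lines are filtered out. The class and method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ReplaceAllSplit {
    // Hypothetical helper: strips everything outside [[...]] and splits the rest on newlines.
    static List<String> linksViaReplaceAll(String input) {
        // Same regex as in the answer: each gap between links collapses to a single newline.
        String output = input.replaceAll("(?s)(\\]\\]|^).*?(\\[\\[|$)", "$1\n$2");
        List<String> links = new ArrayList<>();
        for (String line : output.split("\n")) {
            // Skip the blank lines left at the start or end of the output.
            if (!line.trim().isEmpty()) {
                links.add(line);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String input =
            "| birth_place   = [[Bristol]], [[England]], UK\n" +
            "| instrument   = [[Piano]]";
        System.out.println(linksViaReplaceAll(input));
    }
}
```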

2013-10-04 16:37:05 femtoRgon
