Regex to get <a href> from a string in Java

Suppose I have:

<img class="size-full wp-image-10225" alt="animals" src="abc.jpg"> blah blah blah&nbsp; 
<a href="http://en.wikipedia.org/wiki/Elephant">elephant is an animal</a>&nbsp;blah

I want a regex to give me the output :

blah blah blah <a href="http://en.wikipedia.org/wiki/Elephant">elephant is an animal</a> blah

without the &nbsp;. I can do str.replace("&nbsp;", "") separately, but how do I get the string starting from blah blah... until blah (which includes the link tag)?

Source

2014-03-28 user3298846

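You have to remove the 'img' tag separately. Do you only need the a tag? That works with a regex. If you also want the other text before and after the tag, that is where you run into problems. Why not simply remove the tags you don't need? –

I do need the text before the tag. So basically I can't use StringUtils.removeHTMLTags(), because that would remove all the tags, and I want to keep the HTML tags. What I'm thinking is to find the first ">" before the a href, then capture the text from there until (and including) – user3298846

_Sees regex and HTML in the title_ http://stackoverflow.com/a/1732454/2846923 –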

Maybe something like this?

^<[^>]*>\s*|&nbsp;

Java escaped:

^<[^>]*>\\s*|&nbsp;

regex101 demo

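^<[^>]*>\\s* matches the first img tag along with any whitespace that follows it. The other alternative matches each &nbsp;. The replacement string is "".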
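To make the Java-escaped form concrete, here is a minimal sketch of applying the pattern. The class name StripLeadingTag and the method name clean are made up for illustration and are not from the thread. It uses Pattern and Matcher.replaceAll because String.replace treats its first argument as literal text, not as a regex; String.replaceAll with the same escaped pattern would also work.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class StripLeadingTag {
    // Pattern from the answer: one tag at the very start of the input plus any
    // whitespace after it, or a literal &nbsp; entity anywhere in the input.
    // "\\s" in Java source is the regex \s.
    private static final Pattern LEADING_TAG_OR_NBSP =
            Pattern.compile("^<[^>]*>\\s*|&nbsp;");

    // Replaces every match with the empty string.
    static String clean(String html) {
        return LEADING_TAG_OR_NBSP.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String input =
                "<img class=\"size-full wp-image-10225\" alt=\"animals\" src=\"abc.jpg\"> blah blah blah&nbsp; \n"
                + "<a href=\"http://en.wikipedia.org/wiki/Elephant\">elephant is an animal</a>&nbsp;blah";
        System.out.println(clean(input));
    }
}
```

Without the MULTILINE flag, ^ only matches at the start of the whole input, so the <a> tag on the second line is left alone. Because the replacement is "", text on either side of a removed &nbsp; ends up adjacent; replacing with " " instead would keep a space there.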

That said, you may want to use a proper HTML parser, since it is less likely to break.

Source

2014-03-28 19:16:45 Jerry

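Hey, thanks Jerry. I didn't get the Java-escaped part, though. So should I do this: str.replace(^<[^>]*>\s*|&nbsp;, "") – user3298846

@user3298846 Use the escaped version in Java. :) – Jerry

Yes, sorry, I didn't read the whole question. – Andres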
