Strip whitespace and dots from hyperlinks

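I want to remove whitespace and dots from hyperlinks. All of my rules work fine, except that the trailing dot is not removed from the URL. Here are a few examples: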

<a href=" http://www.example.com ">example site</a> 
<a href=" http://www.example.com">example 2</a> 
<a href="http://www.example.com.">final example</a> 


    $text = preg_replace('/<a href="([\s]+)?([^ "\']*)([\s]+)?(\.)?">([^<]*)<\/a>/', '<a href="\\2">\\5</a>', $text);

In the last example, the regex should remove the dot from the URL. The dot is optional, so I wrote the rule (\.)?

2011-11-29 Maximus

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

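Because your dot has already been matched by the ([^ "\']*) group. That group is greedy, so it consumes the trailing dot and leaves nothing for (\.)? to match.

Change it to ([^ "\']*?), the non-greedy version.

Also, I suggest replacing ([\s]+)?(\.)? with [\s.]* so that strings like "www.example.com. " are handled too. Since this removes two capture groups, the link text becomes group 3 instead of group 5 in the replacement.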
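A minimal sketch of both suggestions applied to the three links from the question. The leading ([\s]+)? is also collapsed to \s* here, which is an assumption of this sketch rather than part of the answer, so the group numbers in the replacement become $1 and $2.

```php
<?php
// Lazy URL group ([^ "\']*?) followed by one character class that
// swallows any trailing mix of whitespace and dots before the closing quote.
$pattern = '/<a href="\s*([^ "\']*?)[\s.]*">([^<]*)<\/a>/';
$replacement = '<a href="$1">$2</a>';

// The three sample links from the question.
$links = [
    '<a href=" http://www.example.com ">example site</a>',
    '<a href=" http://www.example.com">example 2</a>',
    '<a href="http://www.example.com.">final example</a>',
];

// preg_replace accepts an array subject and returns an array of results.
$cleaned = preg_replace($pattern, $replacement, $links);

foreach ($cleaned as $link) {
    echo $link, "\n";
}
```

Dots inside the URL survive because the lazy group only stops where the remaining characters up to the closing quote are all whitespace or dots.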

2011-11-29 23:30:14 mifki

This works perfectly! Thank you. – Maximus

How about <a href="([\s]+)?([^ "\']*\.[a-zA-Z]{2,5})([\s]+)?(\.)?">([^<]*)<\/a>, which adds \.[a-zA-Z]{2,5} to the URL group?

It catches .com, .info, .edu, and even endings like .com.au

2011-11-29 23:27:52

Thanks, it works perfectly for URLs like http://xyz.com, but it does not work for http://xyz.com/another-page. I am trying to fix your regex. – Maximus
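The limitation described in this comment can be reproduced with a short sketch. The two sample links are made up for illustration; the pattern and replacement are the ones from the answer and the question.

```php
<?php
// Pattern from the answer: the URL group must end in a dot followed by 2-5 letters.
$pattern = '/<a href="([\s]+)?([^ "\']*\.[a-zA-Z]{2,5})([\s]+)?(\.)?">([^<]*)<\/a>/';
$replacement = '<a href="\\2">\\5</a>';

// Hypothetical inputs: a bare domain and a URL with a path, both with a trailing dot.
$bare     = '<a href="http://xyz.com.">x</a>';
$withPath = '<a href="http://xyz.com/another-page.">y</a>';

$bareResult = preg_replace($pattern, $replacement, $bare);
$pathResult = preg_replace($pattern, $replacement, $withPath);

echo $bareResult, "\n"; // trailing dot removed
echo $pathResult, "\n"; // no match, so the link comes back unchanged
```

The second link fails because nothing at the end of /another-page. looks like a dot plus 2-5 letters, so the whole pattern does not match and preg_replace returns the input as is.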

The following is untested.

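$doc = new DOMDocument; 
$doc->preserveWhiteSpace = false; 
// Parse the file as HTML; load() would expect well-formed XML 
$doc->loadHTMLFile('source.html'); 

$xpath = new DOMXPath($doc); 

// Select every <a> element with an href attribute, anywhere in the document 
$query = '//a[@href]'; 

$anchors = $xpath->query($query); 

foreach ($anchors as $aElement) { 
    // Strip spaces and dots from both ends of the href value 
    $aElement->setAttribute('href', trim($aElement->getAttribute('href'), ' .')); 
} 

$doc->saveHTMLFile('new-source.html');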

2011-11-29 23:35:30

This will trim the hrefs (I assume trimming is what you meant).

For both ' and " as value delimiters (expanded form):

(<a \s+ href \s* = \s*) 
(?| 
    (") \s* ([^"]*?) [\.\s]* (") 
    | (') \s* ([^']*?) [\.\s]* (') 
) 
([^>]*>)

The replacement is: $1$2$3$4$5

or

For " as the only value delimiter (expanded form):

(<a \s+ href \s* = \s* ") 
\s* 
([^"]*?) 
[\.\s]* 
(" [^>]*>)

The replacement is: $1$2$3
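As a rough sketch, the first (both-delimiter) pattern can be run in PHP with the x modifier, which lets the expanded layout be used as written. The ~ delimiter, the variable names, and the sample HTML are assumptions made for illustration; the branch reset group (?| ... ) is what lets both alternatives share group numbers 2 to 4.

```php
<?php
// Expanded pattern: under the x modifier, literal whitespace in the pattern is ignored.
$pattern = '~
    (<a \s+ href \s* = \s*)                  # 1: tag start up to the opening quote
    (?|
        (") \s* ([^"]*?) [.\s]* (")          # 2, 3, 4: double-quoted value
      | (\') \s* ([^\']*?) [.\s]* (\')       # 2, 3, 4: single-quoted value
    )
    ([^>]*>)                                 # 5: rest of the opening tag
~x';

// Hypothetical input with one double-quoted and one single-quoted href.
$html = '<a href=" http://www.example.com. ">a</a> '
      . '<a href=\'http://www.example.org.\' class="x">b</a>';

// Groups 2 and 4 put the original quotes back; group 3 is the trimmed URL.
$result = preg_replace($pattern, '$1$2$3$4$5', $html);

echo $result, "\n";
```

Unlike the pattern in the question, this one only rewrites the opening tag, so other attributes and the link text are left untouched.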

2011-11-30 00:22:01 sln
