Simple wiki parser and link auto-detection

I use the following functions:

function MakeLinks($source){ 
    return preg_replace('!(((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)!i', '<a href="$1">$1</a>', $source); 
} 

function simpleWiki($text){ 
$text = preg_replace('/\[\[Image:(.*)\]\]/', '<a href="$1"><img src="$1" /></a>', $text); 
return $text; 
}

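The first function turns a bare URL such as http://example.com into a clickable link to http://example.com.

The second function converts strings like [[Image:http://example.com/logo.png]] into an image.

Now, if I have the text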

$text = 'this is my image [[Image:http://example.com/logo.png]]';

and convert it with simpleWiki(makeLinks($text)), the output looks something like:

this is my image <a href="url"><img src="<a href="url">url</a>"/></a>

How can I avoid this? How can I check that a URL is not part of an [[Image:URL]] construct?
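The nesting can be reproduced with a short script. This is only a sketch: `makeLinksDemo` and `simpleWikiDemo` are hypothetical stand-ins for the two functions above, the URL character class is reduced to ASCII for brevity, and the replacement is assumed to be `<a href="$1">$1</a>`.

```php
<?php
// Hypothetical stand-ins for the two functions in the question.
function makeLinksDemo(string $source): string {
    // Wrap every http:// or ftp:// URL in an anchor tag.
    return preg_replace('!((?:f|ht)tp://[-a-zA-Z0-9()@:%_+.~#?&;/=]+)!i',
                        '<a href="$1">$1</a>', $source);
}

function simpleWikiDemo(string $text): string {
    // Turn [[Image:URL]] into a linked image.
    return preg_replace('/\[\[Image:(.*)\]\]/',
                        '<a href="$1"><img src="$1" /></a>', $text);
}

$text = 'this is my image [[Image:http://example.com/logo.png]]';

// Links first: the URL inside [[Image:...]] is already an <a> tag
// by the time the image rule runs, so the tag ends up inside src="".
$linksFirst = simpleWikiDemo(makeLinksDemo($text));

// Image first: the link rule then rewrites the URLs inside href="" and src="".
$imageFirst = makeLinksDemo(simpleWikiDemo($text));

echo $linksFirst, "\n", $imageFirst, "\n";
```

Swapping the call order does not help, because each pass sees the URL that the other pass has already handled (or is about to handle).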


2011-03-12 Denis Bobrovnikov

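Your immediate problem can be solved by combining the two expressions into one (with two alternatives), and then using the not-so-well-known but very powerful preg_replace_callback() function to walk through the target string, handling each case separately in a single pass, like so: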

<?php // test.php 20110312_1200 
$data = "[[Image:http://example.com/logo1.png]]\n". 
     "http://example1.com\n". 
     "[[Image:http://example.com/logo2.png]]\n". 
     "http://example2.com\n"; 

$re = '!# Capture WikiImage URLs in $1 and other URLs in $2. 
     # Either $1: WikiImage URL 
     \[\[Image:(.*?)\]\] 
    | # Or $2: Non-WikiImage URL. 
     (((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+) 
     !ixu'; 

$data = preg_replace_callback($re, '_my_callback', $data); 

// The callback function is called once for each 
// match found and is passed one parameter: $matches. 
function _my_callback($matches) 
{ // Either $1 or $2 matched, but never both. 
    if ($matches[1]) { // $1: WikiImage URL 
     return '<a href="'. $matches[1] . 
      '"><img src="'. $matches[1] .'" /></a>'; 
    } 
    else {    // $2: Non-WikiImage URL. 
     return '<a href="'. $matches[2] . 
      '">'. $matches[2] .'</a>'; 
    } 
} 
echo($data); 
?>

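This script implements your two regexes and does what you are asking. Note that I did change the greedy (.*) to the lazy (.*?) version, because the greedy version does not work correctly (it fails when there are multiple WikiImages). I also added the 'u' modifier to the regex (needed when the pattern contains Unicode characters). As you can see, the preg callback function is very powerful. (This technique can be used to do some pretty heavy lifting, text-processing wise.)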
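To illustrate the greedy versus lazy point, here is a minimal sketch. The input string and variable names are made up for the example; only the two patterns come from the thread.

```php
<?php
// Two wiki images on one line (illustrative input).
$input = '[[Image:a.png]] and [[Image:b.png]]';

// Greedy: (.*) runs to the LAST "]]" on the line, so both images
// collapse into a single match.
preg_match_all('/\[\[Image:(.*)\]\]/', $input, $greedy);

// Lazy: (.*?) stops at the FIRST "]]", giving one match per image.
preg_match_all('/\[\[Image:(.*?)\]\]/', $input, $lazy);

print_r($greedy[1]); // one element: "a.png]] and [[Image:b.png"
print_r($lazy[1]);   // two elements: "a.png" and "b.png"
```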

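However, note that the regex you are using to pick out URLs could be significantly improved. See the following resources for more on "linkifying" URLs (hint: there are a bunch of "gotchas"):
The Problem With URLs
An Improved Liberal, Accurate Regex Pattern for Matching URLs
URL Linkification (HTTP/FTP)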


2011-03-12 19:33:44 ridgerunner

Add [^:"]{1} to your MakeLinks, as shown below:

function MakeLinks($source){ 
    return preg_replace('![^:"]{1}(((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)!i', '<a href="$1">$1</a>', $source); 
}

Then only links that are not preceded by ":" (as in Image:) will be transformed. Then use $text = simpleWiki(MakeLinks($text));.

Edit: you can use this variation instead: preg_replace('![[:space:]](((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)[[:space:]]!i', '<a href="$1">$1</a>', $source);
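One caveat with the [^:"] prefix: it consumes the character before the URL, the replacement does not put that character back, and a URL at the very start of the string has no preceding character to match. A negative lookbehind checks the preceding character without consuming it. The following is only a sketch of that variation, not the answer's own code: `makeLinksLookbehind` and `simpleWikiLazy` are hypothetical names, and the Cyrillic range is left out of the character class for brevity.

```php
<?php
// Hypothetical variant of MakeLinks: (?<![:"]) asserts that the URL is
// not directly preceded by ':' or '"', without consuming that character.
function makeLinksLookbehind(string $source): string {
    $re = '!(?<![:"])((?:f|ht)tp://[-a-zA-Z0-9()@:%_+.~#?&;/=]+)!i';
    return preg_replace($re, '<a href="$1">$1</a>', $source);
}

// Hypothetical variant of simpleWiki using the lazy (.*?) capture.
function simpleWikiLazy(string $text): string {
    return preg_replace('/\[\[Image:(.*?)\]\]/',
                        '<a href="$1"><img src="$1" /></a>', $text);
}

$text = 'see http://example.com and [[Image:http://example.com/logo.png]]';
$html = simpleWikiLazy(makeLinksLookbehind($text));
echo $html, "\n";
```

Like the original suggestion, this skips any URL written directly after a colon or a double quote, so text such as `see:http://example.com` would not be linked.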


2011-03-12 13:47:00 Akarun

'{1}' is never needed: it just clutters the regex. – 2011-03-12 14:02:24
