2012-06-30 28 views
1

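preg_replace domain in URL in PHP

I can't find a good solution to this problem. The idea is to use preg_replace_callback() to change every URL in a text that contains a specific domain name into a base64-encoded form. The URLs look like this: http://www.domain.com/?fsdf76sf8sf6fds, and should be converted to this: http://www.otherdomain.com/file/CA60D10F8ACF7CAA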

Any ideas for the regular expression?

+1

What is your expected result? –

+0

@Ravinder Singh I've fixed the question; it's a preg_replace_callback() that converts the URL using the base64 function – Maxpower

+0

Is the output "CA60D10F8ACF7CAA" meant literally? `base64_encode('fsdf76sf8sf6fds')` returns "ZnNkZjc2c2Y4c2Y2ZmRz" –

Answers

1

You're looking for something along the lines of:

$s = preg_replace_callback('#(([a-z]+://)|([a-z]+://)?[a-z0-9.-]+\.|\b)domain\.com[^\s]+#i', function($match) {
    // Replace the whole matched URL with its base64 encoding
    return base64_encode($match[0]);
}, $string);

The regular expression can look a bit confusing, so let's break it down:

( -- domain.com must be preceded by either
    ([a-z]+://) -- a protocol such as http:// 
    | 
    ([a-z]+://)?[a-z0-9.-]+\. -- possibly a protocol and definitely a subdomain 
    | 
    \b -- word boundary (prevents otherdomain.com from matching!)
) 
domain\.com -- the actual domain you're looking for (escape the dot so it only matches a literal ".")
[^\s]+ -- everything up to the next space (to include path, query string, fragment) 
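The same breakdown can be kept inside the pattern itself by using PCRE's extended (`x`) flag. The following is only a sketch under a few assumptions: it targets PHP 7 or later, the helper name `encodeDomainUrls()` is made up for illustration, and the domain is passed through `preg_quote()` so its dots are matched literally.

```php
<?php
// Hypothetical helper (not part of the answer above): base64-encodes every
// URL in $text that points at $domain or one of its subdomains.
function encodeDomainUrls(string $text, string $domain): string
{
    // The x flag lets the pattern carry whitespace and # comments.
    // The comments must not contain the delimiter character.
    $pattern = '~
        (?:
            [a-z]+://                    # a protocol such as http://
          | (?:[a-z]+://)?[a-z0-9.-]+\.  # optional protocol plus a subdomain
          | \b                           # word boundary, so otherdomain.com is skipped
        )
        ' . preg_quote($domain, '~') . ' # the domain, with its dots escaped
        \S+                              # everything up to the next whitespace
    ~ix';

    $result = preg_replace_callback($pattern, function (array $match): string {
        return base64_encode($match[0]);
    }, $text);

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $text;
}

echo encodeDomainUrls('a http://www.domain.com/?fsdf76sf8sf6fds z', 'domain.com'), "\n";
```

Non-capturing groups `(?:...)` are used here because only `$match[0]` is needed; the behaviour on the sample URLs should be the same as with the one-line pattern.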

A very simple setup for testing something like this:

<?php 

$strings = array(
    // positives 
    'a http://www.domain.com/?fsdf76sf8sf6fds z' => 'a xxx z', 
    'a www.domain.com/?fsdf76sf8sf6fds z' => 'a xxx z', 
    'a http://domain.com/?fsdf76sf8sf6fds z' => 'a xxx z', 
    'a domain.com/?fsdf76sf8sf6fds z' => 'a xxx z', 
    // negatives 
    'a http://www.otherdomain.com/file/CA60D10F8ACF7CAA z' => null, 
    'a www.otherdomain.com/file/CA60D10F8ACF7CAA z' => null, 
    'a http://otherdomain.com/file/CA60D10F8ACF7CAA z' => null, 
    'a otherdomain.com/file/CA60D10F8ACF7CAA z' => null, 
); 

foreach ($strings as $string => $result) {
    $s = preg_replace_callback('#(([a-z]+://)|([a-z]+://)?[a-z0-9.-]+\.|\b)domain\.com[^\s]+#i', function($match) {
        return 'xxx';
    }, $string);

    // A null expectation means the input must come back unchanged
    if ($result === null) {
        $result = $string;
    }

    if ($s !== $result) {
        echo "FAILED: '$string' got '$s'\n";
    } else {
        echo "OK: '$string'\n";
    }
}

(You should already be unit testing; use that instead, obviously...)

1

This answer only works for URLs that start with "http://www.domain.com/" or "https://www.domain.com/", but it is more concise:

$in = 'before http://www.domain.com/?fsdf76sf8sf6fds after'; 

$domain = 'www.domain.com'; 
echo preg_replace_callback('/\b(https?:\/\/'.preg_quote($domain, '/').'\/)\?(\w+)/i', function($m) {
    return 'http://www.otherdomain.com/file/'.base64_encode($m[2]); 
}, $in); 
// outputs "before http://www.otherdomain.com/file/ZnNkZjc2c2Y4c2Y2ZmRz after" 

One issue that still needs to be resolved: the "CA60D10F8ACF7CAA" in your sample output is different from what PHP's base64_encode() returns:

echo base64_encode('fsdf76sf8sf6fds'); // outputs ZnNkZjc2c2Y4c2Y2ZmRz
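A related point if the encoded value is going into a URL path: for arbitrary input, `base64_encode()` can emit `+`, `/` and `=`, which are awkward inside a path. A common workaround is the URL-safe alphabet. The sketch below is an assumption about what might be wanted here, not something the question asks for, and the helper names `base64UrlEncode()` and `base64UrlDecode()` are hypothetical.

```php
<?php
// Hypothetical helpers, not part of either answer above.
function base64UrlEncode(string $data): string
{
    // Swap '+' and '/' for '-' and '_', then drop the '=' padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64UrlDecode(string $data): string
{
    // Restore the padding and the standard alphabet before decoding.
    $padded = str_pad($data, strlen($data) + (4 - strlen($data) % 4) % 4, '=');
    $decoded = base64_decode(strtr($padded, '-_', '+/'), true);

    // base64_decode() returns false on invalid input in strict mode.
    return $decoded === false ? '' : $decoded;
}

echo base64UrlEncode('fsdf76sf8sf6fds'), "\n"; // ZnNkZjc2c2Y4c2Y2ZmRz
echo base64UrlEncode("\xff\xff"), "\n";        // __8 (plain base64 would be //8=)
```

For the sample query string the result is the same as plain `base64_encode()`, because that particular output happens to contain none of the three problematic characters.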