2013-12-19 104 views

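Regex for quote tags in PHP

I need to parse text containing single-level quotes with PHP (multiple nested quotes count as a single one), similar to forum quote tags. For example: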

some nonquoted text1 
[quote="person1"]some quoted text11[/quote] 
some nonquoted text2 
[quote="person2"]some quoted text22[/quote] 
etc... with no newlines necessarily... 

The result should be an array like this:

Array
(
    ['nonquoted'] => Array
        (
            [0] => some nonquoted text1
            [1] => some nonquoted text2
        )
    ['quoted'] => Array
        (
            [0] => Array
                (
                    [0] => person1
                    [1] => some quoted text11
                )

            [1] => Array
                (
                    [0] => person2
                    [1] => some quoted text22
                )
        )
)

...and what have you tried?


Here is an example to get you started. It only matches the quoted content, and it certainly won't build the nice structure you're after: http://regex101.com/r/mN0yL1


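Indexed arrays should be used for uniform data; heterogeneous data should go in associative arrays. So the inner arrays in `$array['quoted']` should look like `Array('name' => 'person1', 'text' => 'some quoted text11')`. – Barmar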
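To make that suggestion concrete, here is a minimal sketch of collecting the quotes as associative arrays. The helper name `parseQuotes` and the sample input are made up for illustration, and the pattern assumes quotes are not nested.

```php
<?php
// Hypothetical helper (not from the thread): returns each quote as
// an associative array with 'name' and 'text' keys.
function parseQuotes(string $input): array
{
    $quoted = array();

    // Lazy (.*?) stops at the first [/quote]; the s flag lets . match newlines.
    preg_match_all('@\[quote="([^"]+)"\](.*?)\[/quote\]@s', $input, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $quoted[] = array('name' => $m[1], 'text' => $m[2]);
    }

    return $quoted;
}

print_r(parseQuotes('[quote="person1"]some quoted text11[/quote]'));
```

Named keys make it harder to mix up the author and the text when the array is consumed later.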

Answer

$input = <<<EOL
some nonquoted text1
[quote="person1"]some quoted text11[/quote]
some nonquoted text2
[quote="person2"]some quoted text22[/quote]
EOL;

$result = array('unquoted' => array(), 'quoted' => array());

// Find [quote] blocks, store author and text in $result['quoted'],
// and replace each block with a newline so the surrounding text stays separated.
// (.*?) is lazy and the s flag lets . match newlines, so a quote may span lines.
$unquoted = preg_replace_callback('@\[quote="([^"]+)"\](.*?)\[/quote\]@s', function ($m) use (&$result) {
    $result['quoted'][] = array($m[1], $m[2]);
    return "\n";
}, $input);

// What's left is only unquoted text, so split it into trimmed, non-empty lines
$result['unquoted'] = preg_split('@\s*[\r\n]+\s*@', trim($unquoted), -1, PREG_SPLIT_NO_EMPTY);

// your result
print_r($result);
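
If the input contains no newlines at all, a variant built on `preg_split` with `PREG_SPLIT_DELIM_CAPTURE` may be easier to reason about, since it does not depend on line breaks to separate the unquoted pieces. This is only a sketch: the function name `splitQuotes` is hypothetical, and nested `[quote]` tags are not handled.

```php
<?php
// Hypothetical helper: separates quoted and unquoted text without relying on newlines.
function splitQuotes(string $input): array
{
    $result = array('unquoted' => array(), 'quoted' => array());

    // With PREG_SPLIT_DELIM_CAPTURE the two capture groups (author, text) are
    // returned between the unquoted pieces: text, author, quote, text, ...
    $parts = preg_split(
        '@\[quote="([^"]+)"\](.*?)\[/quote\]@s',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    for ($i = 0, $n = count($parts); $i < $n; $i += 3) {
        // Every third element is unquoted text; skip it if it is only whitespace.
        $plain = trim($parts[$i]);
        if ($plain !== '') {
            $result['unquoted'][] = $plain;
        }
        // The next two elements, when present, are the author and the quoted text.
        if (isset($parts[$i + 2])) {
            $result['quoted'][] = array($parts[$i + 1], $parts[$i + 2]);
        }
    }

    return $result;
}

print_r(splitQuotes('text1 [quote="person1"]quoted11[/quote] text2 [quote="person2"]quoted22[/quote]'));
```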

That's it. Thanks, Sean.