2017-10-18 201 views
1

How do I limit a variable search to a single line of text?

Given this sample text:

grupo1, tiago1A, bola1A, mola1A, tijolo1A, pedro1B, bola1B, mola1B, tijolo1B, raimundo1C, bola1C, mola1C, tijolo1C, joao1D, bola1D, mola1D, tijolo1D, felipe1E, bola1E, mola1E, tijolo1E, 

grupo2, tiago2A, bola2A, mola2A, tijolo2A, pedro2B, bola2B, mola2B, tijolo2B, raimundo2C, bola2C, mola2C, tijolo2C, joao2D, bola2D, mola2D, tijolo2D, felipe2E, bola2E, mola2E, tijolo2E, 

grupo3, tiago3A, bola3A, mola3A, tijolo3A, pedro3B, bola3B, mola3B, tijolo3B, raimundo3C, bola3C, mola3C, tijolo3C, joao3D, bola3D, mola3D, tijolo3D, felipe3E, bola3E, mola3E, tijolo3E, 

grupo4, tiago4A, bola4A, mola4A, tijolo4A, pedro4B, bola4B, mola4B, tijolo4B, raimundo4C, bola4C, mola4C, tijolo4C, joao4D, bola4D, mola4D, tijolo4D, felipe4E, bola4E, mola4E, tijolo4E, 

grupo5, tiago5A, bola5A, mola5A, tijolo5A, pedro5B, bola5B, mola5B, tijolo5B, raimundo5C, bola5C, mola5C, tijolo5C, joao5D, bola5D, mola5D, tijolo5D, felipe5E, bola5E, mola5E, tijolo5E, 

I want to capture the 20 values that follow grupo3 and store them in groups of 4.

I am using this (Demo):

/grupo3,((.*?),(.*?),(.*?),(.*?)),/ 

But this only returns the first 4 comma-separated values after grupo3.

I need to generate this array structure:

Match 1 
Group 1 tiago3A 
Group 2 bola3A 
Group 3 mola3A 
Group 4 tijolo3A 

Match 2 
Group 1 pedro3B 
Group 2 bola3B 
Group 3 mola3B 
Group 4 tijolo3B 

Match 3 
Group 1 raimundo3C 
Group 2 bola3C 
Group 3 mola3C 
Group 4 tijolo3C 

Match 4 
Group 1 joao3D 
Group 2 bola3D 
Group 3 mola3D 
Group 4 tijolo3D 

Match 5 
Group 1 felipe3E 
Group 2 bola3E 
Group 3 mola3E 
Group 4 tijolo3E 
+0

Why is there a space before `*` in the regex in your question, but not in the demo? – Barmar

+1

When you tried it at regex101, did you put `grupo3` at the beginning of the regex? – Barmar

+0

@Makyen Okay, I understand the question now. If it doesn't need any further editing, I'll notify the answerers so they can update their work. – mickmackusa

Answers

1

You can try the following:

/,(.*?),(.*?),(.*?),(.*?),.*?$/m

The /m at the end is the multi-line flag, and the $ before it matches the end of a line. Demo
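To illustrate what the m flag changes, here is a minimal sketch. The two-line `$sample` string is a shortened, hypothetical stand-in for the question's data, not the real input. It counts how often a comma sits directly before `$`, with and without the flag.

```php
<?php
// Hypothetical two-line sample, shortened from the question's data.
$sample = "grupo1, a1, b1,\ngrupo2, a2, b2,";

// Without the m flag, $ only anchors at the end of the whole string.
$withoutM = preg_match_all('/,$/', $sample);

// With the m flag, $ also anchors before each newline.
$withM = preg_match_all('/,$/m', $sample);

echo "without m: $withoutM, with m: $withM\n";
```

With this sample the first call should find one match and the second two, which is why the flag matters as soon as the input has more than one line.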

Edit: To get the elements in sets of 4 from the third paragraph only:

/grupo3,((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)),/ 

Demo

You can then get the desired output in PHP like this:

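// Each set of four has one outer group (the whole set) and four inner groups (the values).
preg_match('/grupo3,((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)), ((.*?),(.*?),(.*?),(.*?)),/', $str, $matches);

$groups = [];
unset($matches[0]);                 // drop the full match
$matches = array_values($matches);  // reindex: 0, 5, 10, ... now hold the outer groups
$count = count($matches);
$j = 0;
for ($i = 1; $i < $count; $i++) {
    if ($i % 5 == 0) {
        // Skip the outer group and start the next sub-array.
        $j++;
        continue;
    }
    $groups[$j][] = $matches[$i];
}

var_dump($groups);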

The output will look like this:

array (size=5) 
    0 => 
    array (size=4) 
     0 => string ' tiago3A' (length=8) 
     1 => string ' bola3A' (length=7) 
     2 => string ' mola3A' (length=7) 
     3 => string ' tijolo3A' (length=9) 
    1 => 
    array (size=4) 
     0 => string 'pedro3B' (length=7) 
     1 => string ' bola3B' (length=7) 
     2 => string ' mola3B' (length=7) 
     3 => string ' tijolo3B' (length=9) 
    2 => 
    array (size=4) 
     0 => string 'raimundo3C' (length=10) 
     1 => string ' bola3C' (length=7) 
     2 => string ' mola3C' (length=7) 
     3 => string ' tijolo3C' (length=9) 
    3 => 
    array (size=4) 
     0 => string 'joao3D' (length=6) 
     1 => string ' bola3D' (length=7) 
     2 => string ' mola3D' (length=7) 
     3 => string ' tijolo3D' (length=9) 
    4 => 
    array (size=4) 
     0 => string 'felipe3E' (length=8) 
     1 => string ' bola3E' (length=7) 
     2 => string ' mola3E' (length=7) 
     3 => string ' tijolo3E' (length=9) 
+0

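Look at what I posted. I don't want the first four elements of each paragraph. I want the elements of paragraph 3 only, split into 5 groups of four. –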

+0

https://regex101.com/r/Imoozr/3 –

+0

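@mega6382 Basically, yes. But how do I tell PHP that the first element in one set of parentheses belongs to the same group as the first element in the next set, ((?P .*?),..)((?P )...)? –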

0

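Please excuse the lateness of this answer. This is the comprehensive answer, with a clean and direct solution, that I would have posted earlier had this page not been put on hold. It is as refined as I can make it, since I don't know how the input data is generated or accessed.

Input: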

$text='grupo1, tiago1A, bola1A, mola1A, tijolo1A, pedro1B, bola1B, mola1B, tijolo1B, raimundo1C, bola1C, mola1C, tijolo1C, joao1D, bola1D, mola1D, tijolo1D, felipe1E, bola1E, mola1E, tijolo1E, 

grupo2, tiago2A, bola2A, mola2A, tijolo2A, pedro2B, bola2B, mola2B, tijolo2B, raimundo2C, bola2C, mola2C, tijolo2C, joao2D, bola2D, mola2D, tijolo2D, felipe2E, bola2E, mola2E, tijolo2E, 

grupo3, tiago3A, bola3A, mola3A, tijolo3A, pedro3B, bola3B, mola3B, tijolo3B, raimundo3C, bola3C, mola3C, tijolo3C, joao3D, bola3D, mola3D, tijolo3D, felipe3E, bola3E, mola3E, tijolo3E, 

grupo4, tiago4A, bola4A, mola4A, tijolo4A, pedro4B, bola4B, mola4B, tijolo4B, raimundo4C, bola4C, mola4C, tijolo4C, joao4D, bola4D, mola4D, tijolo4D, felipe4E, bola4E, mola4E, tijolo4E, 

grupo5, tiago5A, bola5A, mola5A, tijolo5A, pedro5B, bola5B, mola5B, tijolo5B, raimundo5C, bola5C, mola5C, tijolo5C, joao5D, bola5D, mola5D, tijolo5D, felipe5E, bola5E, mola5E, tijolo5E,'; 

Method (PHP Demo):

var_export(preg_match('/^grupo3, \K.*(?=,)/m',$text,$out)?array_chunk(explode(', ',$out[0]),4):'fail'); 

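This uses preg_match() to extract the single line, then explode() to split the string on "comma space", then array_chunk() to store the values in 5 sub-arrays of 4 elements each.

The pattern targets the line that starts with "grupo3, ", then uses \K to restart the full-string match, then greedily matches every non-newline character and stops just before the last comma on the line. The positive lookahead (?=,) keeps the final comma out of the full-string match.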
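To make the effect of \K and the lookahead concrete, here is a small sketch that runs the pattern with and without them. The `$line` string is a shortened, hypothetical version of the grupo3 line, and the names `$plain` and `$trimmed` are only for illustration.

```php
<?php
// Hypothetical shortened line in the same shape as the question's data.
$line = "grupo3, tiago3A, bola3A, mola3A, tijolo3A, ";

// No \K and no lookahead: the label and the final comma are part of the match.
preg_match('/^grupo3, .*,/m', $line, $plain);

// \K discards "grupo3, " from the reported match; (?=,) requires the
// last comma to be there without consuming it.
preg_match('/^grupo3, \K.*(?=,)/m', $line, $trimmed);

var_export([$plain[0], $trimmed[0]]);
```

The first result still carries the "grupo3, " prefix and the trailing comma. The second holds only the comma-separated values, ready to be passed to explode().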

Pattern Demo

My method does not keep any leading or trailing whitespace, only the values themselves.

Output:

array (
    0 => 
    array (
    0 => 'tiago3A', 
    1 => 'bola3A', 
    2 => 'mola3A', 
    3 => 'tijolo3A', 
), 
    1 => 
    array (
    0 => 'pedro3B', 
    1 => 'bola3B', 
    2 => 'mola3B', 
    3 => 'tijolo3B', 
), 
    2 => 
    array (
    0 => 'raimundo3C', 
    1 => 'bola3C', 
    2 => 'mola3C', 
    3 => 'tijolo3C', 
), 
    3 => 
    array (
    0 => 'joao3D', 
    1 => 'bola3D', 
    2 => 'mola3D', 
    3 => 'tijolo3D', 
), 
    4 => 
    array (
    0 => 'felipe3E', 
    1 => 'bola3E', 
    2 => 'mola3E', 
    3 => 'tijolo3E', 
), 
) 
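For comparison, the Match/Group layout shown in the question can also be produced by the regex itself. What follows is only a sketch of a different technique from the explode()/array_chunk() method above: a \G continuation anchor combined with preg_match_all() and PREG_SET_ORDER. The shortened `$text` is hypothetical, and the pattern assumes the values are separated by exactly "comma space" and that lines end in "\n".

```php
<?php
// Hypothetical shortened sample: two lines, two sets of four values each.
$text = "grupo2, a2A, b2A, c2A, d2A, a2B, b2B, c2B, d2B, \n\n"
      . "grupo3, a3A, b3A, c3A, d3A, a3B, b3B, c3B, d3B, ";

// ^grupo3 starts the first match; \G(?!^) continues from the end of the
// previous match, but never from the start of a line.
$pattern = '/(?:^grupo3|\G(?!^)), ([^,\n]+), ([^,\n]+), ([^,\n]+), ([^,\n]+)/m';

preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

// Drop element 0 (the full match) so each row holds only Groups 1 to 4.
$groups = array_map(function ($m) {
    return array_slice($m, 1);
}, $matches);

var_export($groups);
```

Each entry of `$groups` corresponds to one "Match" and its four elements to "Group 1" through "Group 4". The trade-off is a denser pattern in exchange for skipping the explode() and array_chunk() steps.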

P.S. If the search term ($needle) is dynamic, you can use something like this to achieve the same result (PHP Demo):

$needle='grupo3'; 
// if the needle may include any regex-sensitive characters, use preg_quote($needle,'/') at $needle 
var_export(preg_match('/^'.$needle.', \K.*(?=,)/m',$text,$out)?array_chunk(explode(', ',$out[0]),4):'fail'); 

/* or this is equivalent... 
    if(preg_match('/^'.$needle.', \K.*(?=,)/m',$text,$out)){ 
     $singles=explode(', ',$out[0]); 
     $groups=array_chunk($singles,4); 
     var_export($groups); 
    }else{ 
     echo 'fail'; 
    } 
*/ 
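The note about preg_quote() in the snippet above is easy to overlook, so here is a minimal sketch of what can go wrong without it. The data and the needle 'grupo.3' are hypothetical and chosen so that the needle contains a regex metacharacter.

```php
<?php
// Hypothetical data: the second label contains a dot, a regex metacharacter.
$text = "grupoX3, wrong1, wrong2,\ngrupo.3, right1, right2,";
$needle = 'grupo.3';

// Unescaped, the dot matches any character, so the first line matches too.
preg_match('/^' . $needle . ', \K.*(?=,)/m', $text, $loose);

// preg_quote() escapes the dot (and the "/" delimiter passed as second argument).
preg_match('/^' . preg_quote($needle, '/') . ', \K.*(?=,)/m', $text, $strict);

var_export([$loose[0], $strict[0]]);
```

The unescaped pattern picks up the values from the "grupoX3" line, while the quoted one only accepts the literal label.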
+0

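@AntonioOliveira Sorry, I had to hijack your question content to keep it from being locked. Now that it has been reopened, I have provided some refined solutions for your question. If you have any questions about these techniques, please ask. – mickmackusa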