Split text at needles

I want to split a string like this into smaller parts:

'This <p>is</p> a <p>string</p>'

I want 4 strings:

This
<p>is</p>
a
<p>string</p>

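So I want to find each <p></p> and its content, one after another, and split the string there. How can I keep the pieces in the same order?

I can get 'This' with this code: $html1 = strstr($html, '<p', true); but I don't know how to continue splitting, or how to do this for a variable string with several needles (at least 2 different needles). Can you help me?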

Asked 2017-07-29 by Kristen Joseph-Delaffon

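Does it only need to work for 'p' tags? – revo

If you only want your 'p' tags, you can use a regex to capture '<p></p>' – sheplu

… to split your string. I suggest you come up with some rules that can be implemented with a regex using capture groups. – Juan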

You can use preg_split with some options ($s being the input string):

preg_split("#\s*(<p>.*?</p>)\s*#", $s, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

This returns an array. For your sample input it returns:

["This", "<p>is</p>", "a", "<p>string</p>"]

See it run on repl.it
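The question also asks about at least two different needles. A minimal sketch of how the same preg_split call could be generalized, assuming the needles are simple tag names without attributes; the helper name split_by_tags and the second tag <b> are hypothetical and only serve as illustration:

```php
<?php
// Hypothetical helper (not from the original answer): split $s at every
// <tag>...</tag> block for the given tag names, keeping the blocks
// themselves and the original order of all pieces.
function split_by_tags(string $s, array $tags): array
{
    // One alternative per tag, e.g. "<p>.*?</p>|<b>.*?</b>".
    // preg_quote() escapes regex metacharacters and the "#" delimiter.
    $alternatives = array_map(
        fn($t) => '<' . preg_quote($t, '#') . '>.*?</' . preg_quote($t, '#') . '>',
        $tags
    );

    // A single capture group, so PREG_SPLIT_DELIM_CAPTURE returns each
    // matched block exactly once.
    $pattern = '#\s*(' . implode('|', $alternatives) . ')\s*#';

    return preg_split($pattern, $s, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
}

print_r(split_by_tags('This <p>is</p> a <b>string</b>', ['p', 'b']));
// ['This', '<p>is</p>', 'a', '<b>string</b>']
```

PREG_SPLIT_DELIM_CAPTURE is what keeps the needles in the result, and PREG_SPLIT_NO_EMPTY drops the empty string that would otherwise follow a needle at the very end of the input. Tags with attributes or nested tags would need a different pattern or a real HTML parser.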

Answered 2017-07-29 17:31:12 by trincot

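This is a nice solution. I didn't know 'preg_split' was so powerful. Note that because '#' is used as the regex delimiter, there is no need to escape '/'. – BeetleJuice

Thanks @BeetleJuice. Removed the escaping. – trincot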
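To illustrate the point about delimiters from the comments, here is a small sketch comparing the two spellings of the pattern. The variable names are arbitrary; the input is the sample string from the question:

```php
<?php
$s = 'This <p>is</p> a <p>string</p>';
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

// With "/" as the delimiter, the "/" inside </p> has to be escaped.
$withSlash = preg_split('/\s*(<p>.*?<\/p>)\s*/', $s, 0, $flags);

// With "#" as the delimiter, "/" is an ordinary character.
$withHash = preg_split('#\s*(<p>.*?</p>)\s*#', $s, 0, $flags);

var_dump($withSlash === $withHash); // bool(true)
```

Both calls describe the same regular expression; only the delimiter, and therefore the escaping, differs.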

Because your needles are complex, you can use preg_match_all:

$html = 'This <p>is</p> a <p>string</p>'; 

// Regex to group by paragraph and non-paragraph 
$pattern = '/(.*?)(<p>.+?<\/p>)/'; 

// Parse HTML using the pattern and put result in $matches 
preg_match_all($pattern,$html,$matches, PREG_SET_ORDER); 

// Will contain the final pieces 
$pieces = []; 

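// For each $match array, the 0th member is the full match;
// every other member is one of the pieces we want.
// array_slice() skips the full match. Empty captures (for example when
// the string starts with <p>) are skipped instead of ending the loop early.
foreach ($matches as $m) {
    foreach (array_slice($m, 1) as $piece) {
        $piece = trim($piece);
        if ($piece !== '') $pieces[] = $piece;
    }
}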

print_r($pieces);// ['This', '<p>is</p>', 'a', '<p>string</p>']

Live demo
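One thing to be aware of with the pattern above: every match must end in a <p>…</p> block, so any text after the last </p> is not captured. A possible variation, offered only as a sketch and not as part of the original answer, matches either a paragraph block or a run of text that contains no <p>. The function name split_keep_tail is hypothetical:

```php
<?php
// Hypothetical variation: return paragraph blocks and the text between
// them, including any text that follows the last </p>.
function split_keep_tail(string $html): array
{
    // Alternative 1: a whole <p>...</p> block.
    // Alternative 2: one or more characters, none of which starts a "<p>".
    $pattern = '/<p>.*?<\/p>|(?:(?!<p>).)+/s';

    preg_match_all($pattern, $html, $m);

    // $m[0] holds the full matches in document order; trim each one and
    // drop pieces that were whitespace only.
    $pieces = array_filter(array_map('trim', $m[0]), fn($p) => $p !== '');

    return array_values($pieces);
}

print_r(split_keep_tail('This <p>is</p> a <p>string</p> too'));
// ['This', '<p>is</p>', 'a', '<p>string</p>', 'too']
```

For the sample string from the question, this returns the same four pieces as the code above.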

Answered 2017-07-29 17:01:53 by BeetleJuice
