PHP regex help: parsing a string

2010-06-30

I have a string like the following:

Are you looking for a quality real estate company? 

<s>Josh's real estate firm specializes in helping people find homes from   
[city][State].</s> 

<s>Josh's real estate company is a boutique real estate firm serving clients 
locally.</s> 

In [city][state] I am sure you know how difficult it is 
to find a great home, but we work closely with you to give you exactly 
what you need 

I would like to split this paragraph into an array based on the <s> </s> tags, so that I get the following array as a result:

[0] Are you looking for a quality real estate company? 
[1] Josh's real estate firm 
    specializes in helping people find homes from [city][State]. 
[2] Josh's real estate company is a boutique real estate firm serving clients 
    locally. 
[3] In [city][state] I am sure you know how difficult it is 
    to find a great home, but we work closely with you to give you exactly 
    what you need 

This is the regular expression I am currently using:

$matches = array(); 
preg_match_all(":<s>(.*?)</s>:is", $string, $matches); 
$result = $matches[1]; 
print_r($result); 

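But this only returns an array containing the text between the <s> </s> tags; it ignores the text found before and after those tags. (In the example above, it would return only array elements 1 and 2.)

Any ideas?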

Answer

The closest I can get is by using preg_split() instead:

$string = <<< STR 
Are you looking for a quality real estate company? <s>Josh's real estate firm 
specializes in helping people find homes from [city][State].</s> 
<s>Josh's real estate company is a boutique real estate firm serving clients 
locally.</s> In [city][state] I am sure you know how difficult it is 
to find a great home, but we work closely with you to give you exactly 
what you need 
STR; 

print_r(preg_split(':</?s>:is', $string)); 

This gives the following output:

Array 
(
    [0] => Are you looking for a quality real estate company? 
    [1] => Josh's real estate firm 
specializes in helping people find homes from [city][State]. 
    [2] => 

    [3] => Josh's real estate company is a boutique real estate firm serving clients 
locally. 
    [4] => In [city][state] I am sure you know how difficult it is 
to find a great home, but we work closely with you to give you exactly 
what you need 
) 

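This is close, except that it produces an extra array element (index 2) for the newline between the fragments [city][State].</s> and <s>Josh's real estate company.

While adding some code to remove whitespace-only matches would be trivial, I'm not sure whether you need that.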
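For reference, one way to drop those whitespace-only pieces is to let the split pattern swallow the whitespace around each tag and pass PREG_SPLIT_NO_EMPTY. This is only a sketch: the sample string below is a shortened, hypothetical stand-in for the paragraph above, and it still splits on any lone <s> or </s>.

```php
<?php
// Hypothetical sample input: two <s>...</s> blocks separated only by a newline.
$string = "Intro text. <s>First block.</s>\n<s>Second block.</s> Closing text.";

// Split on either tag and consume the whitespace around it.
// PREG_SPLIT_NO_EMPTY then discards the zero-length piece left between
// "</s>\n" and the following "<s>".
$parts = preg_split(':\s*</?s>\s*:i', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

Because the surrounding whitespace is part of the delimiter, each remaining element also comes back without leading or trailing spaces next to a tag.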

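The extra array element is fine, but it seems to be looking for just '', which means that something like 'My name is Bob. im 17.' and 'My name is Bob. im 17' would be split into 2 elements. Can it be changed so that the first example is kept in just 1 array element? (I want an unopened '' not to match.) – 2010-06-30 06:07:46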

Also, if the empty array elements can be removed, I would prefer that. – 2010-06-30 06:24:10

I'll fiddle with my code a bit and update my answer if I can get it to match only properly opened and closed tags and remove the empty elements. – BoltClock 2010-06-30 06:27:22
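A possible way to meet both requests in the comments, offered as a sketch rather than the answerer's eventual solution: treat a whole <s>...</s> pair as the delimiter and keep its inner text with PREG_SPLIT_DELIM_CAPTURE, so a stray tag with no partner is left in place. The input string is a hypothetical example, and the pattern assumes the tags are never nested.

```php
<?php
// Hypothetical input: a stray closing tag that should NOT cause a split.
$string = "My name is Bob. </s> I'm 17. <s>Tagged sentence.</s> Trailing text.";

// The delimiter is a complete <s>...</s> pair. The inner group refuses to
// cross another <s> or </s>, so an unmatched tag cannot pair up by accident.
$parts = preg_split(
    ':<s>((?:(?!</?s>).)*)</s>:is',
    $string,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// Trim every piece, drop the ones that were whitespace only, and reindex.
$parts = array_values(array_filter(array_map('trim', $parts), 'strlen'));

print_r($parts);
```

With this input the unmatched </s> stays inside the first element instead of starting a new one.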