How do I extract an image's src URL, omitting the "?" and the query string?

I use the following regular expression to get the URLs from img tags:

$output = preg_match_all('/<img.+src=[\'"]([^\'"]+)[\'"].*>/i', $post_single->post_content, $matches);

However, $matches gives me the following results:

http://example.com/wp-content/uploads/2013/11/dsc_842.jpg

-> This is fine.

http://example.com/wp-content/uploads/2013/11/dsc_0546.jpg?w=640

-> This is not.

How can I change the regex so that the results do not include the ?w=640 part?

Any help is greatly appreciated.

Thanks!

Source

2016-02-05 Torben

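Just my two cents: since you are using WordPress (and therefore PHP), you can use the function [`parse_url()`](http://php.net/manual/en/function.parse-url.php), which splits a URL into its components so the query string can simply be left out. – Jan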

This is easy to do like this:

$output = preg_match_all('/<img.+src=[\'"]([^\'"?]+)[\'"?].*>/i', $post_single->post_content, $matches);

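This way, ([^\'"?]+) matches everything up to the first quote or question mark, and [\'"?] then requires one of those characters to follow.

Example: https://regex101.com/r/yJ1yA1/1

Source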
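As a quick check, here is a minimal sketch that runs this pattern against the two URLs from the question. The $content variable is a stand-in for $post_single->post_content, and the sample markup puts each img tag on its own line; since .+ is greedy, several tags on a single line would likely need a lazy quantifier instead.

```php
<?php
// Stand-in for $post_single->post_content (assumption: one <img> per line).
$content = '<img src="http://example.com/wp-content/uploads/2013/11/dsc_842.jpg">' . "\n"
         . '<img src="http://example.com/wp-content/uploads/2013/11/dsc_0546.jpg?w=640">';

// The capture group stops at the first quote or question mark.
preg_match_all('/<img.+src=[\'"]([^\'"?]+)[\'"?].*>/i', $content, $matches);

// $matches[1] holds the captured URLs without any query string.
print_r($matches[1]);
```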

2016-02-05 08:32:56 B8vrede

I would like to suggest another approach (using XPath and parse_url()):

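// Note: simplexml_load_string() expects well-formed XML and returns false otherwise.
$xml = simplexml_load_string($your_html_here);
$images = $xml->xpath("//img/@src");
foreach ($images as $image) {
    // Cast the SimpleXMLElement attribute to a string before parsing it.
    $parsed = parse_url((string) $image);
    print_r($parsed);
}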
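Post content is often not well-formed XML, so a variant of the same idea based on DOMDocument may be more forgiving. The following is only a sketch: strip_query() is a hypothetical helper (not a PHP or WordPress function) that rebuilds the URL from the parse_url() parts while leaving out the query string and fragment, and $html is sample markup. Credentials embedded in a URL (user:pass@) are ignored here.

```php
<?php
// Hypothetical helper: rebuild a URL from its parse_url() parts,
// leaving out the query string and the fragment.
function strip_query(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return $url; // seriously malformed URL: return it unchanged
    }
    $result = '';
    if (isset($parts['scheme'])) {
        $result .= $parts['scheme'] . '://';
    } elseif (isset($parts['host'])) {
        $result .= '//'; // protocol-relative URL
    }
    if (isset($parts['host'])) {
        $result .= $parts['host'];
    }
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    if (isset($parts['path'])) {
        $result .= $parts['path'];
    }
    return $result;
}

// Sample markup (assumption); note the unclosed <br>, which is not valid XML.
$html = '<p><img alt="a" src="http://example.com/a.jpg?w=640"><br>'
      . '<img src="http://example.com/b.jpg"></p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of emitting them
$doc->loadHTML($html);
libxml_clear_errors();

$urls = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    if ($src !== '') {
        $urls[] = strip_query($src);
    }
}
print_r($urls);
```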

Source

2016-02-05 08:59:57 Jan

You can also use a regular expression:

$string = '<img src="path/to/image/file.jpg">';
// The lazy (.*?) stops at the first closing quote, so later attributes are not swallowed.
preg_match('/(?:\<img[\s].*?src=)(?:\"|\')(.*?)(?:\'|\")/', $string, $matches);

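$matches[1] will give you exactly the src attribute of the img tag, no matter how many attributes the img tag has.

Logically, another option would be to explode (PHP) the tag on '=' and then look for the src attribute, which might be the better choice.

Source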

2016-02-05 10:41:36

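The other regex solutions needlessly match the whole line or use suboptimal pattern syntax. Here is the leanest, most efficient regex you are going to find: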

<img.*?src=['"]\K[^\'"?]+

(Pattern Demo Link)

It also uses no capture groups, so the output array of preg_match_all() will be 50% smaller/leaner.

Code (Demo):
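To make the size difference concrete, here is a small sketch comparing a capture-group version of the pattern with the \K version on one sample tag. The variable names and the sample markup are illustrative assumptions.

```php
<?php
$html = '<img src="http://example.com/a.jpg?w=640">';

// With a capture group: $withGroup[0] holds the full matches,
// $withGroup[1] holds the captured URLs.
preg_match_all('/<img.*?src=[\'"]([^\'"?]+)/i', $html, $withGroup);

// With \K the start of the reported match is reset,
// so $withK[0] already contains just the URLs.
preg_match_all('/<img.*?src=[\'"]\K[^\'"?]+/i', $html, $withK);

var_export(count($withGroup)); // number of sub-arrays with a capture group
echo "\n";
var_export(count($withK));     // number of sub-arrays with \K
echo "\n";
```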

$wp_post_content='<img src="http://example.com/wp-content/uploads/2013/11/dsc_0546.jpg?w=640"> 
<img src="http://example.com/wp-content/uploads/2013/12/dsc_0547.jpg?w=1080">'; 

var_export(preg_match_all('/<img.*?src=[\'"]\K[^\'"?]+/i',$wp_post_content,$out)?$out[0]:[]);

Output:

array (
    0 => 'http://example.com/wp-content/uploads/2013/11/dsc_0546.jpg', 
    1 => 'http://example.com/wp-content/uploads/2013/12/dsc_0547.jpg', 
)

Source

2017-05-31 08:37:56 mickmackusa
