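Get an image from an article using PHP

I'm editing a plugin that I use to add Open Graph meta tags to the page header. The problem is that it only lets me choose a single image for the whole site. Here is what I did: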

preg_match_all('/<img .*?(?=src)src=\"([^\"]+)\"/si', $hdog_base, $image); 

if (strlen($hdog_base) <= 25) 
{ 
    if (substr($image[0], 0, 4) != 'http') 
    { 
     $image[0] = JURI::base().$image[0]; 
    } 
    $hdog_image_tmp = $image[0]; 
} 
else 
{ 
    if (substr($image[1], 0, 4) != 'http') 
    { 
     $image[1] = JURI::base().$image[1]; 
    } 
    $hdog_image_tmp = $image[1]; 
} 
$hdog_image = '<meta property="og:image" content="'.$hdog_image_tmp.'" /> 
';

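$hdog_base is the page I'm currently on. The first branch of the if statement should use the first image, which is the icon (for the front page); the else branch should use the second image, which is different on every page. But the result is always the following, whether I'm on the home page or anywhere else on the site: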

<meta property="og:image" content="http://mysite.com/Array" />

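Any suggestions?

Thanks in advance,

Update: The biggest mistake I was making is that I was looking for the image URL in the link, not in the actual page. $hdog_base is nothing but a link. So how do I go about getting the content of the current page instead?

Update, solved:

I used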

$buffer = JResponse::getBody();

to get the HTML of the page, and then DOM for the rest:

$doc = new DOMDocument(); 
@$doc->loadHTML($buffer); 

$images = $doc->getElementsByTagName('img'); 
if (strlen($hdog_base) <= 26) 
{ 
    $image = $images->item(0)->getAttribute('src'); 
} 
else 
{ 
    $image = $images->item(1)->getAttribute('src'); 
} 
if (substr($image, 0, 4) != 'http') $image = JURI::base().$image; 
$hdog_image = '<meta property="og:image" content="'.$image.'" /> 
';
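The snippet above assumes the page always contains at least two images; item() returns null otherwise and the getAttribute() call then fails. A minimal sketch of a more defensive variant follows. The function name nthImageSrc and its parameters are hypothetical, and $baseUrl stands in for JURI::base() so the example runs outside Joomla:

```php
<?php
/**
 * Return the src of the image at 0-based position $index in $html as an
 * absolute URL, or null when there is no such image.
 * Assumption: $baseUrl ends with a slash, as a site base URL usually does.
 * Protocol-relative URLs (//host/path) are not handled in this sketch.
 */
function nthImageSrc($html, $index, $baseUrl)
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect libxml warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    // item() returns null when the index is out of range.
    $img = $doc->getElementsByTagName('img')->item($index);
    if ($img === null) {
        return null;
    }

    $src = $img->getAttribute('src'); // empty string when the attribute is missing
    if ($src === '') {
        return null;
    }

    // Prefix relative paths with the base URL; leave absolute URLs untouched.
    if (!preg_match('#^https?://#i', $src)) {
        $src = $baseUrl . ltrim($src, '/');
    }
    return $src;
}
```

A caller could then emit the tag only when an image was found, escaping the URL with htmlspecialchars($src, ENT_QUOTES) before placing it in the content attribute.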

Thanks a lot, cpilko, for your help! :)

Source

2012-10-17 indiqa

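preg_match_all fills its matches argument with a multidimensional array: one sub-array for the full matches and one for each subpattern in the regular expression. In your code, $image[n] is therefore an array. If you cast an array to a string in PHP, you get the text Array.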
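To illustrate the shape of that array, here is a small sketch. The sample HTML and the slightly simplified pattern are made up for the example and are not the asker's actual data:

```php
<?php
// Hypothetical sample markup; any HTML containing <img> tags would do.
$html = '<p><img class="icon" src="images/icon.png">'
      . '<img src="http://example.com/photo.jpg"></p>';

// With the default PREG_PATTERN_ORDER flag, $matches[0] holds every full
// match and $matches[1] holds every capture of the first group.
preg_match_all('/<img\b[^>]*?\bsrc="([^"]+)"/si', $html, $matches);

// $matches[1] is itself an array, so index into it to get one URL string.
$first  = isset($matches[1][0]) ? $matches[1][0] : null;
$second = isset($matches[1][1]) ? $matches[1][1] : null;
```

Here $first and $second are plain strings, so concatenating them into a meta tag gives a URL rather than the word Array.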

Edit: Using a regular expression to parse HTML is not ideal. You're better off using DOMDocument:

$doc = new DOMDocument(); 
// $hdog_base must hold the page's HTML; @ hides warnings from malformed markup
@$doc->loadHTML($hdog_base); 

$images = $doc->getElementsByTagName('img'); 
if (strlen($hdog_base) <= 25) { 
    $image = $images->item(0)->getAttribute('src'); 
} else { 
    $image = $images->item(1)->getAttribute('src'); 
} 
// Prefix relative paths with the site base URL
if (substr($image, 0, 4) != 'http') $image = JURI::base().$image; 
$hdog_image = '<meta property="og:image" content="'.$image.'" /> 
';

Source

2012-10-17 17:52:58 cpilko

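The result is this: <!-- Array ( [0] => Array ( ) [1] => Array ( ) ) --> – indiqa

Your regex isn't matching anything. You can troubleshoot it in an online regex tester like this one: http://www.regextester.com/ – cpilko

After doing some more research: a regular expression is the wrong tool for this job. You should use 'DOMDocument'. See the second and third answers to this SO question for details: http://stackoverflow.com/questions/138313/how-to-extract-img-src-title-and-alt-from-html-using-php – cpilko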
