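Extract all images from HTML using Java

I want to get a list of the URLs of all images (both absolute and relative) from the HTML source of a web page. I am parsing the HTML with Jsoup, but it does not return all of the images. For example, when I parse the google.com HTML source, it reports zero images. In the google.com HTML source, the image links are in this form: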
"background:url(/intl/en_com/images/srpr/logo1w.png)
whereas in the rediff.com source, the image links are in this form:
videoArr[j]=new Array("http://ishare.rediff.com/video/entertainment/bappi-da-the-first-indian-in-grammy-jury/2684982","http://datastore.rediff.com/h86-w116/thumb/5E5669666658606D6A6B6272/v3np2zgbla4vdccf.D.0.bappi.jpg","Bappi Da - the first Indian In Grammy jury","http://mypage.rediff.com/profile/getprofile/LehrenTV/12669275","LehrenTV","(2:33)");
j = 1
videoArr[j]=new Array("http://ishare.rediff.com/video/entertainment/bebo-shahid-jab-they-met-again-/2681664","http://datastore.rediff.com/h86-w116/thumb/5E5669666658606D6A6B6272/ra8p9eeig8zy5qvd.D.0.They-Met-Again.jpg","Bebo-Shahid : Jab they met again!","http://mypage.rediff.com/profile/getprofile/LehrenTV/12669275","LehrenTV","(2:17)");
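None of these images are in "img" tags. I also want to extract images like the ones in the HTML source above, even when they are not part of an "img" tag.
How can I do this? Please help me with this. Thanks.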
Why Java? Have you considered developing a browser plugin? – fglez 2011-02-04 16:11:33
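
One possible approach, offered as a rough sketch rather than a definitive answer: an HTML parser only sees elements such as "img", so image references that live inside CSS (background:url(...)) or inside JavaScript string literals have to be found by scanning the raw source text. The example below uses only the Java standard library. The class name ImageUrlExtractor, the method extract, and the list of image extensions are assumptions made for illustration; they do not come from Jsoup or from the question. Note that url(...) can also point at non-image resources such as fonts, so the result may need filtering.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans raw HTML text for image references outside <img> tags.
public class ImageUrlExtractor {

    // CSS references such as background:url(/path/logo.png), with optional quotes.
    private static final Pattern CSS_URL = Pattern.compile(
            "url\\(\\s*['\"]?([^'\")\\s]+)['\"]?\\s*\\)",
            Pattern.CASE_INSENSITIVE);

    // Quoted strings ending in a common image extension, e.g. inside JavaScript arrays.
    // The extension list is an assumption; extend it as needed.
    private static final Pattern QUOTED_IMAGE = Pattern.compile(
            "['\"]([^'\"\\s<>]+\\.(?:png|jpe?g|gif|bmp|webp|svg)(?:\\?[^'\"\\s<>]*)?)['\"]",
            Pattern.CASE_INSENSITIVE);

    // Returns absolute URLs in order of discovery, without duplicates.
    public static List<String> extract(String html, String baseUrl) {
        Set<String> found = new LinkedHashSet<>();
        URI base = URI.create(baseUrl);
        collect(CSS_URL, html, base, found);
        collect(QUOTED_IMAGE, html, base, found);
        return new ArrayList<>(found);
    }

    private static void collect(Pattern pattern, String html, URI base, Set<String> out) {
        Matcher m = pattern.matcher(html);
        while (m.find()) {
            try {
                // resolve() turns relative references into absolute URLs
                // and leaves already-absolute URLs unchanged.
                out.add(base.resolve(m.group(1)).toString());
            } catch (IllegalArgumentException e) {
                // Skip matches that are not valid URIs.
            }
        }
    }

    public static void main(String[] args) {
        String html = "<div style=\"background:url(/intl/en_com/images/srpr/logo1w.png)\"></div>";
        for (String url : extract(html, "http://www.google.com/")) {
            System.out.println(url);
        }
    }
}
```

Quoted "img" src attributes are also picked up by the second pattern, so this could be run alongside a Jsoup pass over the "img" elements and the two result sets merged. Images whose URLs are assembled at runtime by JavaScript will still be missed, since nothing here executes scripts.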