2012-06-10 35 views
0

Extract the img src link from a string in Ruby

I have this string:

#<Fletcher::Model::Amazon alt="You Are Not a Gadget: A Manifesto (Vintage)" border="0" element="img" height="240" id="prodImage" onload="if (typeof uet == 'function') { if(typeof setCSMReq=='function'){setCSMReq('af');setCSMReq('cf');}else{uet('af');uet('cf');amznJQ.completedStage('amznJQ.AboveTheFold');} }" onmouseout="sitb_doHide('bookpopover'); return false;" onmouseover="sitb_showLayer('bookpopover'); return false;" src="http://ecx.images-amazon.com/images/I/51bpl1wA%2BaL._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU01_.jpg" width="240"> 

I just want the link in the src attribute:

http://ecx.images-amazon.com/images/I/51bpl1wA%2BaL._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU01_.jpg

How can I parse this string to get the link?

Here is a listing of the relevant function:

module Fletcher
  module Model
    class Amazon < Fletcher::Model::Base
      # A regular expression for determining if a url comes from a specific service/website
      def self.regexp
        /amazon\.com/
      end

      # Parse data and look for object attributes to give to object
      def parse(data)
        super(data)

        case doc
        when Nokogiri::HTML::Document
          # Get Name
          self.name = doc.css("h1.parseasinTitle").first_string

          # Get Description
          self.description = doc.css("div#productDescriptionWrapper").first_string

          # Get description from meta title if not found
          self.description = doc.xpath("//meta[@name='description']/@content").first_string if description.nil?

          # Get Price
          parse_price(doc.css("b.priceLarge").first_string)

          # Get Images
          self.images = doc.xpath("//table[@class='productImageGrid']//img").attribute_array
          self.image = images.first
        end
      end
    end
  end
end
+1

Your string looks a lot like the output of calling `inspect` on a Ruby object; do you have the actual object?

+0

@AndrewMarshall I'm not entirely sure what you mean by the "actual" object. The whole gem (fletcher) is on GitHub: https://github.com/hulihanapplications/fletcher. I've included the class and method above.

+0

So how did you get this string?

Answers

1

In this case, I believe it would be: fletchedProduct.image[:src]

+0

Works perfectly.

1
require 'uri' # URI.extract is defined in the standard uri library; open-uri is not needed

x = %Q{#<Fletcher::Model::Amazon alt="You Are Not a Gadget: A Manifesto (Vintage)" border="0" element="img" height="240" id="prodImage" onload="if (typeof uet == 'function') { if(typeof setCSMReq=='function'){setCSMReq('af');setCSMReq('cf');}else{uet('af');uet('cf');amznJQ.completedStage('amznJQ.AboveTheFold');} }" onmouseout="sitb_doHide('bookpopover'); return false;" onmouseover="sitb_showLayer('bookpopover'); return false;" src="http://ecx.images-amazon.com/images/I/51bpl1wA%2BaL._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU01_.jpg" width="240">} 

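# Restrict extraction to http/https so the result does not depend on match position
url = URI.extract(x, %w[http https])

puts url.first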

Output:

http://ecx.images-amazon.com/images/I/51bpl1wA%2BaL._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU01_.jpg 
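
If only the inspect-style string is available, a regular expression that targets the src attribute directly is another option. The following is a minimal sketch, not part of fletcher: the helper name attribute_from_inspect is hypothetical, the string is the one from the question shortened to a few attributes, and it assumes attribute values are double-quoted and contain no embedded double quotes.

```ruby
# The inspect-style string from the question, shortened for readability.
inspected = %q{#<Fletcher::Model::Amazon alt="You Are Not a Gadget: A Manifesto (Vintage)" border="0" element="img" height="240" id="prodImage" src="http://ecx.images-amazon.com/images/I/51bpl1wA%2BaL._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU01_.jpg" width="240">}

# Hypothetical helper (not a fletcher method): returns the value of the named
# attribute from the string, or nil when the attribute is absent.
def attribute_from_inspect(text, name)
  # String#[] with a regexp and a capture index returns the first capture group.
  text[/\b#{Regexp.escape(name)}="([^"]*)"/, 1]
end

src = attribute_from_inspect(inspected, "src")
puts src
```

When the real object is at hand, reading the attribute from it is likely to be more robust than parsing its inspect output, since the format of that output is not guaranteed.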

Hope this helps. I happened to need to do exactly this last week and had to look it up.