Scraping cached pages

python
scrapy
browser-cache

2016-10-14 151 views 0 likes

I am using Scrapy to fetch some web page content, like this:

import scrapy


class PitchforkTracks(scrapy.Spider):
    # Spider name used on the command line: scrapy crawl pitchfork_tracks
    name = "pitchfork_tracks"
    allowed_domains = ["pitchfork.com"]
    start_urls = [
        "http://pitchfork.com/reviews/best/tracks/?page=1",
        "http://pitchfork.com/reviews/best/tracks/?page=2",
        "http://pitchfork.com/reviews/best/tracks/?page=3",
    ]

Everything works fine.

Now, instead of hitting the pages directly, I would like to scrape Google's cached copies of the same pages.

What is the proper syntax to achieve this? I tried "cache:http://pitchfork.com/reviews/best/tracks/?page=1", to no avail.
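For context, `cache:` is a Google search operator rather than a URL scheme, which is probably why Scrapy cannot request it directly. Below is a minimal sketch of one possible workaround, assuming Google's cache viewer is reachable at `webcache.googleusercontent.com/search?q=cache:<url>` (an assumption that should be checked, since the endpoint may no longer be served). The helper `google_cache_url` and the constant `GOOGLE_CACHE_PREFIX` are hypothetical names introduced only for illustration.

```python
from urllib.parse import quote

# Assumption: Google's cache viewer is served from this endpoint.
GOOGLE_CACHE_PREFIX = "http://webcache.googleusercontent.com/search?q=cache:"


def google_cache_url(url):
    """Return the (assumed) Google cache URL for a page URL."""
    # Percent-encode the whole target so its own "?" and "=" stay inside q=.
    return GOOGLE_CACHE_PREFIX + quote(url, safe="")


# Candidate replacement for the spider's start_urls (pages 1 to 3).
start_urls = [
    google_cache_url("http://pitchfork.com/reviews/best/tracks/?page=%d" % page)
    for page in range(1, 4)
]
```

If something like this is used in the spider, `allowed_domains` would likely also need to include the cache host, otherwise followed links may be filtered as offsite.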

2016-10-14 data_garden