2016-01-19 108 views

Answers

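I recently implemented the Google Search JSON API, and as far as I understand it, the only way to get a website's link is through the JSON response, where each result item contains formattedUrl or htmlFormattedUrl. The query would be the website in question, and with luck the first result gives you the relevant link to that site.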

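However, if I understand your question correctly, you want to scrape the sub-links of a given website, which is the job of a web crawler. If you own the site, there are many tools on the web that will generate a sitemap for you, but if your intent falls under "other", then I believe you are barking up the wrong tree. See this question, which will point you toward building a simple web crawler.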
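To make the crawler idea concrete, here is a minimal sketch using only the Python standard library. The names `LinkCollector`, `same_site_links`, and `crawl` are hypothetical, and `fetch` is an assumed callable that returns a page's HTML as a string; none of this comes from the linked question.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, page_url):
    """Return sorted, de-duplicated absolute links that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = set()
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute = urljoin(page_url, href).split("#", 1)[0]
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            links.add(absolute)
    return sorted(links)


def crawl(start_url, fetch, max_pages=20):
    """Breadth-first crawl of one site; fetch(url) must return HTML as a string."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in same_site_links(fetch(url), url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

In real use, `fetch` could wrap `urllib.request.urlopen` and decode the response body. Passing it in as a parameter keeps the sketch free of network access, and `max_pages` bounds the crawl so it cannot run away on a large site.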

// Sample customsearch#result item from the results of a query for Deovandski

"items": [
    {
        "kind": "customsearch#result",
        "title": "Student Experience - College of Science and Mathematics (NDSU)",
        "htmlTitle": "Student Experience - College of Science and Mathematics (NDSU)",
        "link": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
        "displayLink": "www.ndsu.edu",
        "snippet": "Sep 16, 2015 ... Association for Computing Machinery Student Chapter Chair: Jordan Goetze \nAdvisor: Brian Slator. Upsilon Pi Epsilon President: Deovandski ...",
        "htmlSnippet": "Sep 16, 2015 \u003cb\u003e...\u003c/b\u003e Association for Computing Machinery Student Chapter Chair: Jordan Goetze \u003cbr\u003e\nAdvisor: Brian Slator. Upsilon Pi Epsilon President: \u003cb\u003eDeovandski\u003c/b\u003e ...",
        "cacheId": "pyzF9XJwrXsJ",
        "formattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
        "htmlFormattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
        "pagemap": {
            "cse_image": [
                {
                    "src": "https://www.ndsu.edu/fileadmin/_processed_/csm_080117_anatomy_03med_9dbc3c8cce.jpg"
                }
            ],
            "cse_thumbnail": [
                {
                    "width": "184",
                    "height": "275",
                    "src": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcTTL-GZRfSv30cyESsCnd_65BFoLMDdo8fqNS58mHfRbGiOTjSq-e-o28FE"
                }
            ]
        }
    },
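
Given a response shaped like the sample item above, pulling out the result URLs could look roughly like the sketch below. The helper names `build_search_url` and `result_urls` are hypothetical. The endpoint and the `key`, `cx`, and `q` parameters are my assumption about the Custom Search JSON API and do not appear in the answer itself. The sketch prefers the `link` field shown in the sample and falls back to `formattedUrl`.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Custom Search JSON API (not stated in the answer).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query):
    """Build the request URL; key, cx and q are assumed parameter names."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)


def result_urls(response):
    """Return one URL per result item in a parsed JSON response (a dict)."""
    urls = []
    for item in response.get("items", []):
        # Prefer the full link; fall back to the display-oriented formattedUrl.
        url = item.get("link") or item.get("formattedUrl")
        if url:
            urls.append(url)
    return urls
```

If the query is the website's name, the first element of `result_urls(...)` is the candidate link the answer describes, though nothing guarantees that the top hit is the site you meant.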