Why is Googlebot requesting HTML from a JSON-only URL?

On a page like https://medstro.com/groups/nejm-group-open-forum/discussions/61, I have code like this:

$.getJSON("/newsfeeds/61?order=activity&amp;type=discussion", function(response) { 
    $(".discussion-post-stream").replaceWith($(response.newsfeed_html)); 
    $(".stream-posts").before($("<div class=\'newsfeed-sorting-panel generic-12\' data-id=\'61\'>\n<div class=\'newsfeed-type-menu generic-12\'>\n<ul class=\'newsfeed-sorting-buttons\'>\n<li>\n<span>\nShow\n<\/span>\n<\/li>\n<li>\n<select id=\"type\" name=\"type\"><option selected=\"selected\" value=\"discussion\">Show All (15)<\/option>\n<option value=\"discussion_answered\">Answered Questions (15)<\/option>\n<option value=\"discussion_unanswered\">Unanswered Questions (0)<\/option><\/select>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n")); 
    Newsfeed.prepare_for_newsfeed_sort($(".newsfeed-sorting-panel")); 
});

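Googlebot decides it wants to see whether there is any interesting HTML at /newsfeeds/61?order=activity&type=discussion. So it tries to crawl that URL, requesting HTML, and my app reports an error: "ActionView::MissingTemplate: Missing template newsfeeds/show ..."

Why is Googlebot trying to crawl this URL? Is it simply crawling everything on the chance that there is something interesting, or is something wrong with my code?
What is the best way to handle this in Rails? I don't want to ignore all MissingTemplate errors, because some of them may signal a real problem. The same goes for ignoring all errors caused by bots. Do I have any other options?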

Source

2015-01-03 John Bachir

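There is nothing wrong with bots trying to find new links in your page. They are doing their job.

Maybe you can use these tags in one of your views (see "Is there a way to make robots ignore certain text?").

These tags tell Googlebot "don't look here":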

<!--googleoff: all--> 

$.getJSON("/newsfeeds/61?order=activity&amp;type=discussion", function(response) { 
$(".discussion-post-stream").replaceWith($(response.newsfeed_html)); 
$(".stream-posts").before($("<div class=\'newsfeed-sorting-panel generic-12\' data-id=\'61\'>\n<div class=\'newsfeed-type-menu generic-12\'>\n<ul class=\'newsfeed-sorting-buttons\'>\n<li>\n<span>\nShow\n<\/span>\n<\/li>\n<li>\n<select id=\"type\" name=\"type\"><option selected=\"selected\" value=\"discussion\">Show All (15)<\/option>\n<option value=\"discussion_answered\">Answered Questions (15)<\/option>\n<option value=\"discussion_unanswered\">Unanswered Questions (0)<\/option><\/select>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n")); 
Newsfeed.prepare_for_newsfeed_sort($(".newsfeed-sorting-panel")); 
}); 

<!--googleon: all-->

Source

2015-01-03 00:34:46 user3558040

No, that only works for the GSA (Google Search Appliance): http://webmasters.stackexchange.com/questions/54735/can-you-use-googleon-and-googleoff-comments-to-prevent-googlebot-from-indexing-p – Quentin

Presumably it parsed the URL out of the page source and is simply trying to crawl your site.
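As a rough illustration of that guess, the sketch below scans page source for quoted strings that look like site-relative paths. Googlebot's real link-discovery logic is not public, so `findCandidatePaths` and its regular expression are purely hypothetical; the sample source is a shortened copy of the script from the question.

```javascript
// Hypothetical sketch only: one way a crawler *might* pick path-like
// strings out of inline JavaScript. This is not Googlebot's actual logic.
function findCandidatePaths(pageSource) {
  // Quoted strings that start with "/" and contain no quotes, whitespace,
  // or angle brackets.
  const pattern = /["'](\/[^"'\s<>]*)["']/g;
  const found = new Set();
  for (const match of pageSource.matchAll(pattern)) {
    // The script in the question contains an HTML-escaped ampersand;
    // decode it so the result is a usable URL.
    found.add(match[1].replace(/&amp;/g, "&"));
  }
  return [...found];
}

// Shortened copy of the inline script from the question.
const source = `
$.getJSON("/newsfeeds/61?order=activity&amp;type=discussion", function(response) {
  $(".discussion-post-stream").replaceWith($(response.newsfeed_html));
});`;

console.log(findCandidatePaths(source));
// prints [ '/newsfeeds/61?order=activity&type=discussion' ]
```

A crawler that works this way cannot know that the path only serves JSON, which would explain a plain HTML request arriving at that URL.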

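The best way to tell Google what to crawl and what not to crawl on your site is with a sitemap.xml file and a robots.txt file.

You can tell Googlebot not to crawl these (or any) URLs with GET parameters via robots.txt: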

Disallow: /*?
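To see which URLs that rule would cover, here is a minimal sketch of robots.txt wildcard matching, where `*` matches any run of characters and a trailing `$` anchors the end of the URL, as Google documents for Googlebot. The function names `robotsPatternToRegExp` and `isDisallowed` are made up for this example, and the sketch checks a single rule only; it does not weigh competing Allow and Disallow rules the way a real crawler does.

```javascript
// Sketch: translate one robots.txt path pattern into a regular expression.
// Function names are hypothetical; only a single rule is evaluated.
function robotsPatternToRegExp(pattern) {
  // Escape every regex metacharacter, then restore the two wildcards.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const body = escaped
    .replace(/\\\*/g, ".*")        // "*" matches any sequence of characters
    .replace(/\\\$$/, () => "$");  // a trailing "$" anchors the end of the URL
  return new RegExp("^" + body);   // rules match from the start of the path
}

function isDisallowed(pathWithQuery, disallowPattern) {
  return robotsPatternToRegExp(disallowPattern).test(pathWithQuery);
}

// The JSON endpoint from the question has a query string, so it is blocked.
console.log(isDisallowed("/newsfeeds/61?order=activity&type=discussion", "/*?"));
// prints true

// The discussion page itself has no query string, so it stays crawlable.
console.log(isDisallowed("/groups/nejm-group-open-forum/discussions/61", "/*?"));
// prints false
```

In an actual robots.txt file the Disallow line has to sit under a `User-agent:` line. Note that this rule blocks every URL containing a query string, which may be broader than intended.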

Source

2015-01-03 00:36:56 yolabingo
