2013-05-18

How do I create a Googlebot-friendly robots.txt on WordPress?

I am using WordPress. One of its files, functions.php, contains a `function do_robots() {...` that blocks Google from crawling the site. I have replaced that function with the following:

function do_robots() { 
    header('Content-Type: text/plain; charset=utf-8'); 

    do_action('do_robotstxt'); 

    if ('0' == get_option('blog_public')) { 
        echo "User-agent: *"; 
        echo "\nDisallow: /wp-admin"; 
        echo "\nDisallow: /wp-includes"; 
        echo "\nDisallow: /wp-content"; 
        echo "\nDisallow: /stylesheets"; 
        echo "\nDisallow: /_db_backups"; 
        echo "\nDisallow: /cgi"; 
        echo "\nDisallow: /store"; 
        echo "\nDisallow: /wp-includes\n"; 
    } else { 
        echo "User-agent: *"; 
        echo "\nDisallow: /wp-admin"; 
        echo "\nDisallow: /wp-includes"; 
        echo "\nDisallow: /wp-content"; 
        echo "\nDisallow: /stylesheets"; 
        echo "\nDisallow: /_db_backups"; 
        echo "\nDisallow: /cgi"; 
        echo "\nDisallow: /store"; 
        echo "\nDisallow: /wp-includes\n"; 
    } 
} 
  1. I am not sure about Allow. Is everything allowed by default, as long as I do not Disallow it?
  2. Why is Googlebot still being blocked by the function above?
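For reference, the function above emits the same rules whether `blog_public` is `'0'` or not (both branches are identical), so it always produces a robots.txt like:

```
User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /stylesheets
Disallow: /_db_backups
Disallow: /cgi
Disallow: /store
Disallow: /wp-includes
```

Note that /wp-includes is listed twice, and that disallowing /wp-content blocks crawlers from theme CSS, uploaded images, and plugin assets, which can hurt how Google renders and indexes the site.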

Answer


The original function from SVN appears to block fewer paths than your example above, so I would suggest removing some of the extra directories (such as wp-content) and seeing whether that gives you what you are looking for. You could also try a WordPress plugin that generates a Google Sitemap for its crawler to read.

function do_robots() {
    header('Content-Type: text/plain; charset=utf-8');

    do_action('do_robotstxt');

    $output = "User-agent: *\n";
    $public = get_option('blog_public');
    if ('0' == $public) {
        // Site is marked private: block all crawlers entirely.
        $output .= "Disallow: /\n";
    } else {
        // Public site: block only the WordPress internals, not wp-content.
        $site_url = parse_url(site_url());
        $path = (!empty($site_url['path'])) ? $site_url['path'] : '';
        $output .= "Disallow: $path/wp-admin/\n";
        $output .= "Disallow: $path/wp-includes/\n";
    }

    echo apply_filters('robots_txt', $output, $public);
}

The rule with robots.txt files is that everything is allowed unless it is explicitly disallowed, although search engines obeying robots.txt is more of an honor system than anything else.
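As an illustration of allow-by-default, a file such as:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

blocks only URLs under /wp-admin/; every other path on the site stays crawlable without needing its own rule. The explicit `Allow` directive is an extension honored by Googlebot, used here to re-open one file under an otherwise disallowed prefix.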