WebClient won't work on one website

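I want to download the HTML of a website (Yellow Pages) into a string in my C# WinForms application.

I keep getting the same error from one website. Every other site I have tried works, for example http://www.google.co.za, but when I try http://www.yellowpages.co.za it throws: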

An unhandled exception of type 'System.Net.WebException' occurred in System.dll Additional information: The remote server returned an error: (500) Internal Server Error.

I don't know why only this one website throws this error.

Here is my code:

private string getPage()
{
    using (WebClient client = new WebClient())
    {
        return client.DownloadString("http://www.yellowpages.co.za/");
    }
}

2015-01-05 msbarnard

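At a rough guess, they are doing some kind of validation to stop people from using their data beyond its terms of use. You will probably see this yourself if you use an HTTP debugger like [Fiddler](http://www.fiddlertool.com/).

Is there no way around this? If I can view and download the page source in Chrome, why can't I do the same in C#? – msbarnard

@user3815511: Chrome sends more metadata, such as the browser type and the supported compression mechanisms, along with the source IP address of the request. These are some of the ways a web server can be more certain that a request comes from a browser rather than a scraper. The code above simply asks for the page without saying anything about the originating client; for example, you are not downloading the CSS or images, so you are probably scraping material from the site. Some websites use firewalls to block requests like this, since they are also a common way of carrying out a DoS attack.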
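As an alternative to an external HTTP debugger, the error response can also be inspected from code. The following is only a minimal sketch: the class and method names (`ErrorInspector`, `Describe`, `TryGetPage`) are hypothetical, and it assumes the server's reply to a rejected request is worth reading, which may not be the case for every site.

```csharp
using System;
using System.IO;
using System.Net;

static class ErrorInspector
{
    // Summarizes a WebException: the transport status when no HTTP reply
    // arrived, otherwise the HTTP status line followed by the response body.
    public static string Describe(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        if (response == null)
        {
            // No HTTP response at all (DNS failure, timeout, and so on).
            return ex.Status.ToString();
        }

        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return string.Format("{0} {1}\n{2}",
                (int)response.StatusCode,
                response.StatusDescription,
                reader.ReadToEnd());
        }
    }

    // Downloads the page, or returns a description of the failure
    // instead of letting the WebException go unhandled.
    public static string TryGetPage(string url)
    {
        using (var client = new WebClient())
        {
            try
            {
                return client.DownloadString(url);
            }
            catch (WebException ex)
            {
                return Describe(ex);
            }
        }
    }
}
```

Calling `ErrorInspector.TryGetPage("http://www.yellowpages.co.za/")` would then return either the page or whatever the server sent back with its error status, which can hint at why the request was rejected.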

Adding a user-agent header to the code fixed this for me.

private string getPage()
{
    using (WebClient client = new WebClient())
    {
        client.Headers.Add("user-agent", "foo");
        return client.DownloadString("http://www.yellowpages.co.za/");
    }
}

That said, I would use a valid value for user-agent rather than a placeholder like foo. For details on user agents, see RFC 2616 (since superseded by RFC 9110).
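A possible variant with a more descriptive user-agent and a couple of the other headers a browser usually sends is sketched below. The header values, the application name `MyWinFormsApp`, and the `BrowserLikeClient` class are all assumptions made for illustration; whether a given site accepts them is not guaranteed.

```csharp
using System.Net;

static class BrowserLikeClient
{
    // Hypothetical value: identify your own application here.
    public const string UserAgent = "Mozilla/5.0 (compatible; MyWinFormsApp/1.0)";

    // Creates a WebClient that sends a few browser-like request headers.
    public static WebClient Create()
    {
        var client = new WebClient();
        client.Headers.Add("User-Agent", UserAgent);
        client.Headers.Add("Accept", "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8");
        client.Headers.Add("Accept-Language", "en-US,en;q=0.8");
        // Accept-Encoding is deliberately not set: WebClient does not
        // decompress gzip/deflate responses on its own.
        return client;
    }

    // Downloads the page at the given URL using the headers above.
    public static string GetPage(string url)
    {
        using (var client = Create())
        {
            return client.DownloadString(url);
        }
    }
}
```

Keeping the header setup in one place means every request from the application identifies itself the same way, and the user-agent string can be changed without touching the download code.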

2015-01-05 13:59:43
