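C# HtmlAgilityPack: HTML loaded by an external request
I am using HtmlAgilityPack to get the HTML of a website. Part of the page is requested through XMLHttpRequest, and the returned HTML is loaded into a DIV. I cannot get the HTML that is loaded by that external request. This is what I tried, but the HTML is not there: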
// Needs: using System.IO; using System.Net; using System.Text; using HtmlAgilityPack;
// Url, cookieJar (a CookieContainer) and doc (an HtmlDocument) are declared elsewhere.
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Url);
getRequest.CookieContainer = cookieJar;
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";
// Open and close the request stream so the POST is sent with an empty body.
using (Stream requestStream = getRequest.GetRequestStream()) { }
string source;
// Dispose the response and the reader even if reading throws.
using (HttpWebResponse getRequestResponse = (HttpWebResponse)getRequest.GetResponse())
using (StreamReader sr = new StreamReader(getRequestResponse.GetResponseStream(), Encoding.Default))
{
    source = sr.ReadToEnd();
    //Console.WriteLine(source);
}
// Parse the markup exactly as the server returned it for this one request.
doc.LoadHtml(source);
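What exactly isn't working? - Are you expecting it to execute the JavaScript/AJAX requests on the given page? Because 'HtmlAgilityPack' doesn't do that; it is not a web browser, it only parses the HTML you hand it into a DOM. - If you want to do screen scraping, I'd suggest looking at web browser automation with 'Selenium' (it should be on NuGet). - Try starting/debugging with Firefox or Chrome, but you should be able to move to the PhantomJS headless browser to avoid showing a UI / improve performance. – BrainSlugs83 2016-10-12 22:56:12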
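
The browser-automation route suggested in the comment might look roughly like the sketch below. It is only an illustration under stated assumptions: the Selenium.WebDriver, Selenium.Support and HtmlAgilityPack NuGet packages are installed, Chrome and a matching chromedriver are available, and headless Chrome is used in place of PhantomJS. The class name `RenderedPageLoader` and the `readySelector` parameter are hypothetical names introduced for this example.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper: lets a real browser run the page's JavaScript,
// then hands the rendered markup to HtmlAgilityPack.
public static class RenderedPageLoader
{
    // Loads the page in headless Chrome and returns the DOM as HTML once
    // at least one element matches readySelector (assumed to identify
    // content injected by the XMLHttpRequest).
    public static string GetRenderedHtml(string url, string readySelector, TimeSpan timeout)
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless"); // no visible browser window

        using (var driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);

            // Poll until the selector matches; throws WebDriverTimeoutException otherwise.
            var wait = new WebDriverWait(driver, timeout);
            wait.Until(d => d.FindElements(By.CssSelector(readySelector)).Count > 0);

            return driver.PageSource;
        }
    }

    // Parses an HTML string with HtmlAgilityPack, as the question's code does.
    public static HtmlDocument Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc;
    }
}
```

A call such as `RenderedPageLoader.Parse(RenderedPageLoader.GetRenderedHtml(Url, "#content *", TimeSpan.FromSeconds(10)))` would then give an `HtmlDocument` that includes the DIV's contents, where `#content` stands in for whatever id the target DIV really has. If the XMLHttpRequest URL can be found in the browser's network tab, requesting that URL directly with the original `HttpWebRequest` code may be a lighter alternative.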