Reading data from downloaded HTML

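I have downloaded HTML data from a website using the WebClient class. Now I want to read the data between the tags. I came across HtmlAgilityPack, but I don't want to use it. I am using the code below to fetch the HTML data.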

     WebClient client = new WebClient();
     string url = "XXXXXXXXXXXXX";
     Byte[] requestedHTML;
     requestedHTML = client.DownloadData(url);
     string htmlcode = client.DownloadString(url);

     //client.DownloadFile(url, @"E:\test.html");

     UTF8Encoding objUTF8 = new UTF8Encoding();
     string html = objUTF8.GetString(requestedHTML);
     Response.Write(html);

Source

2011-06-03 karthik k

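Why don't you want to use HtmlAgilityPack? – DuckMaestro 2011-06-03 07:00:51

@Muad'Dib makes a good point: 46 questions asked and only a third accepted is a bit low... – Ivo 2011-06-03 07:06:25

OK, I will do that as soon as I get some free time. For now, can someone answer this question? – 2011-06-03 07:12:08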

Try this:

     // Needs: using System.Net; using System.Text; using System.Text.RegularExpressions;
     WebClient client = new WebClient();
     string url = "Your URL";

     // Download the page once as raw bytes and decode it as UTF-8.
     Byte[] requestedHTML = client.DownloadData(url);

     //client.DownloadFile(url, @"E:\test.html");

     UTF8Encoding objUTF8 = new UTF8Encoding();
     string html = objUTF8.GetString(requestedHTML);

     // Singleline lets '.' match across line breaks; the lazy group
     // captures only the text between <h3> and </h3>.
     MatchCollection m1 = Regex.Matches(html, @"<h3>(.*?)</h3>",
         RegexOptions.Singleline);

     foreach (Match m in m1)
     {
         // Groups[1] is the content of the heading, without the tags.
         string value = m.Groups[1].Value;
     }

The string value will then hold "Chicago".
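To check the pattern without hitting the network, the same idea can be run against an in-memory string. This is only a minimal sketch: the HeadingExtractor class, its ExtractH3 method and the sample markup are made up for illustration and are not taken from the page in question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HeadingExtractor
{
    // Returns the trimmed text found between each <h3> and </h3> pair.
    public static List<string> ExtractH3(string html)
    {
        var values = new List<string>();
        // Singleline lets '.' match newlines; IgnoreCase also accepts <H3>.
        foreach (Match m in Regex.Matches(html, @"<h3>(.*?)</h3>",
                 RegexOptions.Singleline | RegexOptions.IgnoreCase))
        {
            values.Add(m.Groups[1].Value.Trim());
        }
        return values;
    }

    public static void Main()
    {
        // Hypothetical sample markup, used only for illustration.
        string sample = "<html><body><h3>Chicago</h3><p>IL</p></body></html>";
        foreach (string value in ExtractH3(sample))
        {
            Console.WriteLine(value);
        }
    }
}
```

In real use, the string passed to ExtractH3 would be the html variable produced by the download code.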

Source

2011-06-03 07:20:51 BreakHead

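The HTML data contains table tags, and there are many of them. One of them contains the data I want to retrieve. That is what I want to know how to do. – 2011-06-03 07:31:03

Can you give the URL you want to read the data from, and say which specific data you want to read from a row? – BreakHead 2011-06-03 07:45:22

URL: http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680 .. Here I pass the ZIP code to the website and get the related data back. The HTML data contains the city name for the given ZIP code; here the city name is Chicago (you can see it in the HTML data). That is the value I want to retrieve. – 2011-06-03 07:48:28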

Use regular expressions instead.
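As a rough sketch of that suggestion for the table case described in the comments, the pattern below pulls the text out of every <td> cell so the caller can pick the one it needs. The TableCellReader class, its ReadCells method and the sample table are assumptions for illustration; the real markup returned by zipinfo.com may be laid out differently, and a regex like this will not cope with nested tables or malformed HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class TableCellReader
{
    // Matches <td ...>content</td>, allowing attributes on the opening tag.
    private static readonly Regex CellPattern = new Regex(
        @"<td\b[^>]*>(.*?)</td>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Strips any tags left inside a cell, e.g. <b> or <font>.
    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");

    public static List<string> ReadCells(string html)
    {
        var cells = new List<string>();
        foreach (Match m in CellPattern.Matches(html))
        {
            string text = TagPattern.Replace(m.Groups[1].Value, string.Empty);
            // Decode entities such as &amp; and trim surrounding whitespace.
            cells.Add(WebUtility.HtmlDecode(text).Trim());
        }
        return cells;
    }

    public static void Main()
    {
        // Hypothetical table; not the actual zipinfo.com response.
        string sample =
            "<table><tr><th>City</th><th>State</th><th>ZIP</th></tr>" +
            "<tr><td align=\"center\"><b>Chicago</b></td><td>IL</td><td>60680</td></tr></table>";

        foreach (string cell in ReadCells(sample))
        {
            Console.WriteLine(cell);
        }
    }
}
```

Which index holds the city name depends entirely on the page layout, so the caller would need to inspect the downloaded HTML first.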

Source

2011-06-03 07:52:48 Burimi

http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html – VMAtm 2011-06-03 07:59:59

Buddy, an upvote from me.. you are right, regex is fast.. – BreakHead 2011-06-03 08:33:25
