C# generic list foreach OutOfMemoryException

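I have a program that reads about 2 million rows from a database into a List. Each row is a location containing information such as geographic coordinates.

Once the data has been added to the List, I use a foreach loop to grab the coordinates and create a KML file. When the number of rows is large, the loop hits an OutOfMemoryException (but otherwise it works perfectly).

Any suggestions on how to handle this so that the program can work with very large data sets? The KML library is SharpKML.

I am still new to C#, so please go easy!

This is the loop: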

    using (SqlConnection conn = new SqlConnection(connstring))
    {
        conn.Open();
        SqlCommand cmd = new SqlCommand(select, conn);

        using (cmd)
        {
            SqlDataReader reader = cmd.ExecuteReader();
            while (reader.Read())
            {
                double lat = reader.GetDouble(1);
                double lon = reader.GetDouble(2);
                string country = reader.GetString(3);
                string county = reader.GetString(4);
                double TIV = reader.GetDouble(5);
                double cnpshare = reader.GetDouble(6);
                double locshare = reader.GetDouble(7);

                //Add results to list
                results.Add(new data(lat, lon, country, county, TIV, cnpshare, locshare));
            }
            reader.Close();
        }
        conn.Close();
    }

    int count = results.Count();
    Console.WriteLine("number of rows in results = " + count.ToString());

    //This code segment generates the kml point plot

    Document doc = new Document();
    try
    {
        foreach (data l in results)
        {
            Point point = new Point();
            point.Coordinate = new Vector(l.lat, l.lon);

            Placemark placemark = new Placemark();
            placemark.Geometry = point;
            placemark.Name = Convert.ToString(l.tiv);

            doc.AddFeature(placemark);
        }
    }
    catch(OutOfMemoryException e)
    {
        throw e;
    }

This is the class stored in the list:

    public class data
    {
        public double lat { get; set; }
        public double lon { get; set; }
        public string country { get; set; }
        public string county { get; set; }
        public double tiv { get; set; }
        public double cnpshare { get; set; }
        public double locshare { get; set; }

        public data(double lat, double lon, string country, string county, double tiv, double cnpshare,
            double locshare)
        {
            this.lat = lat;
            this.lon = lon;
            this.country = country;
            this.county = county;
            this.tiv = tiv;
            this.cnpshare = cnpshare;
            this.locshare = locshare;
        }
    }

Source

2012-06-12 Richard Todd

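Why on earth do you need all 2 million rows in memory? – vcsjones

If there is no big delay in populating the list with data from the database (and you did not mention any problem with populating the list), why not create the Point and Placemark objects immediately, while reading? The code is below.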

    var doc = new Document();

    using (SqlConnection conn = new SqlConnection(connstring))
    {
        conn.Open();
        SqlCommand cmd = new SqlCommand(select, conn);

        using (cmd)
        {
            var reader = cmd.ExecuteReader();
            while (reader.Read())
            {
                double lat = reader.GetDouble(1);
                double lon = reader.GetDouble(2);
                string country = reader.GetString(3);
                string county = reader.GetString(4);
                double TIV = reader.GetDouble(5);
                double cnpshare = reader.GetDouble(6);
                double locshare = reader.GetDouble(7);

                // Build the KML objects straight from the current row
                var point = new Point();
                point.Coordinate = new Vector(lat, lon);

                var placemark = new Placemark();
                placemark.Geometry = point;
                placemark.Name = Convert.ToString(TIV);

                doc.AddFeature(placemark);
            }
            reader.Close();
        }
        conn.Close();
    }

If there is no good reason to hold that much data in memory, try some kind of lazy-loading approach.
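One way to get lazy loading in C# is an iterator method. The following is only a minimal sketch, not code from the question: `LazyRows` and `Read` are made-up names, and the column positions (1 = latitude, 2 = longitude, 5 = TIV) are assumed to match the query in the question. It takes any `IDataReader` (a `SqlDataReader` is one) and yields one row at a time, so nothing is collected in a list.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class LazyRows
{
    // Hypothetical helper: yields one (lat, lon, tiv) tuple per row.
    // Rows are produced on demand as the caller iterates; nothing is buffered.
    // Column indexes 1, 2 and 5 are assumed to match the question's query.
    public static IEnumerable<(double Lat, double Lon, double Tiv)> Read(IDataReader reader)
    {
        while (reader.Read())
        {
            yield return (reader.GetDouble(1), reader.GetDouble(2), reader.GetDouble(5));
        }
    }
}
```

A caller would write `foreach (var row in LazyRows.Read(reader)) { ... }` inside the `using` blocks, so the reader is still open while the rows are consumed.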

Source

2012-06-12 16:46:40 Minja

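Thank you very much for the suggestions. Ideally I do need the other fields from the query, because at some point they will be used to label the KML points or for other KML operations, such as polygons. I will spend a few hours tonight going over the suggestions here and report back. Thanks, everyone. –

Gave this a go. The reader now seems to work like a stream. However, the OutOfMemoryException now occurs when I try to write the KML file. It will be a very large KML file (say 50MB), but surely that is not large enough to cause an OutOfMemoryException. Is there another, more efficient way to implement some kind of streaming? Maybe I need to split the output into several files and join them afterwards. –

Do you really need such a large KML file? You could try @drdigit's solution http://stackoverflow.com/a/11001300/1433917 and append 1000 rows to the existing KML file in each iteration. However, from what I know about GIS, a 50MB KML file may be slow to process in Google Maps or Google Earth. Why not split it into several smaller KML files by some rule? That would also cut down your query execution. – Minja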
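On the streaming question in the comments above: one possible approach, which is not what SharpKML does and is only a sketch under my own assumptions, is to skip the in-memory `Document` and write the KML elements directly with `System.Xml.XmlWriter`, which emits each element as it is written. `KmlStreamWriter` and `WritePlacemarks` are made-up names, and only the name and point of each placemark are written.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

public static class KmlStreamWriter
{
    private const string KmlNs = "http://www.opengis.net/kml/2.2";

    // Hypothetical helper: writes one <Placemark> per row straight to the output,
    // so memory use does not grow with the number of rows.
    public static void WritePlacemarks(
        TextWriter output,
        IEnumerable<(double Lat, double Lon, double Tiv)> rows)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter xml = XmlWriter.Create(output, settings))
        {
            xml.WriteStartDocument();
            xml.WriteStartElement("kml", KmlNs);
            xml.WriteStartElement("Document", KmlNs);

            foreach (var row in rows)
            {
                xml.WriteStartElement("Placemark", KmlNs);
                xml.WriteElementString("name", KmlNs,
                    row.Tiv.ToString(CultureInfo.InvariantCulture));
                xml.WriteStartElement("Point", KmlNs);
                // KML coordinates are written as longitude,latitude
                xml.WriteElementString("coordinates", KmlNs,
                    string.Format(CultureInfo.InvariantCulture, "{0},{1}", row.Lon, row.Lat));
                xml.WriteEndElement(); // Point
                xml.WriteEndElement(); // Placemark
            }

            xml.WriteEndElement(); // Document
            xml.WriteEndElement(); // kml
            xml.WriteEndDocument();
        }
    }
}
```

In the real program the `output` would be a `StreamWriter` over the target file and `rows` would come from the data reader loop. The invariant culture is used so that decimal separators are always dots, whatever the machine's locale.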

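Why do you need to store all the data in a class before writing it out? Instead of adding each row to a list, you should process each row as you read it and then forget about it.

For example, try rolling your code together like this: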

Document doc = new Document(); 
while (reader.Read()) 
{ 
    // read from db 
    double lat = reader.GetDouble(1); 
    double lon = reader.GetDouble(2); 
    string country = reader.GetString(3); 
    string county = reader.GetString(4); 
    double TIV = reader.GetDouble(5); 
    double cnpshare = reader.GetDouble(6); 
    double locshare = reader.GetDouble(7); 

    var currentData = new data(lat, lon, country, county, TIV, cnpshare, locshare);

    // write to file 
    Point point = new Point(); 
    point.Coordinate = new Vector(currentData.lat, currentData.lon); 

    Placemark placemark = new Placemark(); 
    placemark.Geometry = point; 
    placemark.Name = Convert.ToString(currentData.tiv); 

    doc.AddFeature(placemark); 
}

This will only work if Document is implemented sanely, though.

Source

2012-06-12 16:36:21 Oliver

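Good idea. I'll give it a go. –

Oliver is right (+1 from me). Performance-wise, there are a few other things you could do. First, don't query fields you are not going to use. Then move all the variable declarations (in Oliver's code) before the while statement (?). Finally, instead of waiting for your SQL Server to collect and send back all the records, fetch them progressively, step by step. For example, if your records have a UID and are ordered by that UID, start with a local C# variable "var lastID = 0", change your select statement to something (pre-format) like "select top 1000 ... where UID > lastID", and repeat the query until you get nothing back or fewer than 1000 records.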
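The paging loop described above could look roughly like this. It is only a sketch: `KeysetPaging`, `ReadAll`, `fetchPage` and `getId` are made-up names, UIDs are assumed to be positive and increasing, and the callback stands in for a parameterised query along the lines of "select top (@pageSize) ... where UID > @lastId order by UID".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeysetPaging
{
    // fetchPage(lastId, pageSize) is a hypothetical callback that returns up to
    // pageSize rows with UID > lastId, ordered by UID.
    // getId extracts the UID from a row.
    public static IEnumerable<T> ReadAll<T>(
        Func<long, int, IReadOnlyList<T>> fetchPage,
        Func<T, long> getId,
        int pageSize = 1000)
    {
        if (pageSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(pageSize));
        }

        long lastId = 0; // assumes all UIDs are greater than 0
        while (true)
        {
            IReadOnlyList<T> page = fetchPage(lastId, pageSize);
            foreach (T row in page)
            {
                yield return row;
            }

            // A short (or empty) page means there is nothing left to fetch.
            if (page.Count < pageSize)
            {
                yield break;
            }

            // Remember the last UID seen so the next query starts after it.
            lastId = getId(page[page.Count - 1]);
        }
    }
}
```

Only one page of rows is held in memory at a time. The "order by UID" in the query matters: without it, "UID > lastID" can skip or repeat rows.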

Source

2012-06-12 16:59:48

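@drdigit,

I would avoid executing queries in a loop. A single query should always return the data that is needed at that moment. In this case you would have 1000 queries each returning 1000 rows. Perhaps that is better for showing the first 1000 rows quickly, but I don't know whether running 1000 quicker queries in a loop would be faster overall than executing just one query. Maybe I am wrong...

I think your approach is good for lazy loading, if there is a need for that in this case.

Source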

2012-06-12 17:17:50 Minja

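The number 1000 is just an example. In most cases (if the db indexes are suitable for the query), as long as the queries are executed over the same connection, the performance difference will be very large. One factor that may affect performance is network latency, but the number of round trips can be balanced (up to a point) by adjusting the sample size of 1000. In any case, here we seem to be talking about a localhost environment, which means there is probably no network latency. – 2012-06-12 17:30:46

Oops. I am sorry I did not see your first answer in time. It is correct and fully consistent with Oliver's. So, good news for you, since this is not a contest for the fastest typing "machine". – 2012-06-12 17:38:03

I am not that good with databases, since I am mostly programming-oriented, and I have not had a situation like this in which to test performance. My answer points to good (must-have) programming practice: never put a query inside a loop. But if we are talking about good programming practice, you certainly would not query 2 million rows either. – Minja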
