2012-10-13 70 views
0

I want to insert 1,000,000 documents into RavenDB.

class Program 
{ 
     private static string serverName; 
     private static string databaseName; 

     private static DocumentStore documentstore; 
     private static IDocumentSession _session; 

     static void Main(string[] args) 
     { 

      Console.WriteLine("Start..."); 

      serverName = ConfigurationManager.AppSettings["ServerName"]; 
      databaseName = ConfigurationManager.AppSettings["Database"]; 

      documentstore = new DocumentStore { Url = serverName }; 
      documentstore.Initialize(); 

      Console.WriteLine("Initialize Database..."); 

      _session = documentstore.OpenSession(databaseName); 

      for (int i = 0; i < 1000000; i++) 
      { 
       var person = new Person()  
       { 
        Fname = "Meysam" + i, 
        Lname = " Savameri" + i, 
        Bdate = DateTime.Now, 
        Salary = 6001 + i, 
        Address = "BITS provides one foreground and three background priority levels that" + 
           "you can use to prioritize transfer jobs. Higher priority jobs preempt"+ 
           "lower priority jobs. Jobs at the same priority level share transfer time,"+ 
           "which prevents a large job from blocking small jobs in the transfer"+ 
           "queue. Lower priority jobs do not receive transfer time until all the "+ 
           "higher priority jobs are complete or in an error state. Background"+ 
           "transfers are optimal because BITS uses idle network bandwidth to"+ 
           "transfer the files. BITS increases or decreases the rate at which files "+ 
           "are transferred based on the amount of idle network bandwidth that is"+ 
           "available. If a network application begins to consume more bandwidth,"+ 
           "BITS decreases its transfer rate to preserve the user's interactive"+ 
           "experience. BITS supports multiple foreground jobs and one background"+ 
           "transfer job at the same time.", 
        Email = "Meysam" + i + "@hotmail.com", 
       }; 

       _session.Store(person); 

       Console.ForegroundColor = ConsoleColor.Green; 
       Console.WriteLine("Count:" + i); 
       Console.ForegroundColor = ConsoleColor.White; 
      } 

      Console.WriteLine("Commit..."); 

      _session.SaveChanges(); 
      documentstore.Dispose(); 

      _session.Dispose(); 

      Console.WriteLine("Complete..."); 
      Console.ReadLine(); 
     } 
    } 

However, the session never saves the changes, and I get an error:

An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll

+1

Which part of **OutOfMemory** do you not understand?!?! You need to load the documents in smaller batches (e.g., 1,000 at a time)

Answers

8

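A document session is designed to handle a small number of requests. Instead, try inserting in batches of 1024. After each batch, dispose of the session and create a new one. You get the OutOfMemoryException because the document session caches every object stored in it in order to provide the unit of work, so a single session holding 1,000,000 documents keeps all of them in memory. That is why you should dispose of the session after inserting each batch.

A neat way to do this is with the Batch LINQ extension: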

foreach (var batch in Enumerable.Range(1, 1000000) 
             .Select(i => new Person { /* set properties */ }) 
             .Batch(1024)) 
{ 
    using (var session = documentstore.OpenSession()) 
    { 
        foreach (var person in batch) 
        { 
            session.Store(person); 
        } 
        session.SaveChanges(); 
    } 
} 

Both Enumerable.Range and Batch are lazily evaluated, so the objects are never all held in memory at the same time.
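
To illustrate what lazy batching means here, the following is a minimal sketch of a batching extension written with an iterator. It is not morelinq's actual implementation; the name `InBatches` and the `BatchSketch` class are hypothetical, chosen to avoid clashing with the real `Batch` extension. The counter shows that items are only created as each batch is requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchSketch
{
    // Hypothetical stand-in for a Batch extension: yields lists of at most `size` items.
    public static IEnumerable<List<T>> InBatches<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        return Iterate(source, size);
    }

    // Iterator method: nothing runs until the caller starts enumerating.
    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int size)
    {
        var bucket = new List<T>(size);
        foreach (var item in source)
        {
            bucket.Add(item);
            if (bucket.Count == size)
            {
                yield return bucket;         // hand out a full batch
                bucket = new List<T>(size);  // start a fresh one so the old batch can be collected
            }
        }
        if (bucket.Count > 0) yield return bucket; // final, possibly smaller batch
    }
}

public static class Program
{
    public static void Main()
    {
        int created = 0;
        // Select is lazy too: the lambda runs only when an item is pulled.
        var people = Enumerable.Range(1, 10)
                               .Select(i => { created++; return "Person" + i; });

        foreach (var batch in people.InBatches(4))
        {
            Console.WriteLine("batch of " + batch.Count + ", created so far: " + created);
        }
        // prints:
        // batch of 4, created so far: 4
        // batch of 4, created so far: 8
        // batch of 2, created so far: 10
    }
}
```

Because only one batch is alive at a time, memory use is bounded by the batch size rather than by the total number of documents, provided each session is disposed after its batch.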

+1

Thanks for pointing out morelinq's .Batch() extension. That is a very useful technique!

1

RavenDB also has a bulk API that does something similar, without needing an extra LINQ extension:

using (var bulkInsert = store.BulkInsert()) 
{ 
    for (int i = 0; i < 1000 * 1000; i++) 
    { 
        bulkInsert.Store(new User 
        { 
            Name = "Users #" + i 
        }); 
    } 
} 

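Note that .SaveChanges() is never called explicitly. The documents are sent either when the batch size is reached (it can be specified in BulkInsert() if needed) or when bulkInsert is disposed.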