Reading a large text file for parsing

I am working with text files that range from 1 to 2 GB. I can't use a conventional stream reader, so I decided to read the file in chunks and do my work on each chunk. The problem is that I am not sure when I have reached the end of the file: the program has been working on a single file for a very long time, and I am not sure how much of the file I can read through the buffer. Here is the code:

Dim Buffer_Size As Integer = 30000
Dim bufferread(Buffer_Size - 1) As Char
Dim bytesread As Integer = 0
Dim totalbytesread As Long = 0          ' Long: the files run to 1-2 GB, so Integer could overflow
Dim sb As New StringBuilder
Dim data As String

Do
    bytesread = inputfile.Read(bufferread, 0, Buffer_Size)
    sb.Append(bufferread, 0, bytesread)  ' append only the characters actually read
    totalbytesread = totalbytesread + bytesread

    If sb.Length > 9999999 Then
        data = sb.ToString()
        If data IsNot Nothing Then
            parsingtools.load(data)
        End If
        sb.Length = 0                    ' reset the buffer so the same text is not parsed twice
    End If

    If totalbytesread > 1000000000 Then
        logs.constructlog("File almost done")
    End If
Loop Until inputfile.EndOfStream

Is there any control or piece of code with which I can check how much of the file is still left?

Answer


Have you looked at BufferedStream?

http://msdn.microsoft.com/en-us/library/system.io.bufferedstream%28v=VS.100%29.aspx

You can use that with your stream. Also, I would set the buffer size in megabytes rather than something as small as 30,000.
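A minimal sketch of what that could look like in VB.NET (the question's language); the file name and the 4 MB buffer size are placeholders:

Imports System.IO

' Wrap the FileStream in a BufferedStream with a multi-megabyte buffer,
' then hand that to the StreamReader that feeds the read loop.
Dim fileStream As New FileStream("hugefile.txt", FileMode.Open, FileAccess.Read)
Dim buffered As New BufferedStream(fileStream, 4 * 1024 * 1024)  ' 4 MB buffer
Dim inputfile As New StreamReader(buffered)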

As for how much is left: can't you just ask the stream for its length up front and compare it with how far you have read?
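For example, assuming inputfile is a StreamReader over a seekable stream (a hedged sketch; BaseStream.Position counts bytes already pulled into the reader's internal buffer, so treat the result as an approximation for progress reporting rather than an exact figure):

Dim totalBytes As Long = inputfile.BaseStream.Length
Dim remainingBytes As Long = totalBytes - inputfile.BaseStream.Position
logs.constructlog("Roughly " & remainingBytes & " of " & totalBytes & " bytes left")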

Here is a snippet of code that I use to wrap a buffered stream around a stream. (Sorry, it's C#.)

private static void CopyTo(AzureBlobStore azureBlobStore, Stream src, Stream dest, string description)
{
    if (src == null)
        throw new ArgumentNullException("src");
    if (dest == null)
        throw new ArgumentNullException("dest");

    const int bufferSize = (AzureBlobStore.BufferSizeForStreamTransfers);
    // buffering happening internally. this is just to avoid 4gig boundary and have something to show
    int readCount;
    //long bytesTransfered = 0;
    var buffer = new byte[bufferSize];
    //string totalBytes = FormatBytes(src.Length);
    while ((readCount = src.Read(buffer, 0, buffer.Length)) != 0)
    {
        if (azureBlobStore.CancelProcessing)
        {
            break;
        }
        dest.Write(buffer, 0, readCount);
        //bytesTransfered += readCount;
        //Console.WriteLine("AzureBlobStore:CopyTo:{0}:{1} {2}", FormatBytes(bytesTransfered), totalBytes, description);
    }
}

Hope this helps.