Android - reading huge (30MB) text files and compressing them

I want to read two files, join their contents, and then GZIP the result so that I can send it to a server.

Everything works fine when the files are a few MB in size, but as soon as one of them weighs in at ~30MB (the size expected in production), reading it fails with an Out of memory on a 43628012-byte allocation error. I don't know what I'm doing wrong, since it works fine on smaller files.

Here is the code I use to read a text file:

private String getTextFromFile(File fileName) {
    StringBuilder logsHolder = new StringBuilder();
    BufferedReader input = null;
    try {
        input = new BufferedReader(new FileReader(fileName));
        String line;
        String lineSeparator = System.getProperty("line.separator");
        while ((line = input.readLine()) != null) {
            logsHolder.append(line);
            logsHolder.append(lineSeparator);
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (input != null) {
            try {
                input.close();
            } catch (IOException ignored) {
                // Nothing sensible to do if closing fails.
            }
        }
    }
    return logsHolder.toString();
}

The error starts at the line logsHolder.append(line); after a few thousand lines have been read. Here is the logcat output:

01-04 09:54:25.852: D/dalvikvm(888): GC_FOR_ALLOC freed 1223K, 29% free 6002K/8364K, paused 21ms, total 21ms
01-04 09:54:25.892: D/dalvikvm(888): GC_FOR_ALLOC freed 1022K, 30% free 6235K/8860K, paused 16ms, total 17ms
01-04 09:54:25.932: D/dalvikvm(888): GC_FOR_ALLOC freed 884K, 27% free 6481K/8860K, paused 18ms, total 19ms
01-04 09:54:25.932: I/dalvikvm-heap(888): Grow heap (frag case) to 8.521MB for 1134874-byte allocation
01-04 09:54:25.952: D/dalvikvm(888): GC_FOR_ALLOC freed 738K, 32% free 6851K/9972K, paused 18ms, total 18ms
01-04 09:54:26.012: D/dalvikvm(888): GC_FOR_ALLOC freed 586K, 32% free 6851K/9972K, paused 18ms, total 18ms
01-04 09:54:26.012: I/dalvikvm-heap(888): Grow heap (frag case) to 9.422MB for 1702306-byte allocation
01-04 09:54:26.042: D/dalvikvm(888): GC_FOR_ALLOC freed 1108K, 37% free 7405K/11636K, paused 20ms, total 20ms
01-04 09:54:26.122: D/dalvikvm(888): GC_FOR_ALLOC freed 878K, 37% free 7405K/11636K, paused 21ms, total 21ms
01-04 09:54:26.122: I/dalvikvm-heap(888): Grow heap (frag case) to 10.776MB for 2553454-byte allocation
01-04 09:54:26.152: D/dalvikvm(888): GC_CONCURRENT freed 0K, 30% free 9899K/14132K, paused 10ms+2ms, total 27ms
01-04 09:54:26.152: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 17ms
01-04 09:54:26.242: D/dalvikvm(888): GC_FOR_ALLOC freed 2980K, 42% free 8236K/14132K, paused 16ms, total 16ms
01-04 09:54:26.252: I/dalvikvm-heap(888): Grow heap (frag case) to 12.805MB for 3830176-byte allocation
01-04 09:54:26.282: D/dalvikvm(888): GC_CONCURRENT freed 0K, 33% free 11977K/17876K, paused 10ms+3ms, total 27ms
01-04 09:54:26.282: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 8ms
01-04 09:54:26.432: D/dalvikvm(888): GC_FOR_ALLOC freed 4470K, 47% free 9483K/17876K, paused 17ms, total 17ms
01-04 09:54:26.442: I/dalvikvm-heap(888): Grow heap (frag case) to 15.849MB for 5745260-byte allocation
01-04 09:54:26.472: D/dalvikvm(888): GC_CONCURRENT freed 0K, 36% free 15094K/23488K, paused 17ms+2ms, total 33ms
01-04 09:54:26.472: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 15ms
01-04 09:54:26.663: D/dalvikvm(888): GC_FOR_ALLOC freed 6704K, 52% free 11353K/23488K, paused 20ms, total 20ms
01-04 09:54:26.683: I/dalvikvm-heap(888): Grow heap (frag case) to 20.415MB for 8617886-byte allocation
01-04 09:54:26.713: D/dalvikvm(888): GC_CONCURRENT freed 0K, 39% free 19769K/31904K, paused 17ms+2ms, total 32ms
01-04 09:54:26.713: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 14ms
01-04 09:54:27.033: D/dalvikvm(888): GC_FOR_ALLOC freed 10057K, 56% free 14158K/31904K, paused 31ms, total 31ms
01-04 09:54:27.053: I/dalvikvm-heap(888): Grow heap (frag case) to 27.264MB for 12926824-byte allocation
01-04 09:54:27.093: D/dalvikvm(888): GC_CONCURRENT freed 8415K, 59% free 18366K/44528K, paused 17ms+2ms, total 32ms
01-04 09:54:27.093: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 15ms
01-04 09:54:27.333: D/dalvikvm(888): GC_CONCURRENT freed 4324K, 59% free 18367K/44528K, paused 1ms+3ms, total 29ms
01-04 09:54:27.333: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 22ms
01-04 09:54:27.493: D/dalvikvm(888): GC_FOR_ALLOC freed 2345K, 59% free 18366K/44528K, paused 19ms, total 19ms
01-04 09:54:27.513: I/dalvikvm-heap(888): Grow heap (frag case) to 37.537MB for 19390232-byte allocation
01-04 09:54:27.563: D/dalvikvm(888): GC_CONCURRENT freed 0K, 42% free 37302K/63464K, paused 34ms+4ms, total 51ms
01-04 09:54:27.563: D/dalvikvm(888): WAIT_FOR_CONCURRENT_GC blocked 15ms
01-04 09:54:28.094: D/dalvikvm(888): GC_FOR_ALLOC freed 20815K, 62% free 24678K/63464K, paused 40ms, total 40ms
01-04 09:54:28.234: D/dalvikvm(888): GC_FOR_ALLOC freed 1814K, 62% free 24678K/63464K, paused 22ms, total 23ms
01-04 09:54:28.284: I/dalvikvm-heap(888): Grow heap (frag case) to 52.947MB for 29085344-byte allocation
01-04 09:54:28.344: D/dalvikvm(888): GC_FOR_ALLOC freed 18935K, 63% free 34146K/91868K, paused 21ms, total 21ms
01-04 09:54:29.245: D/dalvikvm(888): GC_FOR_ALLOC freed 8191K, 63% free 34146K/91868K, paused 50ms, total 51ms
01-04 09:54:29.846: D/dalvikvm(888): GC_FOR_ALLOC freed 6821K, 63% free 34138K/91868K, paused 33ms, total 33ms
01-04 09:54:29.846: I/dalvikvm-heap(888): Forcing collection of SoftReferences for 43628012-byte allocation
01-04 09:54:29.866: D/dalvikvm(888): GC_BEFORE_OOM freed 76K, 63% free 34061K/91868K, paused 27ms, total 27ms
01-04 09:54:29.866: E/dalvikvm-heap(888): Out of memory on a 43628012-byte allocation.

I don't know whether the compression part will work on such a big buffer, but for the moment my only problem is reading the huge text file.

I hope you can help me find out why this fails and what I should change to make it work.

EDIT

Below is the code I use to join the two files and compress the result:

File previousLog = SystemEventsReceiver.getPreviousLog();
if (previousLog.exists()) {
    logsHolder.append(getTextFromFile(previousLog));
}

File currentLog = SystemEventsReceiver.getCurrentLog();
if (currentLog.exists()) {
    logsHolder.append(getTextFromFile(currentLog));
}

Log.v("MyApp", "uncompressed logs: " + logsHolder.toString().getBytes().length);
// Compress logs.
byte[] compressedLogs = null;
try {
    ByteArrayOutputStream os = new ByteArrayOutputStream(logsHolder.length());
    GZIPOutputStream gos = new GZIPOutputStream(os);
    gos.write(logsHolder.toString().getBytes());
    gos.close();
    compressedLogs = os.toByteArray();
    os.close();
} catch (IOException e) {
    e.printStackTrace();
}
Log.v("MyApp", "compressed logs: " + compressedLogs.length);

"I don't know what I'm doing wrong, since it works fine on smaller files" - You are trying to read a large file into memory, and you do not have enough memory to do that. "What I should change to make it work" - Find a way to accomplish your goal that does not require reading the entire file into memory. – CommonsWare

@CommonsWare But why can't I load it into memory if it is only 30MB? I would understand if we were talking about XXXMB. – Storo
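
(Rough arithmetic, not from the original thread: Java strings are UTF-16, so ~30MB of ASCII log text becomes ~60MB of char data once decoded. StringBuilder grows by allocating a new backing array roughly twice the size of the old one while the old one is still live, which is why the logcat above shows a failed 43628012-byte (~43MB) allocation on a Dalvik heap that had already been pushed to ~90MB.)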

I would try a GZIPOutputStream combined with IOUtils from Apache Commons. Any solution that reads and/or writes the file as a whole into a String / into memory will fail. – PhilW
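
For reference, the streaming approach PhilW suggests might look roughly like this (a minimal sketch, not code from the thread; the helper name gzipFilesTo is made up, and Apache Commons IO is assumed to be on the classpath):

    import java.io.*;
    import java.util.zip.GZIPOutputStream;
    import org.apache.commons.io.IOUtils;

    // Hypothetical helper: stream each file into the gzip stream in small
    // chunks so the full contents never sit in memory at once.
    private void gzipFilesTo(File[] logs, OutputStream sink) throws IOException {
        GZIPOutputStream gzip = new GZIPOutputStream(sink);
        try {
            for (File log : logs) {
                InputStream in = new FileInputStream(log);
                try {
                    IOUtils.copy(in, gzip); // copies through a 4KB buffer internally
                } finally {
                    in.close();
                }
            }
        } finally {
            gzip.close(); // writes the gzip trailer and closes the sink
        }
    }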

Answer

Thanks to @PhilW's comment, I came up with the following solution:

Log.v("MyApp", "reading logs"); 
    // Get logs. 
    File previousLog = SystemEventsReceiver.getPreviousLog(); 
    File currentLog = SystemEventsReceiver.getCurrentLog(); 

    int completeLogSize = (int) (currentLog.length() + previousLog.length()); 
    Log.v("MyApp", "uncompressed logs: " + completeLogSize); 
    // Compress logs. 
    byte[] compressedLogs = null; 
    try { 
     ByteArrayOutputStream os = new ByteArrayOutputStream(completeLogSize); 
     GZIPOutputStream gzipOS = new GZIPOutputStream(os); 

     if (previousLog.exists()) { 
      addLogToGZIP(previousLog, gzipOS); 
     } 
     if (currentLog.exists()) { 
      addLogToGZIP(currentLog, gzipOS); 
     } 

     gzipOS.close(); 
     compressedLogs = os.toByteArray(); 
     os.close(); 
    } catch (IOException e) { 
     e.printStackTrace(); 
    } 
    Log.v("MyApp", "compressed logs: " + compressedLogs.length); 



private void addLogToGZIP(File logFile, GZIPOutputStream gzipOS) {
    byte[] bytes = new byte[1024];

    try {
        BufferedInputStream buffer = new BufferedInputStream(new FileInputStream(logFile));
        int read;
        // Write only as many bytes as were actually read; writing the whole
        // buffer would append stale bytes after a short read at end of file.
        while ((read = buffer.read(bytes, 0, bytes.length)) != -1) {
            gzipOS.write(bytes, 0, read);
        }
        buffer.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

I read the bytes from each log file and write them directly to the GZIPOutputStream. It works fine with a 55MB file (about 1,100,000 lines) and even a 100MB one (~2,200,000 lines).
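
One possible refinement, not part of the original answer: os.toByteArray() still materializes the entire compressed payload on the heap. If that ever becomes a limit, the same addLogToGZIP helper can feed a GZIPOutputStream that writes straight to a cache file, which a streaming HTTP client can then upload. A sketch, assuming the code runs inside an Android component where getCacheDir() is available:

    // Sketch only: compress straight to a file so not even the gzipped
    // bytes have to fit in memory. "logs.gz" is an arbitrary name.
    File gzFile = new File(getCacheDir(), "logs.gz");
    try {
        GZIPOutputStream gzipOS = new GZIPOutputStream(new FileOutputStream(gzFile));
        if (previousLog.exists()) {
            addLogToGZIP(previousLog, gzipOS);
        }
        if (currentLog.exists()) {
            addLogToGZIP(currentLog, gzipOS);
        }
        gzipOS.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    Log.v("MyApp", "compressed logs: " + gzFile.length());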