2012-07-17 90 views
5

我们调整我们的GC以实现最小的“停止世界”暂停。烫发和寿命长的一代表现得很好。年轻的正常工作的大部分时间,以及暂停通常不超过500毫秒(注[时间:用户= 0.35 SYS = 0.02,实际= 0.06秒]):偶然的年轻垃圾收集暂停

{Heap before GC invocations=11603 (full 60): 
par new generation total 3640320K, used 3325226K [0x0000000600000000, 0x00000006f6e00000, 0x00000006f6e00000) 
    eden space 3235840K, 100% used [0x0000000600000000, 0x00000006c5800000, 0x00000006c5800000) 
    from space 404480K, 22% used [0x00000006de300000, 0x00000006e3a4a898, 0x00000006f6e00000) 
    to space 404480K, 0% used [0x00000006c5800000, 0x00000006c5800000, 0x00000006de300000) 
concurrent mark-sweep generation total 4147200K, used 1000363K [0x00000006f6e00000, 0x00000007f4000000, 0x00000007f4000000) 
concurrent-mark-sweep perm gen total 196608K, used 133030K [0x00000007f4000000, 0x0000000800000000, 0x0000000800000000) 
2012-07-16T14:36:05.641+0200: 427048.412: [GC 427048.412: [ParNew 
Desired survivor size 207093760 bytes, new threshold 4 (max 4) 
- age 1: 8688880 bytes, 8688880 total 
- age 2: 12432832 bytes, 21121712 total 
- age 3: 18200488 bytes, 39322200 total 
- age 4: 20473336 bytes, 59795536 total 
: 3325226K->75635K(3640320K), 0.0559610 secs] 4325590K->1092271K(7787520K), 0.0562630 secs] [Times: user=0.35 sys=0.02, real=0.06 secs] 
Heap after GC invocations=11604 (full 60): 
par new generation total 3640320K, used 75635K [0x0000000600000000, 0x00000006f6e00000, 0x00000006f6e00000) 
    eden space 3235840K, 0% used [0x0000000600000000, 0x0000000600000000, 0x00000006c5800000) 
    from space 404480K, 18% used [0x00000006c5800000, 0x00000006ca1dcf40, 0x00000006de300000) 
    to space 404480K, 0% used [0x00000006de300000, 0x00000006de300000, 0x00000006f6e00000) 
concurrent mark-sweep generation total 4147200K, used 1016635K [0x00000006f6e00000, 0x00000007f4000000, 0x00000007f4000000) 
concurrent-mark-sweep perm gen total 196608K, used 133030K [0x00000007f4000000, 0x0000000800000000, 0x0000000800000000) 
} 

但是,有时,通常一次一天,我们得到特别长的年轻垃圾收集时间(注[时间:用户= 0.41 SYS = 0.01,实际= 5.51秒]):

{Heap before GC invocations=7884 (full 37): 
par new generation total 3640320K, used 3304448K [0x0000000600000000, 0x00000006f6e00000, 0x00000006f6e00000) 
    eden space 3235840K, 100% used [0x0000000600000000, 0x00000006c5800000, 0x00000006c5800000) 
    from space 404480K, 16% used [0x00000006c5800000, 0x00000006c9b00370, 0x00000006de300000) 
    to space 404480K, 0% used [0x00000006de300000, 0x00000006de300000, 0x00000006f6e00000) 
concurrent mark-sweep generation total 4147200K, used 1967225K [0x00000006f6e00000, 0x00000007f4000000, 0x00000007f4000000) 
concurrent-mark-sweep perm gen total 189100K, used 112825K [0x00000007f4000000, 0x00000007ff8ab000, 0x0000000800000000) 
2012-07-15T01:23:25.474+0200: 293088.245: [GC 293093.636: [ParNew 
Desired survivor size 207093760 bytes, new threshold 4 (max 4) 
- age 1: 30210472 bytes, 30210472 total 
- age 2: 11614600 bytes, 41825072 total 
- age 3: 8591680 bytes, 50416752 total 
- age 4: 7779600 bytes, 58196352 total 
: 3304448K->73854K(3640320K), 0.1158280 secs] 5271674K->2046454K(7787520K), 0.1181990 secs] [Times: user=0.41 sys=0.01, real=5.51 secs] 
Heap after GC invocations=7885 (full 37): 
par new generation total 3640320K, used 73854K [0x0000000600000000, 0x00000006f6e00000, 0x00000006f6e00000) 
    eden space 3235840K, 0% used [0x0000000600000000, 0x0000000600000000, 0x00000006c5800000) 
    from space 404480K, 18% used [0x00000006de300000, 0x00000006e2b1fb40, 0x00000006f6e00000) 
    to space 404480K, 0% used [0x00000006c5800000, 0x00000006c5800000, 0x00000006de300000) 
concurrent mark-sweep generation total 4147200K, used 1972599K [0x00000006f6e00000, 0x00000007f4000000, 0x00000007f4000000) 
concurrent-mark-sweep perm gen total 189100K, used 112825K [0x00000007f4000000, 0x00000007ff8ab000, 0x0000000800000000) 
} 

下面是jstat -gccause输出相关:

Timestamp   S0  S1  E  O  P  YGC  YGCT FGC FGCT  GCT LGCC     GCC     
    293083.2 16.96 0.00 96.26 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
    293083.2 16.96 0.00 96.26 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
    293084.2 16.96 0.00 97.69 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
    293085.3 16.96 0.00 98.32 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
    293086.3 16.96 0.00 99.32 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
Timestamp   S0  S1  E  O  P  YGC  YGCT FGC FGCT  GCT LGCC     GCC     
    293087.3 16.96 0.00 99.55 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  No GC    
    293088.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293089.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293090.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293091.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293092.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293093.3 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293084.2 16.96 0.00 100.00 47.44 59.66 7884 772.326 74 10.893 783.219 unknown GCCause  Allocation Failure 
    293094.3 0.00 18.26 6.23 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC    
    293094.6 0.00 18.26 6.71 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC    
    293095.3 0.00 18.26 6.85 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC    
    293095.6 0.00 18.26 6.92 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC    
    293096.2 0.00 18.26 9.84 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC    
    293096.6 0.00 18.26 10.11 47.56 59.66 7885 772.442 74 10.893 783.334 unknown GCCause  No GC 

“分配失败”appe ars作为其他地方的GC原因,但始终作为单一条目。当它像这样的序列出现时,它与长时间的GC暂停相关联。我研究了Oracle JVM来源,并且“分配失败”看起来像是一种非常自然的情况:年轻人无法获得新数据,现在该清理了。在暂停发生之前,我检查了系统中发生的任何内存密集,意外的行为,但没有发现任何可疑行为。

请注意,在暂停时间内,年轻垃圾收集时间并没有多少增加。我的记忆和GC设置如下(记录设置略):

-Xms6000m 
-Xmn2950m 
-Xmx6000m 
-XX:MaxPermSize=192m 
-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC 
-Dsun.rmi.dgc.client.gcInterval=9223372036854775807 
-Dsun.rmi.dgc.server.gcInterval=9223372036854775807 
-XX:CMSInitiatingOccupancyFraction=50 
-XX:+CMSScavengeBeforeRemark 
-XX:+CMSClassUnloadingEnabled 

测试也与8000m和12000m堆。机:

  • 8核心与内存
  • 16GB
  • 24核的内存50GB

所以我的基本问题是:为什么ParNewGC偶​​然的行为这种方式?

回答

5

在执行GC之前,必须让每个线程都安全点(它不会立即停止每个线程)。如果您长时间运行JNI调用或系统调用,则可能需要很长时间才能达到安全点。在这种情况下你看到的是一个很长的停顿,尽管这个集合本身的时间长度正常。

+0

彼得,这是否意味着一个新的内存分配请求将阻塞,直到达到安全点并回收内存?或者该请求是否会立即从其他分区完成? – 2012-07-17 08:51:49

+0

新的内存分配会触发GC,从而导致所有线程在GC启动之前达到安全点。这可能意味着唤醒被OS停止的线程,以便它们可以处于安全的位置。 – 2012-07-17 08:55:15

+0

因此,所有线程都会阻止等待该滞后线程到达安全点? – 2012-07-17 08:56:49