[讨论] CMS GC问题请教
chainhou
2014-10-10
R大及各位好,
我们JVM参数配置: -Xms4096m -Xmx4096m -Xmn1024m -XX:PermSize=1024m -XX:SurvivorRatio=4 -XX:ParallelGCThreads=12 -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=5 -XX:+UseCMSCompactAtFullCollection -server -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/home/logs/gc_20141006-070235.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/heapdump/-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -server -Xdebug -Xnoagent -Djava.compiler=NONE 在线上一天之后,Full GC次数有25次,其中18次是Full GC(System).项目中没有显示的调用过System.gc,可能是第三方库中的调用,不知道如果设置了DisableExplicitGC之后,会不会影响性能。 现在观察到的部分gc日志如下: [Times: user=0.80 sys=0.07, real=0.11 secs] 2014-10-10T16:49:57.396+0800: 72029.912: [GC 72029.913: [ParNew: 793180K->91789K(873856K), 0.0988350 secs] 3210578K->2509187K(4019584K), 0.0989660 secs] [Times: user=0.53 sys=0.07, real=0.10 secs] 2014-10-10T16:50:15.290+0800: 72047.807: [GC 72047.807: [ParNew: 790925K->174720K(873856K), 1.2826750 secs] 3208323K->2911715K(4019584K), 1.2828060 secs] [Times: user=3.65 sys=0.07, real=1.28 secs] 2014-10-10T16:50:16.643+0800: 72049.160: [GC [1 CMS-initial-mark: 2736995K(3145728K)] 2919105K(4019584K), 0.1911190 secs] [Times: user=0.19 sys=0.00, real=0.20 secs] 2014-10-10T16:50:16.835+0800: 72049.352: [CMS-concurrent-mark-start] 2014-10-10T16:50:18.589+0800: 72051.106: [GC 72051.106: [ParNew: 873856K->148097K(873856K), 0.3760150 secs] 3610851K->3043718K(4019584K), 0.3761440 secs] [Times: user=1.88 sys=0.03, real=0.37 secs] 2014-10-10T16:50:18.966+0800: 72051.483: [CMS-concurrent-mark: 1.743/2.131 secs] [Times: user=10.43 sys=0.43, real=2.13 secs] 2014-10-10T16:50:18.966+0800: 72051.483: [CMS-concurrent-preclean-start] 2014-10-10T16:50:19.315+0800: 72051.832: [CMS-concurrent-preclean: 0.305/0.349 secs] [Times: user=1.17 sys=0.20, real=0.34 secs] 2014-10-10T16:50:19.315+0800: 72051.832: [CMS-concurrent-abortable-preclean-start] 2014-10-10T16:50:19.710+0800: 72052.227: [CMS-concurrent-abortable-preclean: 0.368/0.395 secs] [Times: user=1.23 sys=0.15, real=0.40 secs] 2014-10-10T16:50:19.710+0800: 72052.227: [GC[YG occupancy: 606126 K (873856 K)]72052.227: [Rescan (parallel) , 0.0979870 secs]72052.325: [weak refs processing, 0.0093660 secs] [1 CMS-remark: 2895620K(3145728K)] 3501746K(4019584K), 0.1074750 secs] [Times: user=0.97 sys=0.00, real=0.11 secs] 2014-10-10T16:50:19.818+0800: 72052.335: [CMS-concurrent-sweep-start] 2014-10-10T16:50:20.246+0800: 72052.763: [GC 72052.763: [ParNew: 847233K->847233K(873856K), 0.0000310 secs]72052.763: [CMS2014-10-10T16:50:22.320+0800: 72054.836: [CMS-concurrent-sweep: 2.499/2.501 secs] [Times: user=3.43 sys=0.10, real=2.50 secs] (concurrent mode failure) (concurrent mode failure): 2895620K->1698968K(3145728K), 2.0835520 secs] 3742854K->2546202K(4019584K), [CMS Perm : 348959K->348959K(1048576K)], 2.0837310 secs] [Times: user=2.07 sys=0.00, real=2.09 secs] 2014-10-10T16:50:22.331+0800: 72054.848: [GC 72054.848: [ParNew: 847233K->125680K(873856K), 0.2666670 secs] 2546203K->1922680K(4019584K), 0.2667710 secs] [Times: user=1.02 sys=0.03, real=0.26 secs] 2014-10-10T17:42:54.952+0800: 75207.468: [GC 75207.468: [ParNew: 720362K->54047K(873856K), 0.0882010 secs] 3431575K->2766364K(4019584K), 0.0883190 secs] [Times: user=0.25 sys=0.00, real=0.08 secs] 2014-10-10T17:43:04.467+0800: 75216.984: [GC 75216.984: [ParNew: 753183K->174720K(873856K), 1.9135480 secs] 3465500K->3217106K(4019584K), 1.9136790 secs] [Times: user=3.92 sys=0.01, real=1.92 secs] 2014-10-10T17:43:06.382+0800: 75218.899: [GC [1 CMS-initial-mark: 3042386K(3145728K)] 3217262K(4019584K), 0.1825900 secs] [Times: user=0.18 sys=0.00, real=0.18 secs] 2014-10-10T17:43:06.565+0800: 75219.082: [CMS-concurrent-mark-start] 2014-10-10T17:43:09.151+0800: 75221.668: [Full GC 75221.668: [CMS2014-10-10T17:43:09.156+0800: 75221.673: [CMS-concurrent-mark: 2.516/2.591 secs] [Times: user=6.78 sys=0.37, real=2.59 secs] (concurrent mode failure): 3042386K->1584414K(3145728K), 7.2456380 secs] 3916242K->1584414K(4019584K), [CMS Perm : 349431K->349045K(1048576K)], 7.2459060 secs] [Times: user=6.96 sys=0.02, real=7.24 secs] 2014-10-10T17:43:17.408+0800: 75229.925: [GC 75229.925: [ParNew: 699136K->103397K(873856K), 0.0460040 secs] 2283550K->1687812K(4019584K), 0.0461140 secs] [Times: user=0.27 sys=0.00, real=0.05 secs] 2014-10-10T17:43:18.931+0800: 75231.448: [GC 75231.448: [ParNew: 802533K->108792K(873856K), 0.1397780 secs] 2386948K->1693207K(4019584K), 0.1398980 secs] [Times: user=1.03 sys=0.10, real=0.14 secs] 2014-10-10T16:10:48.764+0800: 69681.280: [Full GC (System) 69681.280: [CMS: 2331400K->1263875K(3145728K), 6.0625200 secs] 2531054K->1263875K(4019584K), [CMS Perm : 349156K->348772K(1048576K)], 6.0627030 secs] [Times: user=6.06 sys=0.00, real=6.06 secs] 从上面的日志看concurrent mode failure之后,并没有随之进行Full GC,而且Full GC(System)的执行时间也是越来越长这是否正常? 目前想要减少Full GC的影响,或者说减少应用的暂停时间,我打算把CMS 收集器的XX:CMSInitiatingOccupancyFraction调低些,让每次Full GC产生时应用的暂停时间减小,不知道这种思路是否可行? ![]() 谢谢! |
|
youlong699
2014-10-16
按照我粗浅的理解,-Xmn1024m 小了点,导致过多内容进入old,一般young 与old 1:1足够了吧? 当然有些场景比较变态的另说了。
另外,jstat -gcutil 观察下S区的使用率,SurvivorRatio=4 是不是太小了点?如果太小,会浪费young区,导致本来就不宽裕的eden不够用。 另外,对于谁调用了system.gc,可以用btrace跟一下,一下就能抓出来触发的地方。当然很可能是一个后台线程在周期性调度,那么在启动的时候,btrace再抓一下是谁创建了该线程就ok了。 |