jvm停止分配内存的诡异问题
blueswind8306
2012-01-18
今天线上一后台服务器报警,上去一看,resin服务不动了,gc.log也不打日志了,jstat -gcutil看到很诡异的日志:
S0 S1 E O P YGC YGCT FGC FGCT GCT 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 100.00 100.00 100.00 64.23 15.90 1222 847.177 31 48.168 895.345 并且一直保持这个状态,难道MinorGC无法晋升到老生代了?但老生代还有很大空间啊(我配的老生代空间是6G,日志看还有36%=2.1G),当然也有可能是老生代的内存碎片太多导致无法晋升,但是如果是这样应该会持续Full GC才对,但是gc.log最后一条信息是几个小时前的几次MinorGC: 2012-01-18T16:03:19.770+0800: 100838.404: [GC 100838.404: [ParNew: 6389760K->798720K(6389760K), 2.4953480 secs] 8077569K->3170588K(12513280K), 2.4955040 secs] [Times: user=15.69 sys=0.70, real=2.49 secs] 2012-01-18T16:15:19.578+0800: 101558.212: [GC 101558.212: [ParNew: 6389760K->798720K(6389760K), 2.8422020 secs] 8761628K->3887766K(12513280K), 2.8423530 secs] [Times: user=18.94 sys=0.31, real=2.85 secs] 2012-01-18T17:03:09.411+0800: 104428.044: [GC 104428.044: [ParNew: 6389760K->798720K(6389760K), 0.6051860 secs] 9478806K->3909680K(12513280K), 0.6053150 secs] [Times: user=3.69 sys=0.00, real=0.60 secs] 2012-01-18T17:03:28.672+0800: 104447.305: [GC 104447.305: [ParNew(到这里就终止了) 想到用jstack看看线程栈的情况,但jstack抛了异常,放弃了。 又想到用jstat -J-Djstat.showUnsupported=true -snap [pid]命令看看,dump出了以下内容: java.ci.totalTime=11292479 java.cls.loadedClasses=4262 java.cls.sharedLoadedClasses=0 java.cls.sharedUnloadedClasses=0 java.cls.unloadedClasses=14 java.property.java.class.path="/usr/java/jdk1.6.0_06/lib/tools.jar:/opt/resin/lib/pro.jar:/opt/resin/lib/resin.jar:/opt/resin/lib/jaxrpc-15.jar:/opt/resin/lib/jta-101.jar:/opt/resin/lib/saaj.jar:/opt/resin/lib/activation.jar:/opt/resin/lib/jstl-11.jar:/opt/resin/lib/jms-11.jar:/opt/resin/lib/script-10.jar:/opt/resin/lib/j2ee-management-10.jar:/opt/resin/lib/j2ee-deploy-10.jar:/opt/resin/lib/jpa-15.jar:/opt/resin/lib/license.jar:/opt/resin/lib/jaxb-api.jar:/opt/resin/lib/jca-15.jar:/opt/resin/lib/webutil.jar:/opt/resin/lib/jsdk-15.jar:/opt/resin/lib/jsf-12.jar:/opt/resin/lib/javamail-14.jar:/opt/resin/lib/ejb-15.jar:/opt/resin/lib/resin-util.jar:/opt/resin/lib/portlet-10.jar:/opt/resin/lib/quercus.jar" java.property.java.endorsed.dirs="/usr/java/jdk1.6.0_06/jre/lib/endorsed" java.property.java.ext.dirs="/usr/java/jdk1.6.0_06/jre/lib/ext:/usr/java/packages/lib/ext" java.property.java.home="/usr/java/jdk1.6.0_06/jre" java.property.java.library.path="/usr/java/jdk1.6.0_06/jre/lib/amd64/server:/usr/java/jdk1.6.0_06/jre/lib/amd64:/usr/java/jdk1.6.0_06/jre/../lib/amd64:/opt/resin/libexec64:/usr/java/jdk1.6.0_06/jre/lib/amd64/server:/usr/java/jdk1.6.0_06/jre/lib/amd64:/usr/java/jdk1.6.0_06/jre/../lib/amd64:/opt/resin/libexec64:/usr/java/packages/lib/amd64:/lib:/usr/lib" java.property.java.vm.info="mixed mode" java.property.java.vm.name="Java HotSpot(TM) 64-Bit Server VM" java.property.java.vm.specification.name="Java Virtual Machine Specification" java.property.java.vm.specification.vendor="Sun Microsystems Inc." java.property.java.vm.specification.version="1.0" java.property.java.vm.vendor="Sun Microsystems Inc." java.property.java.vm.version="10.0-b22" java.rt.vmArgs="-Djava.util.logging.manager=com.caucho.log.LogManagerImpl -Djava.system.class.loader=com.caucho.loader.SystemClassLoader -Djava.awt.headless=true -Dresin.home=/opt/resin/ -Xmx13000m -Xms13000m -Xmn7020m -Xss256k -XX:PermSize=200m -XX:MaxPermSize=200m -XX:ParallelGCThreads=20 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=7 -XX:TargetSurvivorRatio=100 -XX:MaxTenuringThreshold=15 -XX:CMSInitiatingOccupancyFraction=70 -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=30000 -XX:CMSScheduleRemarkEdenPenetration=1 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+CMSClassUnloadingEnabled -Xloggc:./log/gc.log" java.rt.vmFlags="" java.threads.daemon=41 java.threads.live=42 java.threads.livePeak=48 java.threads.started=709 sun.ci.compilerThread.0.compiles=570 sun.ci.compilerThread.0.method="" sun.ci.compilerThread.0.time=196 sun.ci.compilerThread.0.type=1 sun.ci.compilerThread.1.compiles=581 sun.ci.compilerThread.1.method="" sun.ci.compilerThread.1.time=188 sun.ci.compilerThread.1.type=1 sun.ci.lastFailedMethod="com/caucho/db/store/Store allocateBlock" sun.ci.lastFailedType=2 sun.ci.lastInvalidatedMethod="" sun.ci.lastInvalidatedType=0 sun.ci.lastMethod="java/lang/ref/Finalizer access$100" sun.ci.lastSize=5 sun.ci.lastType=1 sun.ci.nmethodCodeSize=2053280 sun.ci.nmethodSize=4052688 sun.ci.osrBytes=8039 sun.ci.osrCompiles=25 sun.ci.osrTime=268909 sun.ci.standardBytes=331951 sun.ci.standardCompiles=1125 sun.ci.standardTime=11023570 sun.ci.threads=2 sun.ci.totalBailouts=1 sun.ci.totalCompiles=1150 sun.ci.totalInvalidates=0 sun.cls.classInitTime=678610 sun.cls.classLinkedTime=1385121 sun.cls.classVerifyTime=1402102 sun.cls.initializedClasses=2887 sun.cls.isUnsyncloadClassSet=0 sun.cls.jniDefineClassNoLockCalls=7 sun.cls.jvmDefineClassNoLockCalls=66 sun.cls.jvmFindLoadedClassNoLockCalls=5903 sun.cls.linkedClasses=3466 sun.cls.loadInstanceClassFailRate=0 sun.cls.loadedBytes=9208632 sun.cls.methodBytes=6105748 sun.cls.nonSystemLoaderLockContentionRate=21 sun.cls.sharedLoadedBytes=0 sun.cls.sharedUnloadedBytes=0 sun.cls.systemLoaderLockContentionRate=0 sun.cls.time=927943 sun.cls.unloadedBytes=23024 sun.cls.unsafeDefineClassCalls=65 sun.gc.cause="Allocation Failure" sun.gc.collector.0.invocations=1222 sun.gc.collector.0.lastEntryTime=104447305185 sun.gc.collector.0.lastExitTime=104428649403 sun.gc.collector.0.name="PCopy" sun.gc.collector.0.time=847176958 sun.gc.collector.1.invocations=31 sun.gc.collector.1.lastEntryTime=97289733523 sun.gc.collector.1.lastExitTime=97290932380 sun.gc.collector.1.name="CMS" sun.gc.collector.1.time=48167741 sun.gc.generation.0.agetable.bytes.00=0 sun.gc.generation.0.agetable.bytes.01=74474616 sun.gc.generation.0.agetable.bytes.02=674104 sun.gc.generation.0.agetable.bytes.03=287575792 sun.gc.generation.0.agetable.bytes.04=3859456 sun.gc.generation.0.agetable.bytes.05=80594824 sun.gc.generation.0.agetable.bytes.06=2151296 sun.gc.generation.0.agetable.bytes.07=1021584 sun.gc.generation.0.agetable.bytes.08=89248 sun.gc.generation.0.agetable.bytes.09=247992 sun.gc.generation.0.agetable.bytes.10=768 sun.gc.generation.0.agetable.bytes.11=16520 sun.gc.generation.0.agetable.bytes.12=125728 sun.gc.generation.0.agetable.bytes.13=2856 sun.gc.generation.0.agetable.bytes.14=7816 sun.gc.generation.0.agetable.bytes.15=816 sun.gc.generation.0.agetable.size=16 sun.gc.generation.0.capacity=7361003520 sun.gc.generation.0.maxCapacity=7361003520 sun.gc.generation.0.minCapacity=7361003520 sun.gc.generation.0.name="new" sun.gc.generation.0.space.0.capacity=5725224960 sun.gc.generation.0.space.0.initCapacity=0 sun.gc.generation.0.space.0.maxCapacity=5725224960 sun.gc.generation.0.space.0.name="eden" sun.gc.generation.0.space.0.used=5725224960 sun.gc.generation.0.space.1.capacity=817889280 sun.gc.generation.0.space.1.initCapacity=0 sun.gc.generation.0.space.1.maxCapacity=817889280 sun.gc.generation.0.space.1.name="s0" sun.gc.generation.0.space.1.used=817889280 sun.gc.generation.0.space.2.capacity=817889280 sun.gc.generation.0.space.2.initCapacity=0 sun.gc.generation.0.space.2.maxCapacity=817889280 sun.gc.generation.0.space.2.name="s1" sun.gc.generation.0.space.2.used=817889280 sun.gc.generation.0.spaces=3 sun.gc.generation.0.threads=20 sun.gc.generation.1.capacity=6270484480 sun.gc.generation.1.maxCapacity=6270484480 sun.gc.generation.1.minCapacity=6270484480 sun.gc.generation.1.name="old" sun.gc.generation.1.space.0.capacity=6270484480 sun.gc.generation.1.space.0.initCapacity=6270484480 sun.gc.generation.1.space.0.maxCapacity=6270484480 sun.gc.generation.1.space.0.name="old" sun.gc.generation.1.space.0.used=4029159736 sun.gc.generation.1.spaces=1 sun.gc.generation.2.capacity=209715200 sun.gc.generation.2.maxCapacity=209715200 sun.gc.generation.2.minCapacity=209715200 sun.gc.generation.2.name="perm" sun.gc.generation.2.space.0.capacity=209715200 sun.gc.generation.2.space.0.initCapacity=209715200 sun.gc.generation.2.space.0.maxCapacity=209715200 sun.gc.generation.2.space.0.name="perm" sun.gc.generation.2.space.0.used=33343328 sun.gc.generation.2.spaces=1 sun.gc.lastCause="unknown GCCause" sun.gc.policy.collectors=2 sun.gc.policy.desiredSurvivorSize=817889280 sun.gc.policy.generations=3 sun.gc.policy.maxTenuringThreshold=15 sun.gc.policy.name="ParNew:CMS" sun.gc.policy.tenuringThreshold=15 sun.gc.tlab.alloc=689282504 sun.gc.tlab.allocThreads=10 sun.gc.tlab.fastWaste=0 sun.gc.tlab.fills=138 sun.gc.tlab.gcWaste=1455036 sun.gc.tlab.maxFastWaste=0 sun.gc.tlab.maxFills=128 sun.gc.tlab.maxGcWaste=331802 sun.gc.tlab.maxSlowAlloc=5 sun.gc.tlab.maxSlowWaste=533 sun.gc.tlab.slowAlloc=6 sun.gc.tlab.slowWaste=540 sun.os.hrt.frequency=1000000 sun.os.hrt.ticks=109150965158 sun.perfdata.majorVersion=2 sun.perfdata.minorVersion=0 sun.perfdata.overflow=0 sun.perfdata.size=32768 sun.perfdata.timestamp=151469 sun.perfdata.used=15008 sun.property.sun.boot.class.path="/usr/java/jdk1.6.0_06/jre/lib/resources.jar:/usr/java/jdk1.6.0_06/jre/lib/rt.jar:/usr/java/jdk1.6.0_06/jre/lib/sunrsasign.jar:/usr/java/jdk1.6.0_06/jre/lib/jsse.jar:/usr/java/jdk1.6.0_06/jre/lib/jce.jar:/usr/java/jdk1.6.0_06/jre/lib/charsets.jar:/usr/java/jdk1.6.0_06/jre/classes" sun.property.sun.boot.library.path="/usr/java/jdk1.6.0_06/jre/lib/amd64" sun.rt._sync_ContendedLockAttempts=37695448 sun.rt._sync_Deflations=70953 sun.rt._sync_EmptyNotifications=0 sun.rt._sync_FailedSpins=0 sun.rt._sync_FutileWakeups=3812372 sun.rt._sync_Inflations=70969 sun.rt._sync_MonExtant=4096 sun.rt._sync_MonInCirculation=0 sun.rt._sync_MonScavenged=0 sun.rt._sync_Notifications=335755 sun.rt._sync_Parks=36210278 sun.rt._sync_PrivateA=0 sun.rt._sync_PrivateB=0 sun.rt._sync_SlowEnter=0 sun.rt._sync_SlowExit=0 sun.rt._sync_SlowNotify=0 sun.rt._sync_SlowNotifyAll=0 sun.rt._sync_SuccessfulSpins=0 sun.rt.applicationTime=103548349475 sun.rt.createVmBeginTime=1326772961367 sun.rt.createVmEndTime=1326772961518 sun.rt.internalVersion="Java HotSpot(TM) 64-Bit Server VM (10.0-b22) for linux-amd64 JRE (1.6.0_06-b02), built on Mar 25 2008 01:03:02 by "java_re" with gcc 3.2.2 (SuSE Linux)" sun.rt.interruptedBeforeIO=0 sun.rt.interruptedDuringIO=0 sun.rt.javaCommand="com.caucho.server.resin.Resin -socketwait 41419 -resin-home /opt/resin/ start" sun.rt.jvmCapabilities="1000000000000000000000000000000000000000000000000000000000000000" sun.rt.jvmVersion=167772182 sun.rt.safepointSyncTime=789364 sun.rt.safepointTime=898814852 sun.rt.safepoints=13071 sun.rt.threadInterruptSignaled=0 sun.rt.vmInitDoneTime=1326772961454 sun.threads.vmOperationTime=896074670 注意到有一行: sun.ci.lastFailedMethod="com/caucho/db/store/Store allocateBlock" 还有一行: sun.gc.cause="Allocation Failure" 无奈下,kill -9杀掉了这个进程(正常停停不掉),换了一个比较新版本的jdk(update26)。不知道大牛们能不能帮忙分析一下这个问题的原因,难道是jdk老版本的bug吗(出问题的jdk版本是1.6.0_06)? |
|
RednaxelaFX
2012-01-20
看你的描述似乎是卡在做young GC的过程中了。当时你不应该那么快kill掉,至少应该先用pstack命令看看线程栈的状况。jstack无法连接上去时,试试jstack -F。
现在这个描述感觉要定位问题有点信息不足。 话说这个VM启动参数设置得也太奇怪了。 请问所有参数都有确实测试过它们在你的使用场景里的效果么?还是随便乱抄来的? s0和s1都满掉跟你的TargetSurvivorRatio=100有可能有关系。 虽然跟这帖的问题不直接相关,但 SoftRefLRUPolicyMSPerMB=0 会使得所有SoftReference在每次GC都立即失效。你确定这是你想要的行为么? sun.gc.cause显示的是当前的GC的起因;如果这里写着No GC就是当前没有在做GC。写着Allocation Failure说明现在正在做GC;这是很正常的,所有young GC的起因都会写这个。 sun.ci开头的属性是跟JIT编译相关的。这部分你可以不用关心。 升级JDK之后如果仍然出现同样的问题请把pstack的输出贴出来。 |
|
blueswind8306
2012-01-20
jstack -F当时用了,但也没有打印出线程堆栈信息。
pstack确实没有用,下次一定记得看看。。。 jvm参数是一个大并发前端应用经过很多轮调优后的最终定稿,整个调优过程我写了篇博文: http://blueswind8306.iteye.com/admin/blogs/1194438 但是这个应用不同,它是一个后端应用,它只是定期从数据库中批量读出数据然后写入文件。之前简单的从那个应用中拷贝过来jvm参数,出问题后我也意识到之前的jvm参数是很不合适的,所以现在的jvm参数改为如下设置: -Xmx8000m -Xms8000m -Xmn3000m -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ParallelGCThreads=20 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=6 -XX:TargetSurvivorRatio=80 -XX:MaxTenuringThreshold=15 -XX:CMSInitiatingOccupancyFraction=70 -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=30000 -XX:CMSScheduleRemarkEdenPenetration=1 -server -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+CMSClassUnloadingEnabled -XX:+PrintTenuringDistribution -Xloggc:./log/gc.log TargetSurvivorRatio=100这个参数只是为了最大化的利用S0/S1的空间,因为之前看过毕玄的一篇博文:说说MaxTenuringThreshold这个参数,按照我的理解,TargetSurvivorRaito=100就等于设置desired_survivor_size=survivor_capacity,这样的坏处就是动态调整MaxTenuringThreshold的策略等于被关闭了,如果从Eden晋升到Survivor的内存变大,就可能会直接晋升到Old。而我这个应用经过jstat -gcutil的观察,Survivor在用的空间占Survivor最大空间的比例不大。所以等于浪费了一些Survivor的空间为了保证即使在Eden->Survivor变大的情况下也可以适应。不知道我的理解是不是正确的。。。 SoftRefLRUPolicyMSPerMB=0这个配置是之前遗留下来的配置,从代码review来看是没有什么地方使用SoftRef的,所以就沿用了此参数 这两天我通过jmap -histo发现了一个很大的问题是这个应用每个小时都会去数据库的一张过千万的表select * 到一个hashmap中,导致hashmap过大,现在已经fixed了。回想了一下,出问题的时间也应该是在这个过程,综合考虑可能是那个时间点Eden满导致MinorGC的复制回收,而在复制S0到S1的过程中又出现Old碎片过多导致无法晋升到Old,从而触发Full GC,而Full GC为什么没有抛OutOfMemoryError就搞不懂了。。。 |
相关讨论
相关资源推荐
- 高性能 Socket 组件 HP-Socket v3.1.3 正式发布
- 高性能 Socket 组件 HP-Socket v3.1.3 正式发布(tcp/udp,IOCP模型)
- 高性能 Socket 组件 HP-Socket v3.2.1 正式发布
- 高性能 Socket 组件 HP-Socket v3.2.1-RC3 公布
- 高性能 Socket 组件 HP-Socket v3.2.1-RC4 发布
- 高性能 Socket 组件 HP-Socket v3.2.1-RC2 发布
- 遥感目标检测 NWPU VHR-10 目标检测数据集.zip
- postgis-bundle-pg96x64-setup-3.2.3-1.exe
- 基于线性预测的语音合成实验MATLAB代码
- 使用Scrapy采集淘宝数据,Flask展示.zip