[Discussion] How timely is the Thread Dump output triggered by kill -3 pid?

e456 2014-02-18
Sending a SIGQUIT signal to a Java application via kill -3 pid usually produces a Thread Dump of its current state.

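I recently used the following logic in an application startup script to stop the application:
# Print one Thread Dump, terminate the process, then wait 3 seconds for the application to release resources
kill -3 pid && kill pid && sleep 3
# After 3 seconds, force-kill the process
kill -9 pid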


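I recently noticed that the log file sometimes contains no Thread Dump at all. That makes me wonder how the JVM handles the three signals SIGQUIT, SIGTERM and SIGKILL, how promptly each one is handled, and whether they affect one another. Could someone knowledgeable clear this up? Thanks.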

The output of java -version is:
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
RednaxelaFX 2014-02-19
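Let me first answer one of the original poster's questions: how SIGQUIT is handled.
Assuming no other code has registered a new SIGQUIT signal handler after JVM initialization, HotSpot VM handles a received SIGQUIT on a dedicated signal handler thread. The entry function of that thread is signal_thread_entry():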
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/tip/src/share/vm/runtime/os.cpp
// SIGBREAK is sent by the keyboard to query the VM state
#ifndef SIGBREAK
#define SIGBREAK SIGQUIT
#endif

// sigexitnum_pd is a platform-specific special signal used for terminating the Signal thread.


static void signal_thread_entry(JavaThread* thread, TRAPS) {
  os::set_priority(thread, NearMaxPriority);
  while (true) {
    int sig;
    {
      // FIXME : Currently we have not decieded what should be the status
      //         for this java thread blocked here. Once we decide about
      //         that we should fix this.
      sig = os::signal_wait();
    }
    if (sig == os::sigexitnum_pd()) {
       // Terminate the signal thread
       return;
    }

    switch (sig) {
      case SIGBREAK: {
        // Check if the signal is a trigger to start the Attach Listener - in that
        // case don't print stack traces.
        if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
          continue;
        }
        // Print stack traces
        // Any SIGBREAK operations added here should make sure to flush
        // the output stream (e.g. tty->flush()) after output.  See 4803766.
        // Each module also prints an extra carriage return after its output.
        VM_PrintThreads op;
        VMThread::execute(&op);
        VM_PrintJNI jni_op;
        VMThread::execute(&jni_op);
        VM_FindDeadlocks op1(tty);
        VMThread::execute(&op1);
        Universe::print_heap_at_SIGBREAK();
        if (PrintClassHistogram) {
          VM_GC_HeapInspection op1(gclog_or_tty, true /* force full GC before heap inspection */,
                                   true /* need_prologue */);
          VMThread::execute(&op1);
        }
        if (JvmtiExport::should_post_data_dump()) {
          JvmtiExport::post_data_dump();
        }
        break;
      }
      default: {
        // Dispatch the signal to java
        // ...
      }
    }
  }
}

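Printing the thread stacks is implemented by VM_PrintThreads, which runs on the VM thread. This is where the problem comes in: if the JVM instance is already hung, it cannot respond to any external request, so a SIGQUIT sent to it naturally goes unhandled. That is why it is normal for kill -3 to sometimes produce no thread stacks.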

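As for timeliness, every VM operation is first put into a queue and then processed one at a time by the VM thread. If the queue is currently empty, kill -3 can walk the stacks almost in "real time"; otherwise it has to wait for the VM operations ahead of it to finish, so there will be some delay.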
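
The queueing behaviour described above can be illustrated with a toy model. This is only a sketch, not HotSpot code: a single-threaded executor stands in for the VM thread, its internal FIFO queue stands in for the VM operation queue, and the class name, method name and operation labels are all made up for illustration. It shows that a "print threads" request submitted behind a long-running operation completes only after that operation has finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only (hypothetical names): one worker thread plays the VM thread,
// and the executor's FIFO queue plays the VM operation queue.
public class VmOperationQueueSketch {

    // Runs the model and returns the operations in the order they completed.
    static List<String> runModel(long slowOpMillis) {
        List<String> completed = Collections.synchronizedList(new ArrayList<>());
        ExecutorService vmThread = Executors.newSingleThreadExecutor();

        // An earlier, long-running operation that is already queued.
        vmThread.execute(() -> {
            try {
                Thread.sleep(slowOpMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            completed.add("slow operation");
        });

        // The operation requested by "kill -3": it must wait its turn.
        vmThread.execute(() -> completed.add("print threads"));

        vmThread.shutdown();
        try {
            // Wait for both queued operations to finish.
            vmThread.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        // prints [slow operation, print threads]
        System.out.println(runModel(200));
    }
}
```

If the first operation never returned, the second would never run, which corresponds to the hung-JVM case where kill -3 produces no output.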