[Discussion] Open discussion / suggestions / supplementary material / bookmark thread (no pure chit-chat, please)

RednaxelaFX 2011-01-03
A presentation from early 2006
Simon Ritter: Java SE 6 & The Future of Java
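Quoting one of its slides:

Let's see what has become of the features that were planned for Java SE 7 back then:
Filesystem API and Asynchronous I/O => integrated into NIO.2, which will ship in JDK 7
Tiered Compilation => HotSpot in JDK 7 will enable this option by default in server mode
Method References => evolved into today's MethodHandle
invokedynamic => the name is the same as back then, but the design has evolved substantially; it should be a big help to dynamic languages on the JVM
Language-level XML => dead
Modules => deferred to JDK 8
Class-data sharing for applications => probably in progress, but I haven't noticed any concrete activity in the code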
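
For the "Filesystem API and Asynchronous I/O" item in the list above, here is a minimal sketch of what the NIO.2 APIs look like as they shipped in JDK 7. The class name Nio2Sketch and the helper roundTrip are made up for illustration; only the java.nio.file and java.nio.channels calls are real API.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class Nio2Sketch {
    // Writes text with the java.nio.file API, then reads it back through an
    // AsynchronousFileChannel. roundTrip is a hypothetical helper name.
    public static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("nio2-sketch", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                try (AsynchronousFileChannel ch =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                    ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
                    long pos = 0;
                    while (buf.hasRemaining()) {
                        // read() returns a Future; get() blocks until the I/O completes.
                        int n = ch.read(buf, pos).get();
                        if (n < 0) break;   // end of file
                        pos += n;
                    }
                    buf.flip();
                    return StandardCharsets.UTF_8.decode(buf).toString();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello NIO.2"));
    }
}
```

Blocking on the Future right away defeats the point of asynchronous I/O, of course; it just keeps the sketch short. Real code would pass a CompletionHandler or do other work before calling get().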

=================================

Compare the two diagrams to see how the JDK's tooling has progressed:
1.4.2


1.5.0


6
Argh, this part of the JDK 6 documentation is no longer an image... just follow the link and look at it there.
易卡螺丝君 2011-01-03
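Rant: the name NIO is already rather...

Now it's even more blunt: NIO2
The lesson here: a good product name really matters, especially when there is going to be a version 2 later on

FX: NIO... as in "niu" (牛, slang for "awesome"), huh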
RednaxelaFX 2011-01-15
Firefox 4 Beta 9 – a huge pile of awesome
Quote
The JaegerMonkey has landed.

And you might have noticed that it’s really fast. This is the world’s first third-generation JavaScript engine, using Baseline JIT technology similar to engines found in other browsers and kicked up a level with the Tracing engine found in Firefox 3.6. As such, we’re competitive on benchmarks such as Sunspider and V8, but we’re also fast at a whole mess of things that we expect to see in the set of next-generation web applications, hence Kraken.

Huh? Why is JaegerMonkey the first third-generation JS engine? What were the first two generations, then...
易卡螺丝君 2011-01-15
SpiderMonkey
tracemonkey

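All sorts of monkeys
RednaxelaFX 2011-01-15
易卡螺丝君 wrote:
SpiderMonkey
tracemonkey

All sorts of monkeys

Clearly that's not how the generations are divided. The claim is "the world's first third-generation JS engine", so it must be a comparison against other vendors' JS engines. Besides, JaegerMonkey is, broadly speaking, still SpiderMonkey, and the TraceMonkey of earlier days could broadly be called SpiderMonkey too.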

Boris said in a reply to that post:
Quote
Boris wrote on January 15th, 2011 at 8:31 pm:
1) I’m just guessing here, but I think the obvious generations for JS engines would be interpreter-only, jit, and multi-level jit with dynamic recompilation using a compiler that does more optimizations for the hot paths.
2) At the moment JagerMonkey only runs compiled code, but when we decide to trace a loop we drop back into the interpreter for one loop iteration, because the interpreter drives the trace recorder. That’s something we plan to probably change, but it was too much work to get it done in the Firefox 4 timeframe. There have also been some suggestions to run a function in the interpreter a few times before compiling it with JM, to save memory and in some cases time (because compilation costs time, and if a function runs only once it may not be worth compiling it).
3) Not sure what you mean by “based” on, but the tracer records a trace by getting callbacks from the interpreter with the opcodes the interpreter is interpreting. For now.
易卡螺丝君 2011-01-15

Following FX's pointer, I took a look around the various existing ECMAScript engines

 

This "third-generation" claim looks suspicious...

RednaxelaFX 2011-02-04
http://download.java.net/jdk6/6u25/promoted/b01/index.html

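Starting with HotSpot 20, the constant table in an nmethod is placed before the instruction sequence; it used to be placed at the end. Something to keep in mind when organizing notes.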
6961697
Quote
To better support PC-relative addressing of nmethod constant table entries on SPARC, the constants section should be moved before the instruction section.  This is a preparation for 6961690.


6961690
Quote
oops should be loaded from the constant table of an nmethod instead of materializing them with a long code sequence.


----------------------

6953144
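The new tiered compilation implementation is also starting to be integrated into actual release builds. Reportedly it is much more stable than the old one.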

----------------------

Bug ID 7003271: Hotspot should track cumulative Java heap bytes allocated on a per-thread basis
Quote
It's possible today to get cumulative per-thread cpu time via the ThreadMXBean.
It would be very useful to get the cumulative per-thread number of bytes allocated in the Java heap as well for resource management purposes.

This new feature will probably show up in 6u25 as well.
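
A rough sketch of how the per-thread counters from the bug text can be read side by side. It assumes a HotSpot-based JDK where the platform ThreadMXBean also implements the com.sun.management.ThreadMXBean extension; the class name ThreadAllocSketch and its helper methods are made up for illustration, and the code returns -1 instead of failing when the counter is unavailable.

```java
import java.lang.management.ManagementFactory;

public class ThreadAllocSketch {
    // Cumulative bytes allocated on the Java heap by the current thread,
    // or -1 when the JVM does not expose the counter.
    public static long allocatedBytes() {
        java.lang.management.ThreadMXBean std = ManagementFactory.getThreadMXBean();
        if (!(std instanceof com.sun.management.ThreadMXBean)) {
            return -1;   // not a HotSpot-style JVM (assumption spelled out above)
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) std;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    // Cumulative CPU time of the current thread in nanoseconds, the counter
    // the bug text says was already available; -1 when unsupported.
    public static long cpuNanos() {
        java.lang.management.ThreadMXBean std = ManagementFactory.getThreadMXBean();
        return std.isCurrentThreadCpuTimeSupported() ? std.getCurrentThreadCpuTime() : -1;
    }

    // Bytes allocated by the current thread while running work, or -1.
    public static long measure(Runnable work) {
        long before = allocatedBytes();
        work.run();
        long after = allocatedBytes();
        return (before < 0 || after < 0) ? -1 : after - before;
    }

    public static void main(String[] args) {
        final byte[][] sink = new byte[1][];
        long delta = measure(() -> sink[0] = new byte[1 << 20]);
        System.out.println("allocated ~" + delta + " bytes, cpu " + cpuNanos() + " ns");
    }
}
```

The delta is only approximate: anything else the thread allocates between the two samples is counted too.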

====================================================================

HotSpot 21 in JDK7 build 129 (2011-02-10) has started removing PermGen: the SymbolTable is now managed on the native heap using reference counting
Bug ID 6990754: Use native memory and reference counting to implement SymbolTable

====================================================================

JMX Agent work in JDK7...
Quote
As part of the JVM Convergence effort, we plan to enhance the JDK
JMX agent with select features from JRockit, including...

1. New commandline interface (jcmd) for basic JVM diagnostic commands
2. New Lifecycle support to independently start/stop the JMX agent
3. Support for select additional performance counters

This work is planned for JDK7, but requires substantial codebase
restructuring to align with HotSpot. Also, the project requires
careful thought around package organization and associated public,
private, and protected APIs. Work is ongoing and progressing nicely
towards a likely end of March initial delivery. We'll provide updates
and more details as the date approaches.

- Jim

A new JMX implementation is on the way

@OpenJDK
Paul Hohensee added tracking of cumulative Java heap bytes allocated on a per-thread basis to HotSpot: http://hg.openjdk.java.net/jdk7/jdk7/hotspot/rev/b1a2afa37ec4

@OpenJDK
John Rose - JSR 292 formal Public Review: http://blogs.sun.com/jrose/entry/jsr_292_formal_public_review

@OpenJDK
Coleen Phillimore enhanced the HotSpot error reporting mechanism when C heap is exhausted in various places: http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/36c186bcc085

@OpenJDK
Maurizio Cimadamore fixed spurious transitional 292 warnings when using method references in Project #Lambda: http://hg.openjdk.java.net/lambda/lambda/langtools/rev/89631f9e86b7

@OpenJDK
Alan Bateman removed the no longer used HPI library: core-libs-dev thread: http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-January/005580.html commit: http://hg.openjdk.java.net/jdk7/tl/jdk/rev/5124c2a50539

@OpenJDK
Alan Bateman's take on how the file system API might evolve in the future, post JDK 7: http://mail.openjdk.java.net/pipermail/nio-dev/2011-January/001174.html

@OpenJDK
HotSpot 20 branched off for use in 6 update release train: http://mail.openjdk.java.net/pipermail/jdk6-dev/2011-February/002276.html

@OpenJDK
Tom Rodriguez made some small improvements to the string compare intrinsics in HotSpot - http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/6bbaedb03534 bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7016474

@OpenJDK
Staffan Larssen added support for creating 64-bit Visual Studio projects: http://bit.ly/es2ZB0 hotspot-dev thread: http://bit.ly/dQOu5R

@OpenJDK
Stefan Karlsson removed unused parallel compaction code from HotSpot - changeset : http://bit.ly/g4dp1t - bug: http://bit.ly/husFhc

@tablewhite
#JDK 7: New Interfaces, Classes, Enums, and Methods http://marxsoftware.blogspot.com/2011/03/jdk-7-new-interfaces-classes-enums-and.html

@ijuma
HotSpot now supports transparent huge pages on Linux (available since 2.6.38) thanks to Andrew Haley's contribution http://hg.openjdk.java.net/jdk7/jdk7/hotspot/rev/139667d9836a

HotRockit – What to Expect from Oracle’s Converged JVM
Slides: http://hirt.se/presentations/WhatToExpect.ppt

Quote
oracletechnet Justin Kestelyn
Reinhold: big data support, improved data integration, metaobject protocol may also be themes in JDK9 #eclipsecon


Quote
IanSkerrett Ian Skerrett
Java 9 - continuation, value classes,big data, meta-object protocol, data integration. #eclipsecon


====================================================================

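To tie GC more closely to program semantics, it would be nice if the GC offered some primitives that let a program annotate the semantics of certain things, for example objects that are very small but long-lived.
The LOH (Large Object Heap) was designed around heuristics such as "large objects are more likely to be old objects" and "large objects are expensive to copy". But these are not always reliable. Is there any other way to obtain semantic information?
For example, allocating ThreadLocal<T> on the stack, or something along those lines? Gil Tene says stack allocation probably brings little benefit and is troublesome to implement. Still, I won't be satisfied until I've actually implemented one.
Speaking of which, is there some way to make System.gc() collect only on behalf of the current thread?

The most annoying thing for a generational GC is objects with medium lifetimes: keeping them in the young generation through a few more collections increases GC time, while promoting them to the old generation increases fragmentation. A nuisance either way.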

Leveled Garbage Collection
Quote
Leveled Garbage Collection by Guanshan Tong and Michael J. O'Donnell:

Generational garbage collection (GGC) is one of the most popular garbage collection techniques. GGC gains a performance advantage by performing minor collections on the younger objects in the heap, reducing the number of major collections of the whole heap. A promotion policy determines when an object moves from the younger generation to the older. The design of GGC has been justified by the plausible assumption that many objects die very young, and a few live a very long time. But, attempts to tune the performance of GGC by adjusting the promotion policy have been disappointing - only the simplest immediate promotion policy has proved attractive. The success of GGC is probably due to simplicity and to avoiding scans of the whole heap, rather than to accurate lifetime predictions.
This paper presents Leveled Garbage Collection (LGC), a new algorithm that is not based on object ages. It uses a heap structure and collection scheme similar to those of generational garbage collectors, and has a non-age-based promotion policy that doesn't promote all of the live objects, but still guarantees ample free space immediately after each garbage collection. By tuning LGC's promotion policy, we can often improve on GGC with immediate promotion.
Performance comparisons show that LGC outperforms GGC with immediate promotion policy in many cases, while losing only slightly on cases favorable to immediate promotion. LGC has a substantial advantage when the heap fits in main memory, and an even greater advantage as the heap gets paged to disk.

Leveled GC is based on a more general heuristic than generational GC, in that it tries to keep as many objects as possible in the nursery because minor collections are so much cheaper. What I found most interesting about this paper is that it scales well with virtual memory, which as we know can degrade performance significantly. They provide benchmarks demonstrating a marked difference when large heap sizes trigger paging (Section 5.2.2). LGC performance is hardly affected, while the runtime of generational GC degrades significantly.
RednaxelaFX 2011-02-16
JSR 292 link collection

Re: Projects, which use JSR292
Quote
From: Rémi Forax <forax@...>
Subject: Re: Projects, which use JSR292
Newsgroups: gmane.comp.java.openjdk.mlvm.devel
Date: 2011-02-15 17:53:18 GMT (7 hours and 42 minutes ago)
On 02/15/2011 06:13 PM, John Rose wrote:
> On Feb 15, 2011, at 6:50 AM, Kirill Shirokov wrote:
>
>> - JRuby by Charles Oliver Nutter
>> - PHP.reboot by Remi Forax
>> - Smalltalk implementation by Mark Roos (in progress)

And Java lambda in JDK8.
The current prototype already use JSR 292.

>
> I think this recent thread is about an OCaml implementation by Xavier Clerc:
>    http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-February/002464.html

I have also a metaclass core prototype named Gru (as a possible core for
Groovy 2)
but it's far from being usable by anybody.
And a not yet finished implementation of JSR 292 on Android.

As an implementer, the first benefit of JSR 292 is to greatly simplify
the implementation
of several common patterns:
- invokedynamic+CallSite.setTarget() allows to easily implement any kind
of inlining caches
    (one element cache or bymorphic cache, tree of decision, etc.)
    which are the mothers of all optimizations of any dynamic languages.
- MutableCallSite.syncAll() allows to do thread safe deoptimization which is
   the root of all optimistic optimizations. No guard anymore ! (in
reality, less guards).
- MethodHandle give you lambda for free.
- Predefined method handles avoid code generation for doing the plumbing
   required by features like varargs, spreads, named parameters.
   i.e adaptation between the calling and the target method.
- MethodHandles.asInstance() avoid code generation when bridging Java
core interfaces like
   Comparator, Runnable, Callable and MethodHandle.
- invokedynamic bootstrap method ease the implementation of lazy
intialization.
- ClassValue ease the storage of Class metadata.
- invokedynamic bootstrap method constants ease the storage of callsite
metadata.

And I'm sure I forget some features.

The second benefit is performance.
Some performance improvement are already visible by example using metod
handle for
doing reflection is more efficient than using java.lang.reflect
especially when you deal with primitive types.
Some improvement are yet to appear like inlining of the whole callsite
target tree.
Some will come later just because they require more works

About performance of the dynamic language runtimes, because JSR 292
simplifies
the implementation of the runtime, I expect that existing runtime will
introduce
more optimizations just because the code is now simpler. PHP.reboot is
all about that.

> -- John

Rémi


OpenJDK doc snapshot for Public Review - java.lang.invoke
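
To make Rémi's first bullet (invokedynamic + CallSite.setTarget() for inline caches) a bit more concrete, here is a small sketch of a one-element inline cache built from MutableCallSite and MethodHandles.guardWithTest. It uses the java.lang.invoke API as it shipped in JDK 7 and drives the call site by hand through dynamicInvoker() rather than through a real invokedynamic instruction. The names InlineCacheSketch, CachingSite, miss and isExactly are made up for illustration, and it assumes receivers of public classes such as Integer or String. It is single-threaded and leaves out MutableCallSite.syncAll().

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class InlineCacheSketch {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    // Every handle installed at the call site has the shape (Object)String.
    private static final MethodType SITE_TYPE =
        MethodType.methodType(String.class, Object.class);

    // Guard: true only when the receiver has exactly the cached class.
    static boolean isExactly(Class<?> cached, Object receiver) {
        return receiver.getClass() == cached;
    }

    public static final class CachingSite extends MutableCallSite {
        private final MethodHandle fallback;
        private int misses;

        CachingSite() throws ReflectiveOperationException {
            super(SITE_TYPE);
            fallback = LOOKUP.findVirtual(CachingSite.class, "miss", SITE_TYPE).bindTo(this);
            setTarget(fallback);   // start uninitialized: the first call takes the slow path
        }

        // Slow path: resolve toString() for the receiver's class, then re-link the
        // site as "if (receiver is exactly k) direct call else slow path again".
        String miss(Object receiver) throws Throwable {
            misses++;
            Class<?> k = receiver.getClass();
            MethodHandle direct = LOOKUP
                .findVirtual(k, "toString", MethodType.methodType(String.class))
                .asType(SITE_TYPE);
            MethodHandle guard = LOOKUP
                .findStatic(InlineCacheSketch.class, "isExactly",
                            MethodType.methodType(boolean.class, Class.class, Object.class))
                .bindTo(k);
            setTarget(MethodHandles.guardWithTest(guard, direct, fallback));
            return (String) direct.invokeExact(receiver);
        }

        public int misses() { return misses; }

        // Stand-in for an invokedynamic instruction linked to this call site.
        public String call(Object receiver) {
            try {
                return (String) dynamicInvoker().invokeExact(receiver);
            } catch (Throwable t) {
                throw new RuntimeException(t);
            }
        }
    }

    public static CachingSite newSite() {
        try {
            return new CachingSite();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        CachingSite site = newSite();
        System.out.println(site.call(42) + " " + site.call(7) + " " + site.call("x"));
        System.out.println("misses: " + site.misses());
    }
}
```

A polymorphic cache would chain several guardWithTest handles instead of replacing the single entry on every miss.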
RednaxelaFX 2011-02-21
Woo, Python 3.2 has been released! http://bit.ly/9wJ4dg

PyPy's Approach to Implementing Dynamic Languages Using a Tracing JIT Compiler

========================

Charles Nutter says:
Quote
@headius: Maxine VM appears to be only 2.5x slower than Hotspot client mode now, and still 99.9% written in Java: https://gist.github.com/836314


========================

Cog Blog: Build me a JIT as fast as you can…
This one is interesting: a JIT written in Squeak

========================

Kragen Javier Sitaker: bytecode interpreters for tiny computers
How to write a really compact interpreter, hehe

========================

Twitter's new GC, built on REE (Ruby Enterprise Edition)
Building a Faster Ruby Garbage Collector

========================

Hardware prefetch in newer AMD processors performs better than HotSpot's software prefetch
The relevant file is src/cpu/x86/vm/vm_version_x86.cpp
@@ -434,10 +434,17 @@
     if (supports_lzcnt()) {
       if (FLAG_IS_DEFAULT(UseCountLeadingZerosInstruction)) {
         UseCountLeadingZerosInstruction = true;
       }
     }
+
+    // On family 21 processors default is no sw prefetch
+    if ( cpu_family() == 21 ) {
+      if (FLAG_IS_DEFAULT(AllocatePrefetchStyle)) {
+        AllocatePrefetchStyle = 0;
+      }
+    }
   }
 
   if( is_intel() ) { // Intel cpus specific settings
     if( FLAG_IS_DEFAULT(UseStoreImmI16) ) {
       UseStoreImmI16 = false; // don't use it on Intel cpus

Interesting... AMD processor family 21? Which chips does that cover?

========================

A new reflection-related patch. It improves speed by reducing synchronization overhead
Review Request -- CR6565585: Performance improvements to Method.invoke(), Contrstuctor.newInstance() and Field.getFieldAccessor()
Quote
Method.invoke(), Contrstuctor.newInstance() and Field.getFieldAccessor() all have a needless
critical section, causing large slowdowns. This patch a replaces the synchronizations by volatile
references. Finally, the changes remove a doubled reference to another volatile variable.  This also
simplifies the generated code by commoning up the corresponding load instruction used in the fast
execution path.

Speedups from this change are uniformly 2x or better.

The proposed improvement and patch was originated by John Rose.

Thanks,

Mike
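
The general pattern the mail describes (drop the critical section, publish through a volatile reference, and load that reference only once on the fast path) can be sketched roughly as below. This is not the actual patch: LazyAccessorSketch, Accessor and acquireAccessor are hypothetical stand-ins, and the sketch assumes that creating the accessor twice under a race is harmless.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyAccessorSketch {
    // Hypothetical stand-in for the accessor object that reflection creates lazily.
    public interface Accessor {
        Object invoke(Object arg);
    }

    private final AtomicInteger created = new AtomicInteger();

    // Published through a volatile field instead of inside a synchronized block.
    private volatile Accessor accessor;

    public Object invoke(Object arg) {
        Accessor a = accessor;       // exactly one volatile load on the fast path
        if (a == null) {
            a = acquireAccessor();   // slow path, normally taken once
        }
        return a.invoke(arg);
    }

    // Benign race: two threads may each build an accessor, but both are
    // equivalent, so whichever write lands last is fine.
    private Accessor acquireAccessor() {
        created.incrementAndGet();
        Accessor fresh = x -> "invoked:" + x;
        accessor = fresh;
        return fresh;
    }

    // Number of accessors built so far (for the demo only).
    public int creations() {
        return created.get();
    }

    public static void main(String[] args) {
        LazyAccessorSketch m = new LazyAccessorSketch();
        System.out.println(m.invoke("a") + " " + m.invoke("b") + " created=" + m.creations());
    }
}
```

Reading the field into a local first is the "commoning up" of the load mentioned in the mail: checking the field for null and then reading it again for the call would cost two volatile loads.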


========================

OpenCore + OpenJDK = Zero Sum Game
This monitoring tool looks interesting
Lightweight Java Instrumentation and Profiling

========================

Links to some information about Oracle JDK/OpenJDK on ARM
FOSDEM 2010: The ARM Optimised Interpreter and Thumb2 JIT
OpenJDK ARM homepage?
How do you make Java fast? Answer: Go down the pub!
How do you make Java Fast? Answer: Go down the pub! Part 2
Fixing the OpenJDK ARM Support
Gary Benson: ARM interpreter
Quote
I just discovered that the ARM-specific interpreter stuff that Ed Nevill wrote (and then abandoned) last year has a hack that disables it when run with -XX:+PrintCommandLineFlags. I guess this is one problem when you have 6,000 lines of assembler nobody understands: you don’t know what secret weird sh*t is buried in there.


=========================

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2008-January.txt
Quote
From Stephen.Bohne at Sun.COM  Wed Jan  2 08:02:21 2008
From: Stephen.Bohne at Sun.COM (Steve Bohne)
Date: Wed, 02 Jan 2008 11:02:21 -0500
Subject: Advice of how to implement ahead of time compilation.
In-Reply-To: <daa40b5d0712261042l7b0d9bc7pb690151b42f726db@mail.gmail.com>
References: <daa40b5d0712261042l7b0d9bc7pb690151b42f726db@mail.gmail.com>
Message-ID: <477BB58D.9080404@sun.com>

Hi Ramón,

Ramón García wrote:
> Hello,
>
> I am interested in implementing ahead of time compilation. Testing
> with Mono (a free .NET implementation) I see that ahead of time
> compilation greatly improves startup time, turning about 0.7 s of a
> compilation (equivalent to javac) into 0.08 s. Therefore, I think that
> it would be useful to have this option for interactive applications,
> that could be compiled to native during installation, and then
> executed directly. This would not harm the cross platform nature of
> Java, because applications would be still distributed in bytecode
> format.
>
> I need advice about what path should I follow.
>
> - One way would be to extend the current support of class data sharing
> to user supplied code, so that the virtual machine could load several
> data sharing archives, one for the system classes, and another for the
> current application.

First you'd have to extend class data sharing to cache compiled system
code, which it doesn't do.  (Maybe that was already implied in what you
wrote.)

FWIW, a while back we did some internal experiments to determine the
benefit of caching compiled code for client/desktop type applications.
At the time there wasn't much benefit.  The consensus explanation was
that the HotSpot client compiler is actually very fast to compile code,
and the initial interpreter + compilation overhead doesn't cost all that
much for real applications.

That is not to say that a well optimized AOT scheme couldn't have
benefit - just some anecdotal evidence.

>
> - An alternative method, the virtual machine would load a shared
> library containing the compiled application code. Tools must be
> provided to compile a collection of classes (and JARS) into a shared
> library, where all internal references are resolved.

It seems like the first option would be easier to port and maintain.
The class data sharing archive file format is the same across all
platforms we support, while shared libraries aren't.

Steve

>
> I see that unit of compilation must be the classloader, that is,
> classes intended to be loaded in the same classloader should be
> compiled together. Otherwise, it would be incorrect to compile a call
> to a class into a concrete call, because one cannot know if the class
> referenced will be overridden by some other class earlier in the
> classpath; and it seems that linking is an important part of compiling
> time. In fact, the JSR 294 of Java Superpackages subtitle says
> "Language extensions in support of information hiding and separate
> compilation". I understand that the mechanism of superpackages, by
> modifying the access rules, allow one class to be compiled knowing
> which concrete classes is referenced, and thus make it possible ahead
> of time compilation of the superpackage into a shared library.
>
> Ramon
RednaxelaFX 2011-02-24
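JDK7 Milestone 12 has been released. This is the long-awaited preview release, and the official line is that everyone is encouraged to try it. The license is reportedly rather odd, though. Hmm.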

http://jdk7.java.net/preview/

Mark Reinhold's post: JDK 7 Developer Preview

==========

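@bluedavy wrote:
This JDK bug (in 6u23/6u24, once java.io.tmpdir is set, jps, jstack and friends stop working) is pretty nasty. Why is its priority only low? Frustrating. Looks like an OpenJDK build you compile yourself is the more reliable choice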

Bug ID 7021676: jps not found the pid whit "-Djava.io.tmpdir"
This bug appears to be fixed already in 6u25/OpenJDK6 b22:
Bug ID 7009828: Fix for 6938627 breaks visualvm monitoring when -Djava.io.tmpdir is defined

==========

Marcus Hirt's session at EclipseCon 2011 is worth looking forward to:
HotRockit – What to Expect from Oracle’s Converged JVM
Quote
Oracle is converging the HotSpot and JRockit JVMs to produce a "best of breed JVM". Internally the project is sometimes referred to as the HotRockit project. There is already a large influx of ideas and solutions provided by the JRockit JVM into the Open JDK.

Examples of improvements include:
·Better monitoring and profiling
·Improved performance
·Better ergonomics
This talk will discuss what to expect from the converged JVM over the next two years, and how this will benefit the Eclipse community.


============

John Rose's comments on MethodHandle performance:
performance issue: 7023639: JSR 292 method handle invocation needs a fast path for compiled code
John Rose wrote:
It took us years to learn to optimize calls to Java methods.  Now we are beginning to learn how to optimize uses of method handles.  This new learning curve won't take as long as the old one, since we can reuse our existing knowledge.  But it won't be instantaneous either.  I expect both JDK 7 and JDK 8 to include important performance improvements to method handles and invokedynamic.

To help us track our progress and manage our tuning work, we will (from time to time) file issues having to do with JSR 292 performance.

Issue 7023639 applies to people who are using method handles as function pointers for Java.  If this affects your use of 292, please let us know.

Other issues likely to arise have to do with performance disparities between method handles obtained from MethodHandles.Lookup and the corresponding bytecoded and reflective operations.

Onward!

-- John Rose

P.S. Here's the bug text:

7023639 JSR 292 method handle invocation needs a fast path for compiled code

This is a tracking issue for calls to non-constant method handles to and from "hot" compiled code.

Internally to the JVM, method handle argument list transformations are implemented on the interpreter stack.  This means that when compiled code invokes a method handle with argument transforms, it goes through a C2I adapter, transforms the argument list in interpreted format, and then (presumably) goes through an I2C adapter.

At least the most important transforms should go through customized code.  These include:
- direct access (no transforms)
- receiver binding (the bindTo transformation)
- trivial asType transformations
- invokeGeneric (argument and return value conversions to and from Object)

Probably all of the core transforms on MethodHandle virtual methods (not necessarily MethodHandles static methods) should get favorable treatment for compiled-to-compiled calls.

An important customer is Project Lambda, which should be using method handles in preference to anonymous classes.  Getting the above paths right for compiled code will enable this choice.

Note that this bug does not apply to users invokedynamic, since method handles at invokedynamic call sites are routinely inlined into optimized code.
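
For reference, the transforms listed in the bug text look roughly like this when written against the java.lang.invoke API as released in JDK 7. One assumption to flag: what the bug text calls invokeGeneric is spelled MethodHandle.invoke in the released API. TransformSketch and demo are made-up names, and the sketch only shows the shapes of the calls; it says nothing about how fast any of them are.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class TransformSketch {
    // Runs the same call through three of the transforms named in the bug text
    // and returns the three results joined by '|'.
    public static String demo(String name) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // Direct access (no transforms): String.concat as (String,String)String.
            MethodHandle concat = lookup.findVirtual(
                String.class, "concat", MethodType.methodType(String.class, String.class));

            // Receiver binding: fix the receiver, leaving (String)String.
            MethodHandle greet = concat.bindTo("Hello, ");

            // Trivial asType transformation: same behavior, viewed as (Object)Object.
            MethodHandle generic =
                greet.asType(MethodType.methodType(Object.class, Object.class));

            // invokeExact requires the call-site types to match the handle type exactly.
            String exact = (String) greet.invokeExact(name);
            Object viaAsType = generic.invokeExact((Object) name);

            // invoke inserts the argument and return conversions itself.
            Object viaInvoke = greet.invoke((Object) name);

            return exact + "|" + viaAsType + "|" + viaInvoke;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("JSR 292"));
    }
}
```

Handles like greet, held in a variable rather than baked into an invokedynamic call site, are the "function pointer" style of use that issue 7023639 is tracking.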


Gary Benson is implementing invokedynamic support for the Zero port in OpenJDK 7:
JSR 292 and Zero