Is Your JIT Telling You Lies?
Even the most sophisticated developer should avoid using micro-benchmarks as predictors of application performance

By: Paul Murray
Dec. 15, 2009 10:15 AM

Writing meaningful Java benchmarks is a tricky business. It's well known that the Java Virtual Machine's just-in-time (JIT) compilation process means that running an application for a few seconds won't let you predict its performance over hours or days of uptime. In spite of this, developers often rely on micro-benchmarks to set performance SLAs for their applications.

Micro-benchmarks test some small, discrete component of an application. They're usually written in an effort to benchmark a component considered critical to the app's overall performance. Here's a typical example, summing all the numbers from one to a specified limit:

long accumulatedTotal(int limit) {
    long result = 0L;
    for (int i = 1; i <= limit; i++) {
        result += i;
    }
    return result;
}

How long does this method take to execute for different values of limit? On my 32-bit OpenSolaris machine using Java 6u16, the first call to this method shows a non-linear runtime for small values of limit.

The method is initially executed by an interpreter and the JIT compiler doesn't kick in until the JVM detects that the method is doing a lot of looping. It then takes some time to compile the method, during which time the loop continues to run under the interpreter. Once this process is complete, the JVM can jump into the compiled version of the method and the average runtime for the remaining iterations is greatly reduced. The upshot is that for small values of limit, the runtime is non-linear, but as limit gets much larger it becomes much more predictable.
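One way to observe this yourself is sketched below. It is a minimal, assumed harness rather than the code used for the measurements above: the class name FirstCallTiming, the sink field and the default limit are illustrative. Because only the first call in a JVM is a true first call, the idea is to start a fresh JVM for each value of limit and compare the reported time per iteration.

```java
final class FirstCallTiming {
    // Receives the result so the summing loop cannot be discarded as dead code.
    static volatile long sink;

    // Same loop as the article's method: sums the integers 1..limit.
    static long accumulatedTotal(int limit) {
        long result = 0L;
        for (int i = 1; i <= limit; i++) {
            result += i;
        }
        return result;
    }

    // Times a single call and returns the average nanoseconds per loop iteration.
    // Assumes limit is at least 1.
    static double nanosPerIteration(int limit) {
        long start = System.nanoTime();
        sink = accumulatedTotal(limit);
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / limit;
    }

    public static void main(String[] args) {
        // Hypothetical default; pass a different limit on the command line.
        int limit = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;
        System.out.println("limit=" + limit
                + " ns/iteration=" + nanosPerIteration(limit));
    }
}
```

Running it as "java FirstCallTiming 1000", then again with 100000 and 100000000, should show the cost per iteration changing as the compiled version takes over a larger share of the loop. The exact figures depend on the JVM and the hardware.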

Benchmarks like SPECjvm2008 know about this quirk in the behavior of the JVM and try to account for it. SPECjvm2008 repeatedly runs its benchmark code for a fixed "warm-up" period - typically 120 seconds. The warm-up time must be long enough to allow compilation of all the performance-critical methods in the benchmark code. You can observe the compilation process in the JVM by adding the Java command-line option -XX:+PrintCompilation. The benchmark then continues to execute for a "run" period - for SPECjvm2008 the default duration is 240 seconds - and the benchmark result is the number of iterations of the benchmark that were completed per minute of the run period.
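The warm-up and run structure can be sketched in a few lines. This is not SPECjvm2008's harness, only an assumed miniature of the same idea: the class name WarmupHarness, the opsPerMinute helper and the shortened 2-second and 4-second periods are all illustrative choices.

```java
final class WarmupHarness {
    // Written by the task so the JIT cannot eliminate the summing loop.
    static volatile long sink;

    // Runs the task repeatedly for at least durationMillis and returns
    // the number of completed operations per minute.
    static double opsPerMinute(Runnable task, long durationMillis) {
        long durationNanos = durationMillis * 1000000L;
        long start = System.nanoTime();
        long ops = 0;
        long elapsed;
        do {
            task.run();
            ops++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        // 60e9 nanoseconds per minute; guard against a zero elapsed time.
        return ops * 60e9 / Math.max(1L, elapsed);
    }

    public static void main(String[] args) {
        Runnable task = new Runnable() {
            public void run() {
                long result = 0L;
                for (int i = 1; i <= 1000000; i++) {
                    result += i;
                }
                sink = result;
            }
        };
        // Hypothetical, shortened periods; real suites use far longer ones.
        double warmup = opsPerMinute(task, 2000); // result is discarded
        double run = opsPerMinute(task, 4000);
        System.out.println("warm-up: " + (long) warmup + " operations per minute (ignored)");
        System.out.println("run:     " + (long) run + " operations per minute");
    }
}
```

Starting it with "java -XX:+PrintCompilation WarmupHarness" lets you check whether compilation activity has died down before the run period begins.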

But does this approach really predict application performance? As so often with the JVM, it turns out to be more complicated than you expect.

A modern JVM such as Sun's Java 6 contains a JIT compiler that is capable of generating native code every bit as efficient as that produced by an ahead-of-time (AOT) C++ compiler. Among the many optimizations performed by the JIT is method inlining. Calling small methods can incur a large overhead - for example, Java classes routinely contain getter and setter methods that simply read or write the value of a variable. The cost of calling a trivial method can be high relative to the work performed by the method, so the JIT will attempt to copy the body of a small method into its caller.

Consider a simple getter method like this:

class Limit {
    private int limit;

    public Limit(int limit) {
        this.limit = limit;
    }

    public int getLimit() {
        return limit;
    }
}

If this is called from a loop like this, where "o" is an instance of class Limit:

for (int i=0; i<o.getLimit(); i++) {
    // Do something
}

then it would be nice to inline the getLimit() method so that the resulting code looks like this:

for (int i=0; i<o.limit; i++) {
    // Do something
}

In an AOT compiler, this inlining might not be possible. For example, the object "o" might not be an instance of Limit but of some subclass of Limit that overrides the getLimit() method. Because it compiles with knowledge of runtime behavior and can throw away compiled code if necessary, the JIT can decide to inline Limit.getLimit(). If "o" is ever seen to be an instance of a subclass of Limit, then the compiled code can be discarded and the loop recompiled appropriately.
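To make the subclass problem concrete, here is a small sketch. DoubledLimit and InliningDemo are hypothetical names that do not appear in the original code; the point is only that the same loop produces different results depending on the runtime class of "o", so the call to getLimit() cannot be blindly replaced by a field read.

```java
class Limit {
    private final int limit;

    Limit(int limit) {
        this.limit = limit;
    }

    public int getLimit() {
        return limit;
    }
}

// Hypothetical subclass: overrides the getter, so reading the field
// directly would give the wrong loop bound for instances of this class.
class DoubledLimit extends Limit {
    DoubledLimit(int limit) {
        super(limit);
    }

    @Override
    public int getLimit() {
        return 2 * super.getLimit();
    }
}

final class InliningDemo {
    // The call to o.getLimit() is a virtual call: which body runs
    // depends on the runtime class of o.
    static int countIterations(Limit o) {
        int count = 0;
        for (int i = 0; i < o.getLimit(); i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(new Limit(5)));        // prints 5
        System.out.println(countIterations(new DoubledLimit(5))); // prints 10
    }
}
```

As long as only plain Limit instances reach countIterations(), inlining Limit.getLimit() is safe. The first DoubledLimit to arrive is the event that forces the compiled code to be thrown away.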

Let's take a look at the Arrays.sort(Object[] a) method. The Javadoc says that all elements in the array must implement the Comparable interface. Arrays.sort() makes many calls to the compareTo() method of the elements it is sorting. The compareTo() method in class Integer is very simple:

public int compareTo(Integer anotherInteger) {
    int thisVal = this.value;
    int anotherVal = anotherInteger.value;
    return (thisVal<anotherVal ? -1 : (thisVal==anotherVal ? 0 : 1));
}

If you only ever call Arrays.sort() on arrays consisting entirely of Integers, the JIT compiler can detect this and inline the Integer.compareTo() method into the sort().

What happens if you later call Arrays.sort() to sort an array of instances of class Short? The compiled code performs a quick check on each array element to ensure that it is an Integer. When this check fails, the compiled code is discarded and Arrays.sort() is recompiled to account for the new set of observed array element types.

The updated compiled code will now inline both Integer.compareTo() and Short.compareTo(). The decision about which inlined method to execute will be taken by explicitly checking whether the array element is of class Integer or Short. Even with two different implementors of the Comparable interface, the JIT compiler is still able to perform method inlining.

Let's go even further and call Arrays.sort() on an array of instances of class Byte. If the JIT compiler now inlines Byte.compareTo() as well, we'll have three different inlined versions of this method in the Arrays.sort() code. This is all starting to get out of hand!

Now the JIT compiler changes strategy. It throws away all its previous inlining of the compareTo() method and does a traditional vtable dispatch just like a static compiler would do. For very small methods such as Integer.compareTo(), this can have a significant performance impact as the time spent calling and returning from such a tiny method may be far greater than the time spent executing its code.

The limitation that only two receivers of a virtual method dispatch can be inlined is known as the bimorphic inlining policy and is a poorly understood but significant limitation on JIT performance.
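The shape of the code at each stage can be pictured in Java source form. The sketch below is purely conceptual: the JIT works on machine code and deoptimizes rather than falling through to an else branch, and the class and method names here are invented for illustration. It shows a call site with two inlined receivers guarded by type checks, and the plain virtual call that is left once a third receiver appears.

```java
final class GuardedDispatchSketch {
    // What remains after a third receiver type is seen:
    // an ordinary virtual (interface) call with no inlining.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int virtualCompare(Object a, Object b) {
        return ((Comparable) a).compareTo(b);
    }

    // Conceptual picture of the bimorphic case: two cheap class checks,
    // each guarding an inlined copy of that class's compareTo() body.
    static int bimorphicCompare(Object a, Object b) {
        if (a.getClass() == Integer.class) {
            // Inlined body of Integer.compareTo()
            int x = (Integer) a, y = (Integer) b;
            return x < y ? -1 : (x == y ? 0 : 1);
        } else if (a.getClass() == Short.class) {
            // Inlined body of Short.compareTo(), which returns the difference
            short x = (Short) a, y = (Short) b;
            return x - y;
        } else {
            // A real JIT would discard this code and recompile; the sketch
            // simply falls back to the virtual call.
            return virtualCompare(a, b);
        }
    }

    public static void main(String[] args) {
        System.out.println(bimorphicCompare(1, 2));                 // Integer path, prints -1
        System.out.println(bimorphicCompare((short) 5, (short) 3)); // Short path, prints 2
        System.out.println(bimorphicCompare((byte) 1, (byte) 1));   // fallback path, prints 0
    }
}
```

The two guarded branches avoid call overhead entirely, whereas the fallback pays for a full call and return around a method body that is only a handful of instructions long.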

Listing 1 contains a benchmark program that shows this effect in action.

The test code calls Arrays.sort() five times. First it calls it on an array of Byte objects. This is simply a "warm up" and the results are ignored. It then calls Arrays.sort() on an array of Bytes, then on an array of Shorts, then an array of Integers, and finally Arrays.sort() is called for a second time on an array of Bytes.

Here's the result of running this benchmark using 32-bit Java 6u16 on my OpenSolaris machine. For this test the arrays to be sorted have 100,000 elements:

java SortBench 100000
(WARMUP) BYTE SORT PERFORMANCE = 1778 operations per minute
(FIRST PASS) BYTE SORT PERFORMANCE = 2034 operations per minute
SHORT SORT PERFORMANCE = 1623 operations per minute
INTEGER SORT PERFORMANCE = 1261 operations per minute
(SECOND PASS) BYTE SORT PERFORMANCE = 1307 operations per minute

As you can see, the warm-up performance is - as expected - a bit lower than for the first pass of the byte sort test. However, the performance of subsequent sort tests falls away quite sharply, and the second byte sort pass achieves only 64% of the performance seen the first time round (1,307 versus 2,034 operations per minute).

For complex server-side applications, where Arrays.sort() is likely to be called on arrays of many different element classes, the real performance when sorting 100,000 random Byte instances is likely to be about 1,307 operations per minute, not the 2,034 operations per minute that the "standard" benchmarking approach suggests.
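For readers who want to try the experiment, the following is a rough sketch of a benchmark with the same sequence of passes. It is not the Listing 1 program that produced the numbers above: the class name SortBenchSketch, the helper methods, the fixed random seed and the 5-second pass length are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

final class SortBenchSketch {
    static Byte[] randomBytes(int n, Random r) {
        Byte[] a = new Byte[n];
        for (int i = 0; i < n; i++) a[i] = Byte.valueOf((byte) r.nextInt(256));
        return a;
    }

    static Short[] randomShorts(int n, Random r) {
        Short[] a = new Short[n];
        for (int i = 0; i < n; i++) a[i] = Short.valueOf((short) r.nextInt(65536));
        return a;
    }

    static Integer[] randomIntegers(int n, Random r) {
        Integer[] a = new Integer[n];
        for (int i = 0; i < n; i++) a[i] = Integer.valueOf(r.nextInt());
        return a;
    }

    // Sorts a fresh copy of data repeatedly for at least durationMillis and
    // returns sorts per minute. The copy is included in the measured time.
    static double sortsPerMinute(Object[] data, long durationMillis) {
        long durationNanos = durationMillis * 1000000L;
        long start = System.nanoTime();
        long ops = 0;
        long elapsed;
        do {
            Object[] copy = data.clone();
            Arrays.sort(copy); // calls compareTo() on the elements
            ops++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return ops * 60e9 / Math.max(1L, elapsed);
    }

    public static void main(String[] args) {
        int size = args.length > 0 ? Integer.parseInt(args[0]) : 100000;
        long millis = 5000; // hypothetical pass length
        Random r = new Random(42);
        Byte[] bytes = randomBytes(size, r);
        Short[] shorts = randomShorts(size, r);
        Integer[] ints = randomIntegers(size, r);

        System.out.println("(WARMUP) BYTE: " + (long) sortsPerMinute(bytes, millis));
        System.out.println("(FIRST PASS) BYTE: " + (long) sortsPerMinute(bytes, millis));
        System.out.println("SHORT: " + (long) sortsPerMinute(shorts, millis));
        System.out.println("INTEGER: " + (long) sortsPerMinute(ints, millis));
        System.out.println("(SECOND PASS) BYTE: " + (long) sortsPerMinute(bytes, millis));
    }
}
```

The comparison that matters is between the first and second Byte passes, since they sort identical data. Whether a given JVM shows the same drop depends on its version and inlining policy, so treat any numbers it prints as specific to that setup.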

This limitation is also important in attempting to benchmark Scala code. Scalac - the Scala compiler - rewrites Scala code blocks as standard Java classes that implement a special Scala interface. This interface requires a method called apply() that executes the code block. As you create code blocks in Scala, you are actually creating tiny Java classes that implement an interface. This is the same as the scenario when calling Arrays.sort() with different implementors of Comparable.

Take a look at the Scala benchmark in Listing 2. The benchmark repeatedly sums all the numbers from 1 to 1 million for a specified period of time and then prints out the number of iterations completed and the accumulated result.

Notice that the same code block is used for all executions of the benchmark. This looks more straightforward than the Java case seen previously. But here's the output from this program:

scala Test
WARM-UP RUN : 7628 operations per minute
FIRST RUN : 7713 operations per minute
SECOND RUN : 6185 operations per minute
THIRD RUN : 6134 operations per minute

Even though the code block is identical in all four cases, each code block is handled inside its own unique class. Once again the effect of the bimorphic inlining policy in the JIT compiler is to generate more performant code for the runBench method when only one or two implementors of the interface have been seen.
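The same situation can be reproduced in plain Java, which may make the mechanism easier to see. The sketch below is an analogy and not the Listing 2 program: the Block interface and the three factory methods are invented names, while runBench follows the method name used above. Each anonymous class has a textually identical body, yet the JVM sees three distinct receiver classes at the single apply() call site.

```java
final class CodeBlockClasses {
    // Hypothetical stand-in for the interface that compiled Scala code blocks implement.
    interface Block {
        long apply(long acc, int i);
    }

    // The single call site: block.apply() is where receiver types accumulate.
    static long runBench(Block block, int limit) {
        long acc = 0L;
        for (int i = 1; i <= limit; i++) {
            acc = block.apply(acc, i);
        }
        return acc;
    }

    // Three textually identical code blocks, each compiled to its own class.
    static Block firstBlock() {
        return new Block() {
            public long apply(long acc, int i) { return acc + i; }
        };
    }

    static Block secondBlock() {
        return new Block() {
            public long apply(long acc, int i) { return acc + i; }
        };
    }

    static Block thirdBlock() {
        return new Block() {
            public long apply(long acc, int i) { return acc + i; }
        };
    }

    public static void main(String[] args) {
        Block[] blocks = { firstBlock(), secondBlock(), thirdBlock() };
        for (Block b : blocks) {
            // Same result every time, but a different class name on each line.
            System.out.println(b.getClass().getName() + " -> " + runBench(b, 1000000));
        }
    }
}
```

From the JIT's point of view, the third block is a third implementor of the interface, exactly like the third element class passed to Arrays.sort().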

The bimorphic inlining policy is a great example of why micro-benchmarking isn't just hard in Java - it's all but impossible. Even if you know about this issue and work around it, the probability is that there will be something else in the JVM to trip you up and give you phony results. Generating masses of micro-benchmark numbers can be worse than useless - they give the erroneous impression of scientific accuracy that can lead to performance SLAs that can't be met in production.

Bottom line - don't use micro-benchmarks as predictors of application performance. Run your app for real and profile, then profile again, and then profile some more!

Published Dec. 15, 2009
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
About Paul Murray
Paul Murray is a freelance specialist in Java performance analysis. He was previously a Technology Fellow at a leading Wall Street Investment Bank where he provided Java performance and scalability consultancy to several thousand Java developers. Paul was also a Member of Technical Staff at Silicon Graphics, working on dynamic compiler implementation and porting Sun's JRE to the SGI IRIX platform.
