Tuning
Tuning is one of the most important steps toward achieving optimal performance from Java applications.
Tuning strategies can be applied either to the application code itself or to the underlying JVM parameters.
This section outlines some of these techniques.
Tuning application code
It is important to avoid premature optimization when developing Java applications. During the
development cycle, you should use profiling tools to determine hotspots in the application code. You can
then tweak these sections of the code to deliver the best application performance.
Assumptions about which parts of the application are hot or cold might not hold in practice, and
microbenchmarks are generally not representative of performance on actual workloads. It is better to
measure the performance of the application on real workloads, identify the bottlenecks in the
application code, and then tune those parts of the application. After the application has been profiled,
separate the hot parts of the code (into their own methods) from the cold parts.
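As an illustration of separating hot code from cold code, the following minimal sketch keeps a frequently executed loop compact and moves rarely executed diagnostics into their own method. The class name, method names, and workload are hypothetical and chosen only for this example; which code is actually hot in a real application should come from a profiler, not from a sketch like this one.

```java
import java.util.Arrays;

/** Hypothetical example; names and data are illustrative and do not come from the article. */
final class HotColdSplit {

    /** Hot path: a small loop whose rare case is delegated to a separate method. */
    static long sumValid(int[] values) {
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            if (v < 0) {
                // Rare case: handled outside the loop body so the hot method stays small.
                reportNegative(i, v);
                continue;
            }
            sum += v;
        }
        return sum;
    }

    /** Cold path: rarely executed diagnostics kept out of the hot method. */
    static void reportNegative(int index, int value) {
        System.err.println("Skipping negative value " + value + " at index " + index);
    }

    public static void main(String[] args) {
        int[] data = new int[1000000];
        Arrays.fill(data, 3);
        data[10] = -1; // a single rare, invalid entry
        System.out.println("total=" + sumValid(data));
    }
}
```

The string building and I/O in reportNegative would otherwise enlarge the loop body; keeping them in a separate method leaves the hot method small without changing its result.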
A variety of tools exist to profile Java applications. For profiling the entire system, including Java
applications, IBM has developed an Eclipse-based toolkit called IBM Visual Performance Analyzer (VPA)
[16] that you can use to identify hotspots or performance problems. You can use VPA to visualize
entire system profiles that are collected by using TPROF [20] or the Linux® operating-system
profiling tool OProfile [19]. For profiling Java applications specifically, the Eclipse Test and
Performance Tools Platform (TPTP) [6] provides an Eclipse-based profiling toolkit, and IBM Rational®
Application Developer (RAD) for IBM WebSphere® Software [14] helps with analysis and visualization
of running Java applications.
Tuning JVM parameters
The following sections briefly discuss some of the options available for tuning the garbage collector
and for reducing application startup time. Finally, some techniques and tools for diagnosing
performance problems are discussed.
Tuning the garbage collector
Because the garbage collector (GC) can have a big impact on the performance of Java applications, it is
important to tune the garbage collector, based on the characteristics of the application. Choosing the right
parameters, garbage-collection policy and optimal heap settings can mean the difference between good
and poor performance of the application.
In response to a variety of performance requirements from client applications and benchmarks, IBM
SDKs, Java Technology Edition, Version 5.0 and Version 6.0 have implemented a number of high-
performance garbage collectors to provide a broad selection of approaches and strategies for garbage
collection. The objective of the framework is to give clients the flexibility of selecting a garbage collector
IBM Just-In-Time Compiler (JIT) for Java
Best practices and coding guidelines for improving performance
suitable for their applications in a given environment. Garbage collection policies are optimized for
different application scenarios, and a particular garbage collection policy [10] can be chosen depending on
the characteristics of the application.
The size of the Java heap has a direct impact on the performance of Java programs. Specifying too small
a Java heap can result in excessive garbage collection activity, which degrades application
performance. The IBM SDK provides command-line options to both shape the Java heap and tune its
size for performance. The -verbose:gc and -Xverbosegclog [13] JVM options can be used to request
detailed information about garbage collection activity as well as the memory footprint of the
application. An appropriate Java heap size can usually be derived by analyzing the frequency of
garbage collection, its duration, and the Java heap occupancy. With the heap size derived in this
way, the -Xmx and -Xms options can be used to specify the maximum and initial (minimum) Java heap
sizes, respectively. For 64-bit applications that do not need very large heap sizes (25 GB or less),
the IBM SDK for Java 6 can use compressed references to reduce the size of objects, thereby making
more efficient use of the Java heap. This feature is useful for 64-bit Java applications where
keeping the memory footprint low is important. Compressed references can be enabled with the
-Xcompressedrefs option in 64-bit JVMs [12].
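The verbose GC log remains the primary source for heap sizing, but the same quantities (collection time and heap occupancy) can also be sampled from inside the application. The sketch below uses the standard java.lang.management API for that purpose; it is an assumption-laden illustration rather than the article's method, and the class name HeapReport, the allocation loop, and the heap sizes shown in the comment are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical example. A possible launch line (heap sizes are illustrative only):
 *   java -verbose:gc -Xms256m -Xmx512m HeapReport
 */
final class HeapReport {

    /** Total milliseconds reported by all collectors; collectors that report -1 are skipped. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report time
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate percentage of JVM uptime spent in garbage collection so far. */
    static double gcOverheadPercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        // Allocate short-lived buffers so that the collector has some work to do.
        List<byte[]> retained = new ArrayList<byte[]>();
        for (int i = 0; i < 20000; i++) {
            byte[] chunk = new byte[16 * 1024];
            if (i % 100 == 0) {
                retained.add(chunk); // keep a small fraction alive
            }
        }
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("heap init=" + heap.getInit() + " used=" + heap.getUsed()
                + " committed=" + heap.getCommitted() + " max=" + heap.getMax());
        System.out.println("retained buffers=" + retained.size()
                + " gc time(ms)=" + totalGcMillis()
                + " gc overhead(%)=" + gcOverheadPercent());
    }
}
```

If the reported overhead stays high on a real workload, that suggests the heap may be too small; the verbose GC log should then be examined before changing -Xms or -Xmx.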
To analyze the garbage-collection behavior of the IBM JDK when running a Java application, you can
use the Garbage Collection and Memory Visualizer (GCMV) tool [4] developed by IBM. The tool helps you
visualize the application's memory-usage pattern, detect memory leaks during execution, and tune the
various parameters of the garbage collector to improve application performance. Another
tool for discovering possible memory leaks in an application is the HeapAnalyzer [9].
Tuning JVM parameters for application startup
If the startup performance of a Java application is important, you can minimize the startup time by using
the IBM quickstart and class-sharing technologies. The nonstandard -Xquickstart option causes the
initial compilation of methods to be performed at a lower compilation level than in the default mode [11].
Although performing quicker compilations for more methods can improve application startup, it can also
degrade the performance of long-running applications that contain hot methods. As a result, you should
use this JVM option only for applications where initial startup speed is more important than
long-running throughput.
IBM SDK for Java 5 introduced class-sharing technology, which saves class data from the current
invocation of an application into a persistent cache [17]. Subsequent invocations of the application can
simply reload the data from the persistent cache, thereby reducing the virtual memory footprint and
improving startup time. In IBM SDK for Java 6, this technology was extended to also save Ahead-Of-Time
(AOT) compilations of methods. Subsequent invocations can reload these AOT compilations directly from
the cache, effectively gaining the benefit of compiled code without incurring compilation costs. The JVM
option to enable class-sharing technology is -Xshareclasses.
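To judge whether these startup options help a particular application, it is useful to measure the time from JVM launch to the point where the application is ready. The following sketch does so with the standard RuntimeMXBean; the class name StartupProbe and the cache name in the comment are hypothetical. Note that -Xquickstart and -Xshareclasses are IBM JVM options, so other JVMs may reject them at launch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

/**
 * Hypothetical example. Possible launch lines on an IBM JVM (cache name is illustrative):
 *   java StartupProbe
 *   java -Xquickstart -Xshareclasses:name=demoCache StartupProbe
 */
final class StartupProbe {

    /** Milliseconds elapsed between JVM start (as reported by the JVM) and now. */
    static long millisSinceJvmStart() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return System.currentTimeMillis() - runtime.getStartTime();
    }

    /** Returns true if the JVM was launched with an argument that starts with the given prefix. */
    static boolean launchedWith(String optionPrefix) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String arg : jvmArgs) {
            if (arg.startsWith(optionPrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // In a real application, call this once initialization has finished.
        System.out.println("quickstart=" + launchedWith("-Xquickstart")
                + " shareclasses=" + launchedWith("-Xshareclasses")
                + " startup(ms)=" + millisSinceJvmStart());
    }
}
```

Because class sharing pays off on the second and later invocations, compare several consecutive runs with the same cache rather than a single run.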
Furthermore, application startup time can also be improved by recompiling the application classes with
the Java compiler that is provided with the IBM JDK. In particular, compile-time inlining of JSR bytecodes
[2] and the generation of stack maps (in Java 6.0) reduce the time that is taken to load classes. You can
also reduce memory usage and startup time by directing the compiler to produce only the required
debugging information. For example, if you deploy an application that has no need for Java debug
support, you can omit the local-variable tables.
Diagnostic techniques and tools available for performance problems
Even after tuning the important parts of the application code, the Java application may still suffer from
poor performance. This may be due to system characteristics or some uncharacteristic behavior of the
underlying JVM runtime environment. Because the JIT compiler and the garbage collector have the
greatest impact on the performance of applications, this section discusses some types of diagnostic
information that can be gathered to help diagnose application performance problems.
As mentioned in the previous section, the -verbose:gc and -Xverbosegclog [13] JVM options can be used
to request detailed information about GC activities. This information helps determine the GC overhead
and potential GC tuning opportunities. A heapdump, which is a snapshot of the Java heap at a
given point during the execution of the Java application, can also be obtained as described in the
diagnostic guides [12].
The TRJIT compiler provides the command-line option -Xjit:verbose. With this option, the TRJIT compiler
generates detailed information about which application methods were compiled and which methods it
determined to be frequently executed. This information can help in diagnosing potential bottlenecks in
the Java application.
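The verbose JIT log is specific to the IBM compiler and lists individual methods. As a coarser, portable complement, the standard CompilationMXBean reports the total time the JVM has spent compiling. The sketch below is an illustration under that assumption; the class name JitTimeProbe and the mix workload are hypothetical, and the bean may be absent or may not support time monitoring on some JVMs.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical example; it does not parse or replace the -Xjit:verbose output. */
final class JitTimeProbe {

    /** Accumulated JIT compilation time in milliseconds, or -1 if the JVM does not report it. */
    static long totalCompilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean(); // null if no compiler
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    /** Small arithmetic kernel that becomes hot when called in a long loop. */
    static int mix(int x) {
        return (x * 31) ^ (x >>> 3);
    }

    public static void main(String[] args) {
        long before = totalCompilationMillis();
        long checksum = 0;
        for (int i = 0; i < 5000000; i++) {
            checksum += mix(i);
        }
        long after = totalCompilationMillis();
        System.out.println("checksum=" + checksum
                + " compile time before(ms)=" + before + " after(ms)=" + after);
    }
}
```

A large increase in compilation time during steady-state operation is a hint to look at the verbose JIT log for methods that are compiled repeatedly or at high optimization levels.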
At any point during the execution of the Java application, a javacore can be triggered by sending
a signal to the JVM. The javacore provides a range of diagnostic information, including operating system
details, application threads, locks, and memory usage by the JVM (including the heap, the JIT compiler,
and the JVM itself). This information can be used to find the root cause of hangs, deadlocks, or resource
contention in the system. The IBM Thread and Monitor Dump Analyzer for Java [15] is a tool that can be
used to analyze javacores produced by the IBM SDK.
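On Linux and AIX, the signal is commonly SIGQUIT (for example, kill -3 followed by the process ID), although this should be confirmed against the diagnostics guide for the platform in use. A subset of the thread and lock information found in a javacore can also be queried from inside the application through the standard ThreadMXBean. The following sketch is such a stand-in, not a javacore generator; the class name DeadlockCheck is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical example: in-process deadlock check using standard management beans. */
final class DeadlockCheck {

    /** Number of threads currently deadlocked on monitors or ownable synchronizers. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    /** Human-readable description of deadlocked threads, including the locks they hold. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return "no deadlock detected";
        }
        StringBuilder report = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have terminated in the meantime
                report.append(info);
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads=" + deadlockedThreadCount());
        System.out.println(describeDeadlocks());
    }
}
```

A check like this could run periodically in a monitoring thread; when it reports a deadlock, a full javacore still gives the more complete picture for offline analysis.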
Summary
Using good and simple design principles to write Java applications is better than trying prematurely to
make such applications run faster through performance-tweaking methods. However, by following the
appropriate guidelines mentioned in this article, and carefully tuning the application code, it is possible to
get the best performance out of the application running with the IBM Java Development Kit 6.0.
Footnotes
[1]. M. Arnold. Online Profiling and Feedback-Directed Optimization of Java. PhD thesis, Rutgers University, October 2002.
[2]. C. Artho, A. Biere. Subroutine Inlining and Bytecode Abstraction to Simplify Static and Dynamic Analysis. In Proceedings of the First Workshop on Bytecode Semantics, Verification, Analysis and Transformation (Bytecode), December 2005.
[3]. J. D. Choi et al. Escape Analysis for Java. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), November 1999.
[4]. H. Cummins. Garbage collection with the IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer. IBM developerWorks®, 2007. ibm.com/developerworks/java/library/j-ibmtools2/index.html?S_TACT=105AGX02&S_CMP=EDU
[5]. J. Dean, D. Grove, C. Chambers. Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis. In Proceedings of the European Conference on Object-Oriented Programming, August 1995.
[6]. Eclipse Test and Performance Tools Platform. www.eclipse.org/tptp
[7]. U. Hölzle, D. Ungar. A third-generation self implementation: Reconciling responsiveness with performance. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), October 1994.
[8]. U. Hölzle, C. Chambers, D. Ungar. Optimizing Dynamically-Typed Object-Oriented Programming Languages with Polymorphic Inline Caches. In Proceedings of the European Conference on Object-Oriented Programming, July 1991.
[9]. IBM HeapAnalyzer. http://www.alphaworks.ibm.com/tech/heapanalyzer
[10]. IBM Java Development Kit 6.0 command line options. http://publib.boulder.ibm.com/infocenter/javasdk/v6r0/index.jsp?topic=/com.ibm.java.doc.diagnostics.60/html/cmdline.html
[11]. IBM Java Diagnostics Guide 5.0. Performance of short-running applications. http://publib.boulder.ibm.com/infocenter/javasdk/v5r0/topic/com.ibm.java.doc.diagnostics.50/diag/tools/jitpd_short_run.html
[12]. IBM Java Diagnostic Guide. ibm.com/developerworks/java/jdk/diagnosis
[13]. IBM Java Diagnostics Guide: IBM Garbage Collection and Storage Allocation Techniques. Article available at ibm.com/developerworks/java/jdk/diagnosis
[14]. IBM Rational Application Developer for WebSphere Software. ibm.com/software/awdtools/developer/application
[15]. IBM Thread and Monitor Dump Analyzer for Java. http://www.alphaworks.ibm.com/tech/jca
[16]. IBM Visual Performance Analyzer. www.alphaworks.ibm.com/tech/vpa
[17]. Java Technology, IBM Style: Class Sharing. IBM developerWorks. ibm.com/developerworks/java/library/j-ibmjava4
[18]. M. Kawahito et al. A New Idiom Recognition Framework for Exploiting Hardware-Assist Instructions. In Proceedings of the Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2006.
[19]. OProfile. A System Profiler for Linux. http://oprofile.sourceforge.net
[20]. Performance Inspector. http://perfinsp.sourceforge.net
[21]. V. Sundaresan et al. Experiences with Multithreading and Dynamic Class Loading in a Java Just-In-Time Compiler. In Proceedings of Code Generation and Optimization (CGO), March 2006.
Resources
These Web sites provide useful references to supplement the information contained in this document:
• IBM System p® Information Center
  http://publib.boulder.ibm.com/infocenter/pseries/index.jsp
• System p on IBM PartnerWorld®
  ibm.com/partnerworld/systems/p
• System z on IBM PartnerWorld®
  ibm.com/partnerworld/systems/z
• IBM AIX® on IBM PartnerWorld®
  ibm.com/partnerworld/aix
• IBM z/OS® on IBM PartnerWorld®
  ibm.com/partnerworld/zos
• Linux on System z on IBM PartnerWorld®
  ibm.com/partnerworld/systemz/linux
• IBM Publications Center
  www.elink.ibmlink.ibm.com/public/applications/publications/cgibin/pbi.cgi?CTY=US
• IBM Redbooks® publications
  ibm.com/redbooks
• IBM developerWorks® site
  ibm.com/developerworks
About the authors
Pramod Ramarao holds Bachelor of Engineering (B.E.) and Master of Science (M.S.) degrees in
Computer Engineering. He joined IBM in 2003 and has since been working on the optimizer component
of the IBM TR JIT compiler.
Joran Siu obtained his Bachelor of Science degree in Electrical and Computer Engineering from Cornell
University. Since joining IBM, he has been contributing to the IBM TR JIT compiler, with a focus on
performance on the IBM System z platform.
Pavan Pamula holds a Bachelor of Engineering (B.E.) degree in Electronics and Communications Engineering
and a Master of Science (M.S.) degree in Software Engineering. He is currently a technical lead for the IBM
software development kit (SDK) for Java enablement team. He leads IBM Systems family and
independent software vendor (ISV) initiatives with the IBM SDK for Java development and support teams.
Trademarks and special notices
© Copyright IBM Corporation 2008.
References in this document to IBM products or services do not imply that IBM intends to make them
available in every country.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked
terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these
symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries. A
current list of IBM trademarks is available on the Web at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States,
other countries, or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Information is provided "AS IS" without warranty of any kind.
This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication. IBM
may make improvements and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending
upon considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve throughput or performance improvements equivalent to the
ratios stated here.
Neither International Business Machines Corporation nor any of its affiliates assume any responsibility or
liability in respect of any results obtained by implementing any recommendations contained in this article.
Implementation of any such recommendations is entirely at the implementor’s risk.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in
any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part
of the materials for this IBM product and use of those Web sites is at your own risk.
Document Outline
- Abstract
- Introduction
- JIT technology
- Compilation happens during run time
- Strategy: Focus optimization effort on important code
- Inlining
- Strategy: Aggressively inline method calls
- Virtual and interface methods
- Strategy: Devirtualize as many call sites as possible
- Heap allocations
- Strategy: Optimize for object locality
- Java coding guidelines
- Object allocations
- Guideline: Avoid creating objects inside loops
- Guideline: Minimize object allocations, if possible
- Guideline: Use immutable fields
- Methods
- Guideline: Keep methods small
- Guideline: Use exceptions and reflection rarely
- Loops
- Guideline: Do not modify the loop bounds within the loop body
- Guideline: Increment the loop index by a single value across all paths
- Guideline: Make loops as compact as possible
- Guideline: Use locals instead of fields or static variables, where possible
- Guideline: System.arraycopy() is well-optimized by the JIT
- Guideline: TRJIT can also optimize special loops such as memset and array translate
- Synchronization
- Tuning
- Tuning application code
- Tuning JVM parameters
- Tuning the garbage collector
- Tuning JVM parameters for application startup
- Diagnostic techniques and tools available for performance problems
- Summary
- Footnotes
- Resources
- About the authors
- Trademarks and special notices